htmd.projections.vamp module

htmd.projections.vamp.vampScore(data, lag, dim=None, units='frames', r=2, nfolds=10, blocksize=None, random_state=None, return_scores=False)

Compute the cross-validated VAMP-2 score of projected data.

The VAMP-2 score measures how much of the slow dynamics a featurization captures. Evaluated with cross-validation, it gives an objective way to compare featurizations and to choose the TICA lag time and number of dimensions. A higher score is better; the trivial baseline (no slow processes resolved) is approximately 1.
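To make the score and its baseline concrete, here is a minimal NumPy sketch of a VAMP-2 score for a single trajectory. It is an illustration, not the code vampScore runs: the helper names inv_sqrt and vamp2_score are hypothetical, there is no cross-validation, and adding 1 for the constant process is an assumed convention chosen to match the baseline of about 1 described above.

```python
import numpy as np


def inv_sqrt(c, eps=1e-10):
    """Inverse square root of a symmetric PSD matrix, dropping tiny eigenvalues."""
    w, v = np.linalg.eigh(c)
    keep = w > eps
    return (v[:, keep] * w[keep] ** -0.5) @ v[:, keep].T


def vamp2_score(x, lag):
    """VAMP-2 score of one trajectory x with shape (n_frames, n_features)."""
    # Time-lagged pairs, with the mean removed from each half.
    x0, xt = x[:-lag], x[lag:]
    x0 = x0 - x0.mean(axis=0)
    xt = xt - xt.mean(axis=0)
    n = len(x0)
    # Instantaneous and time-lagged covariance matrices.
    c00, ctt, c0t = x0.T @ x0 / n, xt.T @ xt / n, x0.T @ xt / n
    # Whitened time-lagged covariance; its singular values lie in [0, 1].
    k = inv_sqrt(c00) @ c0t @ inv_sqrt(ctt)
    # Sum of squared singular values, plus 1 for the constant process
    # (assumed convention, so that pure noise scores about 1).
    return 1.0 + np.sum(k ** 2)


rng = np.random.default_rng(0)

# Uncorrelated noise: no slow process to resolve.
noise = rng.normal(size=(20000, 1))

# Slowly decorrelating AR(1) process: one slow process.
slow = np.zeros((20000, 1))
for t in range(1, len(slow)):
    slow[t] = 0.99 * slow[t - 1] + rng.normal()

noise_score = vamp2_score(noise, lag=10)
slow_score = vamp2_score(slow, lag=10)
print(noise_score, slow_score)
```

In this toy setup the noise feature scores close to the baseline, while the slow feature scores clearly above it. With one feature the score cannot exceed 2, since each resolved process contributes at most 1.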

Parameters:
  • data (MetricData) – The projected per-trajectory features to score.

  • lag (float) – The VAMP lag time, in the units given by units.

  • dim (int | None) – Number of VAMP dimensions to score on. If None, VAMP uses its default (variance cutoff). Vary this to choose how many dimensions to keep.

  • units (str) – The units of lag and blocksize. Can be 'frames' or a time unit given as a string.

  • r (int) – Which VAMP-r score to compute. 2 (default) is the VAMP-2 score.

  • nfolds (int) – Number of cross-validation folds.

  • blocksize (int | None) – If None (default), cross-validate over whole trajectories. If given, split each trajectory into blocks of this length (expressed in units) and cross-validate over the blocks. Use this when you have few long trajectories.

  • random_state (int | None) – Seed for the cross-validation fold assignment, for reproducibility.

  • return_scores (bool) – If True, return the raw per-fold score array instead of (mean, std).
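The roles of blocksize, nfolds and random_state can be sketched as follows. This is a hypothetical illustration of block splitting and seeded fold assignment, not HTMD's internal code. The helpers split_into_blocks and kfold_indices are made up for the example, and dropping a trailing remainder shorter than blocksize is an assumption.

```python
import numpy as np


def split_into_blocks(trajs, blocksize):
    """Cut each trajectory into consecutive blocks of blocksize frames."""
    blocks = []
    for traj in trajs:
        # Assumption: a trailing remainder shorter than blocksize is dropped.
        for start in range(0, len(traj) - blocksize + 1, blocksize):
            blocks.append(traj[start:start + blocksize])
    return blocks


def kfold_indices(n_items, nfolds, random_state=None):
    """Shuffle item indices with a seed and split them into nfolds test sets."""
    rng = np.random.default_rng(random_state)
    order = rng.permutation(n_items)
    return np.array_split(order, nfolds)


# Two long trajectories give only two items to cross-validate over...
trajs = [np.zeros((1000, 3)), np.zeros((250, 3))]

# ...but splitting them into 100-frame blocks gives enough items for 4 folds.
blocks = split_into_blocks(trajs, blocksize=100)
folds = kfold_indices(len(blocks), nfolds=4, random_state=1)

for i, test_idx in enumerate(folds):
    train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
    print(f"fold {i}: {len(train_idx)} training blocks, {len(test_idx)} test blocks")
```

With a fixed seed the fold assignment is identical on every run, which is the purpose of random_state.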

Returns:

  • mean, std (float) – The mean and standard deviation of the per-fold scores. Returned unless return_scores is True.

  • scores (numpy.ndarray) – The per-fold scores. Returned only if return_scores is True.

Examples

>>> from htmd.projections.vamp import vampScore
>>> # data is a MetricData object holding the projected features
>>> mean, std = vampScore(data, lag=20, dim=4, units="ns")
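One way to use the score for choosing the number of dimensions is to evaluate several values of dim at a fixed lag and rank them. The sketch below assumes only the call signature documented above. scan_dims is a hypothetical helper, and fake_score is a stand-in scorer so the snippet runs without HTMD; in practice vampScore would be passed as score_fn, with data being a MetricData object.

```python
def scan_dims(score_fn, data, lag, dims, **kwargs):
    """Score each candidate dim at a fixed lag; return rows sorted best-first."""
    rows = []
    for dim in dims:
        # score_fn is expected to return (mean, std), as vampScore does
        # when return_scores is False.
        mean, std = score_fn(data, lag=lag, dim=dim, **kwargs)
        rows.append({"dim": dim, "mean": mean, "std": std})
    rows.sort(key=lambda row: row["mean"], reverse=True)
    return rows


def fake_score(data, lag, dim, **kwargs):
    """Stand-in scorer for illustration only; peaks at dim=3."""
    return 2.0 - 0.1 * abs(dim - 3), 0.05


ranking = scan_dims(fake_score, data=None, lag=20, dims=[1, 2, 3, 4, 5], units="ns")
best = ranking[0]
print(f"best dim: {best['dim']} (score {best['mean']:.2f} +/- {best['std']:.2f})")
```

Comparing the difference between the top candidates with their std values helps judge whether a higher dim is a real improvement or within the cross-validation noise.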