htmd.metricdatagenerator module

class htmd.metricdatagenerator.MetricDataGenerator(fulldata, model=None, is_adaptive=False)

Bases: object

Generate synthetic trajectories from existing projected data.

Given a clustered MetricData object (and optionally a Markov state model), this class produces new synthetic trajectories by resampling frames from the clusters of the original data. The various newTrajectories* methods implement different resampling strategies.

Parameters:
  • fulldata (MetricData) – The clustered MetricData object from which to sample frames.

  • model (Model | None) – A Markov state model built on fulldata. Required for the MSM-based sampling strategies.

  • is_adaptive (bool) – If True, starting frames are drawn from the first epoch of an adaptive sampling run.

newMetricData(datasource, trajectories=None, olddata=None)

Convert generated trajectory indexes into a new MetricData object.

Parameters:
  • datasource (MetricData) – The MetricData object from which to collect projections and references for the sampled frames.

  • trajectories (list | None) – A list of trajectories, each given as trajectory index-frame pairs (such as those returned by the newTrajectories* methods).

  • olddata (MetricData | None) – If given, the new data is appended to this object and the merged object is returned.

Returns:

data – A MetricData object containing the generated trajectories.

Return type:

MetricData
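The indexing step behind this conversion can be illustrated with plain NumPy. The following is a minimal sketch, not HTMD's implementation: it assumes the projected data is available as a list of per-trajectory arrays, and the function name gather_frames_sketch is hypothetical. It only shows how trajectory index-frame pairs select rows of projected data; it does not build a MetricData object.

```python
import numpy as np

def gather_frames_sketch(projections, pairs):
    """Stack the projection rows addressed by (trajectory index, frame) pairs.

    Hypothetical helper: `projections` is assumed to be a list with one
    (n_frames, n_dims) array per original trajectory.
    """
    pairs = np.asarray(pairs, dtype=int)
    return np.vstack([projections[t][f] for t, f in pairs])

# Toy projected data: two trajectories with 2-dimensional projections.
projections = [np.arange(6.0).reshape(3, 2), 10 + np.arange(4.0).reshape(2, 2)]

# One synthetic trajectory made of four resampled frames.
synthetic = gather_frames_sketch(projections, [(0, 2), (1, 0), (1, 1), (0, 0)])
```

Each row of synthetic is the projection of one sampled frame, in the order given by the index pairs.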

newTrajectoriesClusterJumping(simlen, ntraj, startFrames=None, jumpprob=0.1)

Generate trajectories by advancing one frame at a time with random cluster jumps.

At each step the trajectory advances by one frame. With probability jumpprob, or whenever the end of the current source trajectory is reached, it instead jumps to a random frame of the same cluster and continues sampling from there.

Parameters:
  • simlen (int) – The length (in frames) of each new trajectory.

  • ntraj (int) – The number of new trajectories to generate.

  • startFrames (list | ndarray | None) – Starting frames as trajectory index-frame pairs. If None, starting frames are picked automatically.

  • jumpprob (float) – The per-frame probability of jumping to a new frame within the same cluster.

Returns:

ret – One array per new trajectory, each row a trajectory index-frame pair.

Return type:

list
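The cluster-jumping strategy described above can be sketched on toy data as follows. This is an illustrative sketch under stated assumptions, not the library's code: cluster assignments are assumed to be a list of per-trajectory label arrays, a jump counts as one step, and the name cluster_jumping_sketch and its seed argument are hypothetical.

```python
import numpy as np

def cluster_jumping_sketch(assignments, simlen, start, jumpprob=0.1, seed=0):
    """Build one synthetic trajectory of (trajectory index, frame) pairs."""
    rng = np.random.default_rng(seed)
    # Index every frame by cluster so jump targets can be drawn directly.
    members = {}
    for t, labels in enumerate(assignments):
        for f, c in enumerate(labels):
            members.setdefault(int(c), []).append((t, f))

    t, f = start
    out = [(t, f)]
    while len(out) < simlen:
        at_end = f + 1 >= len(assignments[t])
        if at_end or rng.random() < jumpprob:
            # Jump: continue from a random frame of the current cluster.
            candidates = members[int(assignments[t][f])]
            t, f = candidates[rng.integers(len(candidates))]
        else:
            # Otherwise advance by one frame along the source trajectory.
            f += 1
        out.append((t, f))
    return np.array(out)

# Toy cluster assignments for three source trajectories.
assignments = [np.array([0, 0, 1, 1, 2]),
               np.array([1, 2, 2, 0]),
               np.array([2, 1, 0, 0, 1, 1])]
traj = cluster_jumping_sketch(assignments, simlen=20, start=(0, 0), jumpprob=0.3, seed=1)
```

Every consecutive pair of rows in traj is either a one-frame advance within the same source trajectory or a jump between two frames that share a cluster.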

newTrajectoriesFiller(simlen, ntraj, startFrames=None)

Generate trajectories by chaining cluster pieces until the length is reached.

Starting from each starting frame, the method appends pieces sampled from the corresponding clusters until the trajectory reaches simlen frames.

Parameters:
  • simlen (int) – The length (in frames) of each new trajectory.

  • ntraj (int) – The number of new trajectories to generate.

  • startFrames (list | ndarray | None) – Starting frames as trajectory index-frame pairs. If None, starting frames are picked automatically.

Returns:

ret – One array per new trajectory, each row a trajectory index-frame pair.

Return type:

list
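One possible reading of the filler strategy is sketched below. It is an assumption about how pieces are chained, not a description of the actual implementation: a "piece" is taken to be the remainder of a source trajectory from the current frame onward, and the next piece starts at a random frame of the cluster where the previous piece ended. The name filler_sketch and its seed argument are hypothetical.

```python
import numpy as np

def filler_sketch(assignments, simlen, start, seed=0):
    """Chain trajectory pieces until simlen frames are collected."""
    rng = np.random.default_rng(seed)
    # Index every frame by cluster so continuation points can be drawn.
    members = {}
    for t, labels in enumerate(assignments):
        for f, c in enumerate(labels):
            members.setdefault(int(c), []).append((t, f))

    t, f = start
    out = []
    while len(out) < simlen:
        # Take the rest of trajectory t, starting at frame f, as one piece.
        piece = [(t, k) for k in range(f, len(assignments[t]))]
        out.extend(piece)
        # Continue from a random frame of the cluster where the piece ended.
        last_t, last_f = piece[-1]
        candidates = members[int(assignments[last_t][last_f])]
        t, f = candidates[rng.integers(len(candidates))]
    # The last piece may overshoot, so trim to the requested length.
    return np.array(out[:simlen])

# Toy cluster assignments for three source trajectories.
assignments = [np.array([0, 0, 1, 1, 2]),
               np.array([1, 2, 2, 0]),
               np.array([2, 1, 0, 0, 1, 1])]
filled = filler_sketch(assignments, simlen=15, start=(1, 1), seed=3)
```

In this sketch a switch between source trajectories happens only at the last frame of a piece, which is what distinguishes it from the per-frame random jumps of the cluster-jumping strategy.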

newTrajectoriesMSM(simlen, ntraj, startFrames=None)

Generate new synthetic trajectories sampled from the Markov state model.

At each step the next microstate is drawn from the transition probability matrix of the model, and a random frame of the corresponding cluster is selected.

Parameters:
  • simlen (int) – The length (in frames) of each new trajectory.

  • ntraj (int) – The number of new trajectories to generate.

  • startFrames (list | ndarray | None) – Starting frames as trajectory index-frame pairs. If None, starting frames are picked automatically.

Returns:

ret – One array per new trajectory, each row a trajectory index-frame pair.

Return type:

list
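The MSM-based sampling step can be illustrated with a small row-stochastic matrix. This is a minimal sketch under simplifying assumptions, not the library's code: microstates are assumed to map one-to-one onto cluster labels, every microstate is assumed to contain at least one frame, and the names msm_sampling_sketch, T_toy and assignments_toy are hypothetical.

```python
import numpy as np

def msm_sampling_sketch(T, assignments, simlen, start_state, seed=0):
    """Sample (trajectory index, frame) pairs along a Markov chain on T."""
    rng = np.random.default_rng(seed)
    T = np.asarray(T, dtype=float)
    # Collect the frames that belong to each microstate.
    members = {s: [] for s in range(T.shape[0])}
    for t, labels in enumerate(assignments):
        for f, s in enumerate(labels):
            members[int(s)].append((t, f))

    state = start_state
    out = []
    for _ in range(simlen):
        # Pick a random frame of the current microstate.
        frames = members[state]
        out.append(frames[rng.integers(len(frames))])
        # Draw the next microstate from row `state` of the transition matrix.
        state = int(rng.choice(T.shape[0], p=T[state]))
    return np.array(out)

# Toy 3-state transition matrix (each row sums to 1) and frame assignments.
T_toy = np.array([[0.9, 0.1, 0.0],
                  [0.1, 0.8, 0.1],
                  [0.0, 0.2, 0.8]])
assignments_toy = [np.array([0, 0, 1, 1, 2]),
                   np.array([1, 2, 2, 0])]
msm_traj = msm_sampling_sketch(T_toy, assignments_toy, simlen=25, start_state=0, seed=2)
```

Unlike the cluster-based strategies, consecutive frames here need not be adjacent in any source trajectory; only the sequence of visited microstates follows the model.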

newTrajectoriesSimple(simlen, ntraj, startFrames=None)

Generate trajectories by sampling whole pieces from a cluster.

For each respawning conformation, the method selects a random trajectory from among those of the conformations in its cluster.

Parameters:
  • simlen (int) – The length (in frames) of each new trajectory.

  • ntraj (int) – The number of new trajectories to generate.

  • startFrames (list | ndarray | None) – Starting frames as trajectory index-frame pairs. If None, starting frames are picked automatically.

Returns:

ret – One array per new trajectory, each row a trajectory index-frame pair.

Return type:

list

parallelTest(simlen, ntraj, startFrames=None)

Generate MSM-sampled trajectories in parallel across multiple processes.

Parameters:
  • simlen (int) – The length (in frames) of each new trajectory.

  • ntraj (int) – The number of new trajectories to generate.

  • startFrames (list | ndarray | None) – Starting frames as trajectory index-frame pairs. If None, starting frames are picked automatically.

Returns:

ret – One array per new trajectory, each row a trajectory index-frame pair.

Return type:

list

htmd.metricdatagenerator.abs2rel(absFrames, trajLengths)

Convert absolute frame indexes into trajectory index-frame pairs.

Parameters:
  • absFrames (int | list | ndarray) – An absolute frame index or a list of absolute frame indexes.

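  • trajLengths (ndarray) – The cumulative sum of the per-trajectory lengths, not the raw lengths themselves. For example, if lengths holds the number of frames in each trajectory, pass np.cumsum(lengths).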

Returns:

relframe – An array where each row is a trajectory index-frame pair.

Return type:

ndarray
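The conversion can be sketched with np.searchsorted on the cumulative lengths. This is a minimal illustration, assuming that absolute indexes count frames across the concatenated trajectories starting from zero; it is not HTMD's implementation, and the name abs_to_rel_sketch is hypothetical.

```python
import numpy as np

def abs_to_rel_sketch(abs_frames, cum_lengths):
    """Map absolute frame indexes to (trajectory index, frame) pairs.

    `cum_lengths` is the cumulative sum of the per-trajectory lengths.
    """
    abs_frames = np.atleast_1d(np.asarray(abs_frames, dtype=int))
    cum_lengths = np.asarray(cum_lengths, dtype=int)
    # Trajectory k holds absolute frames in [cum_lengths[k-1], cum_lengths[k]).
    traj_idx = np.searchsorted(cum_lengths, abs_frames, side="right")
    # Absolute index of the first frame of each trajectory.
    starts = np.concatenate(([0], cum_lengths[:-1]))
    frame_idx = abs_frames - starts[traj_idx]
    return np.column_stack((traj_idx, frame_idx))

# Three trajectories with 3, 5 and 2 frames (cumulative sum: 3, 8, 10).
lengths = np.array([3, 5, 2])
pairs = abs_to_rel_sketch([0, 2, 3, 7, 8, 9], np.cumsum(lengths))
# pairs is [[0, 0], [0, 2], [1, 0], [1, 4], [2, 0], [2, 1]]
```

Absolute frame 3 is the first frame of the second trajectory, so it maps to the pair (1, 0).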