htmd.adaptive.adaptivegoal module

class htmd.adaptive.adaptivegoal.AdaptiveGoal

Bases: AdaptiveMD

Adaptive sampling combining Markov state models with a goal function.

Extends AdaptiveMD by adding a directed component: a user-provided goal function scores conformations, and the respawning probability is a weighted sum of the undirected (MSM-based) and directed (goal-based) scores.
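
The weighted sum described above can be sketched as follows. This is a minimal illustration, not HTMD's implementation: the helper names `minmax` and `combine_scores`, the min-max rescaling of each component, and the final normalization to a probability vector are assumptions made for the example. Only the weights `ucscale` and `1 - ucscale` come from the parameter descriptions.

```python
import numpy as np

def minmax(x):
    """Rescale values to [0, 1]; a constant input maps to all zeros (assumed convention)."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    if span == 0:
        return np.zeros_like(x)
    return (x - x.min()) / span

def combine_scores(undirected, directed, ucscale=0.5):
    """Hypothetical helper: weighted sum of both components, normalized to probabilities."""
    reward = ucscale * minmax(undirected) + (1.0 - ucscale) * minmax(directed)
    total = reward.sum()
    if total == 0:
        # No state stands out: fall back to a uniform distribution
        return np.full(reward.shape, 1.0 / reward.size)
    return reward / total

undirected = np.array([0.2, 1.0, 0.6])  # toy MSM-based scores, one per state
directed = np.array([0.1, 0.5, 0.9])    # toy goal-based scores, one per state
prob = combine_scores(undirected, directed, ucscale=0.25)
```

With `ucscale=0.25` the goal-based scores carry three quarters of the weight, so the state with the highest goal score ends up with the largest respawning probability in this toy case.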

Parameters:
  • app (SimQueue object, default None) – A SimQueue class object used to retrieve and submit simulations

  • project (str, default 'adaptive') – The name of the project

  • nmin (int, default 0) – Minimum number of running simulations

  • nmax (int, default 1) – Maximum number of running simulations

  • nepochs (int, default 1000) – Stop adaptive once we have reached this number of epochs

  • nframes (int, default 0) – Stop adaptive once we have simulated this number of aggregate simulation frames.

  • inputpath (str, default 'input') – The directory used to store input folders

  • generatorspath (str, default 'generators') – The directory containing the generators

  • dryrun (bool, default False) – A dry run means that the adaptive will retrieve and generate a new epoch but not submit the simulations

  • updateperiod (float, default 0) – When set to a value other than 0, the adaptive will run synchronously every updateperiod seconds

  • coorname (str, default 'input.coor') – Name of the file containing the starting coordinates for the new simulations

  • boxname (str, default 'input.xsc') – Name of the file containing the starting box dimensions for the new simulations. Set to ‘none’ to disable box writing.

  • lock (bool, default False) – Lock the folder while adaptive is ongoing

  • mps (int, default 0) – If mps > 0, it will run simulations using the Multi-Process Service (MPS) with the number of processes specified. If set to 0, mps is disabled

  • datapath (str, default 'data') – The directory in which the completed simulations are stored

  • filter (bool, default True) – Enable or disable filtering of trajectories.

  • filtersel (str, default 'not water') – Atom selection string for filtering.

  • filteredpath (str, default 'filtered') – The directory in which the filtered simulations will be stored

  • projection (Projection object, default None) – A Projection class object or a list of objects which will be used to project the simulation data before constructing a Markov model

  • truncation (str, default None) – Method for truncating the probability distribution (None, ‘cumsum’, ‘statecut’)

  • statetype (('micro', 'cluster', 'macro'), str, default 'micro') – What states (cluster, micro, macro) to use for calculations.

  • macronum (int, default 8) – The number of macrostates to produce

  • skip (int, default 1) – Allows skipping of simulation frames to reduce data, e.g. skip=3 will only keep every third frame

  • lag (int, default 1) – The lagtime used to create the Markov model. Units are in frames.

  • clustmethod (ClusterMixin class, default <class 'htmd.clustering.kcenters.KCenter'>) – Clustering algorithm used to cluster the contacts or distances

  • method (str, default '1/Mc') – Criterion used for choosing which state to respawn from

  • ticalag (int, default 20) – Lagtime to use for TICA in frames. When using skip, remember to change this accordingly.

  • ticadim (int, default 3) – Number of TICA dimensions to use. When set to 0 it disables TICA

  • contactsym (str, default None) – Contact symmetry

  • save (bool, default False) – Save the model generated

  • goalfunction (function, default None) – This function will be used to convert the goal-projected simulation data to a ranking which can be used for the directed component of FAST.

  • ucscale (float, default 0.5) – Scaling factor for the undirected component. The directed component scaling is automatically calculated as (1 - ucscale)

  • nosampledc (bool, default False) – Spawn only from top DC conformations without sampling

  • autoscale (bool, default False) – Automatically scales exploration and exploitation ratios depending on how stuck the adaptive is at a given goal score.

  • autoscalemult (float, default 1) – Multiplier for the scaling factor.

  • autoscaletol (float, default 0.2) – Tolerance for the scaling factor.

  • autoscalediff (int, default 10) – Difference in epochs to use for the scaling factor.

  • savegoal (str, default None) – Save the goal values to the specified file
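
As an illustration of the `skip` and `ticalag` entries above, the sketch below shows what keeping every third frame does to a lag expressed in frames. This is plain NumPy rather than HTMD code: the array `frames`, the helper `rescale_lag`, the starting offset of the slice, and the rounding rule are all assumptions made for the example.

```python
import numpy as np

frames = np.arange(12)   # stand-in for the frame indices of a 12-frame trajectory
skip = 3
kept = frames[::skip]    # keeps frames 0, 3, 6 and 9

def rescale_lag(lag_frames, skip):
    """Hypothetical helper: convert a lag in original frames to units of kept frames."""
    return max(1, round(lag_frames / skip))

# A lag of 20 original frames spans roughly 7 of the kept frames
ticalag_after_skip = rescale_lag(20, skip)
```

The same reasoning applies to any other quantity given in frames, which is why a lag chosen for the full-resolution data would usually be revisited when `skip` is changed.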

Examples

>>> crystalSS = MetricSecondaryStructure().project(Molecule('crystal.pdb'))[0]
>>>
>>> # First argument of a goal function always has to be a Molecule object
>>> def ssGoal(mol):
...     proj = MetricSecondaryStructure().project(mol)
...     ss_score = np.sum(proj == crystalSS, axis=1) / proj.shape[1]  # Fraction of SS assignments matching the crystal
...     return ss_score
>>>
>>> ag = AdaptiveGoal()
>>> ag.generatorspath = '../generators/'
>>> ag.nmin = 2
>>> ag.nmax = 3
>>> ag.projection = [MetricDistance('name CA', 'resname MOL', periodic='selections'), MetricDihedral()]
>>> ag.goalfunction = ssGoal
>>> ag.app = LocalGPUQueue()
>>> ag.run()
>>>
>>> # Or alternatively if we have a multi-argument goal function
>>> def ssGoalAlt(mol, ss):
...     proj = MetricSecondaryStructure().project(mol)
...     ss_score = np.sum(proj == ss, axis=1) / proj.shape[1]
...     return ss_score
>>> from joblib import delayed
>>> ag.goalfunction = delayed(ssGoalAlt)(crystalSS)
>>> ag.app = LocalGPUQueue()
>>> ag.run()
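
To make the scoring in `ssGoal` concrete, here is the same NumPy expression applied to toy arrays in place of real projections. The integer codes and array shapes are made up for the example (assumed: one row per frame, one column per residue); the actual output of `MetricSecondaryStructure` may be encoded differently.

```python
import numpy as np

# Toy reference assignment for 4 residues (hypothetical integer codes)
crystal = np.array([0, 1, 1, 2])

# Toy projection: 3 frames x 4 residues
proj = np.array([
    [0, 1, 1, 2],   # all four residues match the reference
    [0, 0, 1, 2],   # three of four match
    [2, 0, 0, 1],   # none match
])

# Broadcasting compares every frame against the reference row;
# the row-wise sum divided by the residue count is the matching fraction
ss_score = np.sum(proj == crystal, axis=1) / proj.shape[1]
```

Each frame receives a score between 0 and 1, so higher values mark conformations closer to the reference assignment.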