Adaptive sampling#

This tutorial shows how to run adaptive sampling simulations on a molecular system, using the NTL9 protein as the example.

Start by importing HTMD:

from htmd.ui import *
Please cite HTMD: Doerr et al.(2016)JCTC,12,1845.
https://dx.doi.org/10.1021/acs.jctc.6b00049
Documentation: http://software.acellera.com/
To update: conda update htmd -c acellera -c psi4

You are on the latest HTMD version (unpackaged : /home/joao/maindisk/software/repos/Acellera/htmd/htmd).

Get the generators folder structure#

Get the data for this tutorial here. Alternatively, you can download the data using wget:

import os, glob
assert os.system('wget -rcN -np -nH -q --cut-dirs=2 -R index.html* http://pub.htmd.org/tutorials/adaptive-sampling/generators/') == 0
for file in glob.glob('./generators/*/run.sh'):
    os.chmod(file, 0o755)
!tree generators | head -20
generators
|-- ntl9_1ns_0
|   |-- input
|   |-- input.coor
|   |-- input.xsc
|   |-- parameters
|   |-- run.sh
|   |-- structure.pdb
|   `-- structure.psf
|-- ntl9_1ns_1
|   |-- input
|   |-- input.coor
|   |-- input.xsc
|   |-- parameters
|   |-- run.sh
|   |-- structure.pdb
|   `-- structure.psf
`-- ntl9_1ns_2
    |-- input
    |-- input.coor
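
Before launching anything, it can help to confirm that every generator folder is complete. The following is a minimal sketch, not part of HTMD: the helper name `missing_generator_files` is hypothetical, and the list of required entries is an assumption copied from the tree listing above.

```python
import os

# Entries each generator folder is expected to contain, copied from the
# tree listing above (an assumption; adjust for your own system).
REQUIRED = ["input", "input.coor", "input.xsc", "parameters",
            "run.sh", "structure.pdb", "structure.psf"]


def missing_generator_files(generators_dir, required=REQUIRED):
    """Return {folder_name: [missing entries]} for incomplete generator folders."""
    problems = {}
    for name in sorted(os.listdir(generators_dir)):
        folder = os.path.join(generators_dir, name)
        if not os.path.isdir(folder):
            continue  # ignore stray files next to the generator folders
        missing = [entry for entry in required
                   if not os.path.exists(os.path.join(folder, entry))]
        if missing:
            problems[name] = missing
    return problems
```

Calling `missing_generator_files('./generators')` should return an empty dictionary when all folders match the listing.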

Adaptive classes#

HTMD has two types of adaptive sampling:

  • AdaptiveMD (free exploration)

  • AdaptiveGoal (exploration + exploitation)

Create a directory for each type of adaptive sampling and copy the generators into both:

import shutil

os.makedirs('./adaptivemd', exist_ok=True)
os.makedirs('./adaptivegoal', exist_ok=True)
shutil.copytree('./generators', './adaptivemd/generators')
shutil.copytree('./generators', './adaptivegoal/generators')
'./adaptivegoal/generators'
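
Note that `shutil.copytree` raises `FileExistsError` by default when the destination already exists, so rerunning the cell above fails. A minimal sketch of a rerunnable copy follows; `copy_generators` is a hypothetical helper, and deleting the old copy is a design choice that discards anything previously written there.

```python
import os
import shutil


def copy_generators(src, dst):
    """Copy src to dst, replacing any previous copy so the step can be rerun."""
    if os.path.isdir(dst):
        shutil.rmtree(dst)  # drop the stale copy first
    return shutil.copytree(src, dst)
```

For example, `copy_generators('./generators', './adaptivemd/generators')` can be called repeatedly without error.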

AdaptiveMD#

Change to the adaptivemd directory and work there:

os.chdir('./adaptivemd')
  • Set up the queue that will be used for the simulations.

  • Tell it to store completed trajectories in the data folder, since this is where AdaptiveMD expects them by default.

queue = LocalGPUQueue()
queue.datadir = './data'
ad = AdaptiveMD()
ad.app = queue
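  • Set nmin and nmax (the minimum and maximum number of simultaneously running simulations) and nepochs (the number of epochs after which the run stops)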

ad.nmin = 1
ad.nmax = 3
ad.nepochs = 3
  • Choose which projection to use for constructing the Markov model (here, the pairwise distances between the protein's CA atoms)

protsel = 'protein and name CA'
ad.projection = MetricSelfDistance(protsel)
  • Set updateperiod, which defines how often (in seconds) the adaptive run polls for completed simulations and redoes the analysis

ad.updateperiod = 120 # execute every 2 minutes
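
The settings above interact roughly as follows. This is a simplified sketch of one polling step, not HTMD's implementation: the function name `simulations_to_spawn` is hypothetical, and the rule it encodes (refill up to nmax once the number of running simulations is at or below nmin) is an assumption that is consistent with the log of this tutorial's run.

```python
def simulations_to_spawn(running, nmin, nmax):
    """Number of new simulations to start at one polling step.

    Assumed rule: nothing is spawned while more than `nmin` simulations
    are still running; otherwise the pool is refilled up to `nmax`.
    """
    if running > nmin:
        return 0
    return nmax - running


# Values used in this tutorial: nmin = 1, nmax = 3.
for running in (3, 1, 0):
    new = simulations_to_spawn(running, nmin=1, nmax=3)
    print(running, "running ->", new, "new")
```

With updateperiod = 120 this check would happen every two minutes. In the run log, a poll that finds 3 simulations in progress only sleeps, while a poll that finds 0 in progress triggers the analysis and submits 3 new simulations.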

Launch the AdaptiveMD run:

ad.run()
2018-03-19 11:46:45,536 - htmd.adaptive.adaptive - INFO - Processing epoch 0
2018-03-19 11:46:45,538 - htmd.adaptive.adaptive - INFO - Epoch 0, generating first batch
2018-03-19 11:46:45,563 - htmd.queues.localqueue - INFO - Trying to determine all GPU devices
2018-03-19 11:46:45,614 - htmd.queues.localqueue - INFO - Using GPU devices 0,1,2,3
2018-03-19 11:46:45,941 - htmd.queues.localqueue - INFO - Trying to determine all GPU devices
2018-03-19 11:46:45,993 - htmd.queues.localqueue - INFO - Using GPU devices 0,1,2,3
2018-03-19 11:46:45,997 - htmd.queues.localqueue - INFO - Queueing /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivemd/input/e1s1_ntl9_1ns_0
2018-03-19 11:46:45,999 - htmd.queues.localqueue - INFO - Queueing /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivemd/input/e1s2_ntl9_1ns_1
2018-03-19 11:46:45,999 - htmd.queues.localqueue - INFO - Running /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivemd/input/e1s1_ntl9_1ns_0 on device 0
2018-03-19 11:46:46,001 - htmd.queues.localqueue - INFO - Queueing /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivemd/input/e1s3_ntl9_1ns_2
2018-03-19 11:46:46,001 - htmd.queues.localqueue - INFO - Running /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivemd/input/e1s2_ntl9_1ns_1 on device 1
2018-03-19 11:46:46,004 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.
2018-03-19 11:46:46,004 - htmd.queues.localqueue - INFO - Running /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivemd/input/e1s3_ntl9_1ns_2 on device 2
2018-03-19 11:48:46,071 - htmd.adaptive.adaptive - INFO - Processing epoch 1
2018-03-19 11:48:46,072 - htmd.adaptive.adaptive - INFO - Retrieving simulations.
2018-03-19 11:48:46,073 - htmd.adaptive.adaptive - INFO - 3 simulations in progress
2018-03-19 11:48:46,075 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.
2018-03-19 11:49:13,578 - htmd.queues.localqueue - INFO - Completed /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivemd/input/e1s2_ntl9_1ns_1
2018-03-19 11:49:13,680 - htmd.queues.localqueue - INFO - Completed /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivemd/input/e1s3_ntl9_1ns_2
2018-03-19 11:49:19,629 - htmd.queues.localqueue - INFO - Completed /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivemd/input/e1s1_ntl9_1ns_0
2018-03-19 11:50:46,084 - htmd.adaptive.adaptive - INFO - Processing epoch 1
2018-03-19 11:50:46,085 - htmd.adaptive.adaptive - INFO - Retrieving simulations.
2018-03-19 11:50:46,086 - htmd.adaptive.adaptive - INFO - 0 simulations in progress
2018-03-19 11:50:46,087 - htmd.adaptive.adaptiverun - INFO - Postprocessing new data
Creating simlist: 100%|██████████| 3/3 [00:00<00:00, 213.98it/s]
Filtering trajectories: 100%|██████████| 3/3 [00:00<00:00, 12.49it/s]
Projecting trajectories: 100%|██████████| 3/3 [00:00<00:00, 14.29it/s]
2018-03-19 11:50:49,429 - htmd.projections.metric - INFO - Frame step 0.10000000149011612ns was read from the trajectories. If it looks wrong, redefine it by manually setting the MetricData.fstep property.
2018-03-19 11:50:49,431 - htmd.metricdata - INFO - Dropped 0 trajectories from 3 resulting in 3
/home/joao/maindisk/SANDBOX/miniconda3/miniconda3/lib/python3.6/site-packages/pyemma/__init__.py:91: UserWarning: You are not using the latest release of PyEMMA. Latest is 2.5.1, you have 2.4.
  .format(latest=latest, current=current), category=UserWarning)
2018-03-19 11:50:50,288 - htmd.metricdata - INFO - Dropped 0 trajectories from 3 resulting in 3
19-03-18 11:50:50 pyemma.msm.estimators.implied_timescales.ImpliedTimescales[2] WARNING  Changed user setting nits to the number of available timescales nits=7
2018-03-19 11:50:50,301 - pyemma.msm.estimators.implied_timescales.ImpliedTimescales[2] - WARNING - Changed user setting nits to the number of available timescales nits=7
2018-03-19 11:50:50,636 - htmd.model - WARNING - PCCA returned empty macrostates. Reducing the number of macrostates to 3.
2018-03-19 11:50:50,678 - htmd.model - WARNING - PCCA returned empty macrostates. Reducing the number of macrostates to 2.
2018-03-19 11:50:50,688 - htmd.model - INFO - 93.3% of the data was used
2018-03-19 11:50:50,690 - htmd.model - INFO - Number of trajectories that visited each macrostate:
2018-03-19 11:50:50,691 - htmd.model - INFO - [3 3]
2018-03-19 11:50:50,692 - htmd.model - INFO - Take care! Macro 0 has been visited only in 3 trajectories:
2018-03-19 11:50:50,693 - htmd.model - INFO -
simid = 0
parent = 0
input = None
trajectory = ['filtered/e1s1_ntl9_1ns_0/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [None]

2018-03-19 11:50:50,694 - htmd.model - INFO -
simid = 1
parent = 1
input = None
trajectory = ['filtered/e1s2_ntl9_1ns_1/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [None]

2018-03-19 11:50:50,694 - htmd.model - INFO -
simid = 2
parent = 2
input = None
trajectory = ['filtered/e1s3_ntl9_1ns_2/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [None]

2018-03-19 11:50:50,695 - htmd.model - INFO - Take care! Macro 1 has been visited only in 3 trajectories:
2018-03-19 11:50:50,696 - htmd.model - INFO -
simid = 0
parent = 0
input = None
trajectory = ['filtered/e1s1_ntl9_1ns_0/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [None]

2018-03-19 11:50:50,697 - htmd.model - INFO -
simid = 1
parent = 1
input = None
trajectory = ['filtered/e1s2_ntl9_1ns_1/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [None]

2018-03-19 11:50:50,697 - htmd.model - INFO -
simid = 2
parent = 2
input = None
trajectory = ['filtered/e1s3_ntl9_1ns_2/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [None]

Writing inputs: 100%|██████████| 3/3 [00:00<00:00, 13.04it/s]
2018-03-19 11:50:51,053 - htmd.queues.localqueue - INFO - Queueing /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivemd/input/e2s1_e1s3p0f8
2018-03-19 11:50:51,055 - htmd.queues.simqueue - INFO - Removed existing htmd.queues.done sentinel from input/e2s1_e1s3p0f8
2018-03-19 11:50:51,056 - htmd.queues.localqueue - INFO - Running /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivemd/input/e2s1_e1s3p0f8 on device 3
2018-03-19 11:50:51,056 - htmd.queues.localqueue - INFO - Queueing /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivemd/input/e2s2_e1s1p0f5
2018-03-19 11:50:51,069 - htmd.queues.simqueue - INFO - Removed existing htmd.queues.done sentinel from input/e2s2_e1s1p0f5
2018-03-19 11:50:51,073 - htmd.queues.localqueue - INFO - Running /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivemd/input/e2s2_e1s1p0f5 on device 1
2018-03-19 11:50:51,073 - htmd.queues.localqueue - INFO - Queueing /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivemd/input/e2s3_e1s1p0f7
2018-03-19 11:50:51,076 - htmd.queues.simqueue - INFO - Removed existing htmd.queues.done sentinel from input/e2s3_e1s1p0f7
2018-03-19 11:50:51,077 - htmd.adaptive.adaptive - INFO - Finished submitting simulations.
2018-03-19 11:50:51,078 - htmd.queues.localqueue - INFO - Running /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivemd/input/e2s3_e1s1p0f7 on device 0
2018-03-19 11:50:51,088 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.
2018-03-19 11:52:51,201 - htmd.adaptive.adaptive - INFO - Processing epoch 2
2018-03-19 11:52:51,202 - htmd.adaptive.adaptive - INFO - Retrieving simulations.
2018-03-19 11:52:51,203 - htmd.adaptive.adaptive - INFO - 3 simulations in progress
2018-03-19 11:52:51,204 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.
2018-03-19 11:53:18,276 - htmd.queues.localqueue - INFO - Completed /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivemd/input/e2s1_e1s3p0f8
2018-03-19 11:53:19,409 - htmd.queues.localqueue - INFO - Completed /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivemd/input/e2s2_e1s1p0f5
2018-03-19 11:53:26,516 - htmd.queues.localqueue - INFO - Completed /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivemd/input/e2s3_e1s1p0f7
2018-03-19 11:54:51,265 - htmd.adaptive.adaptive - INFO - Processing epoch 2
2018-03-19 11:54:51,266 - htmd.adaptive.adaptive - INFO - Retrieving simulations.
2018-03-19 11:54:51,267 - htmd.adaptive.adaptive - INFO - 0 simulations in progress
2018-03-19 11:54:51,268 - htmd.adaptive.adaptiverun - INFO - Postprocessing new data
Creating simlist: 100%|██████████| 6/6 [00:00<00:00, 377.46it/s]
Filtering trajectories: 100%|██████████| 6/6 [00:00<00:00, 25.10it/s]
Projecting trajectories: 100%|██████████| 6/6 [00:00<00:00, 20.19it/s]
2018-03-19 11:54:53,874 - htmd.projections.metric - INFO - Frame step 0.10000000149011612ns was read from the trajectories. If it looks wrong, redefine it by manually setting the MetricData.fstep property.
2018-03-19 11:54:53,876 - htmd.metricdata - INFO - Dropped 0 trajectories from 6 resulting in 6
2018-03-19 11:54:54,633 - htmd.metricdata - INFO - Dropped 0 trajectories from 6 resulting in 6
2018-03-19 11:54:56,750 - htmd.model - INFO - 100.0% of the data was used
2018-03-19 11:54:56,753 - htmd.model - INFO - Number of trajectories that visited each macrostate:
2018-03-19 11:54:56,753 - htmd.model - INFO - [1 1 1 2 3 4 6 6]
2018-03-19 11:54:56,755 - htmd.model - INFO - Take care! Macro 0 has been visited only in 1 trajectories:
2018-03-19 11:54:56,755 - htmd.model - INFO -
simid = 0
parent = 0
input = None
trajectory = ['filtered/e1s1_ntl9_1ns_0/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [10]

2018-03-19 11:54:56,756 - htmd.model - INFO - Take care! Macro 1 has been visited only in 1 trajectories:
2018-03-19 11:54:56,757 - htmd.model - INFO -
simid = 0
parent = 0
input = None
trajectory = ['filtered/e1s1_ntl9_1ns_0/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [10]

2018-03-19 11:54:56,763 - htmd.model - INFO - Take care! Macro 2 has been visited only in 1 trajectories:
2018-03-19 11:54:56,765 - htmd.model - INFO -
simid = 0
parent = 0
input = None
trajectory = ['filtered/e1s1_ntl9_1ns_0/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [10]

2018-03-19 11:54:56,767 - htmd.model - INFO - Take care! Macro 3 has been visited only in 2 trajectories:
2018-03-19 11:54:56,768 - htmd.model - INFO -
simid = 2
parent = 2
input = None
trajectory = ['filtered/e1s3_ntl9_1ns_2/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [10]

2018-03-19 11:54:56,768 - htmd.model - INFO -
simid = 4
parent = 4
input = None
trajectory = ['filtered/e2s2_e1s1p0f5/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [None]

2018-03-19 11:54:56,769 - htmd.model - INFO - Take care! Macro 4 has been visited only in 3 trajectories:
2018-03-19 11:54:56,770 - htmd.model - INFO -
simid = 0
parent = 0
input = None
trajectory = ['filtered/e1s1_ntl9_1ns_0/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [10]

2018-03-19 11:54:56,772 - htmd.model - INFO -
simid = 3
parent = 3
input = None
trajectory = ['filtered/e2s1_e1s3p0f8/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [None]

2018-03-19 11:54:56,775 - htmd.model - INFO -
simid = 4
parent = 4
input = None
trajectory = ['filtered/e2s2_e1s1p0f5/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [None]

Writing inputs: 100%|██████████| 3/3 [00:00<00:00, 10.62it/s]
2018-03-19 11:54:57,196 - htmd.queues.localqueue - INFO - Queueing /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivemd/input/e3s1_e1s1p0f3
2018-03-19 11:54:57,197 - htmd.queues.simqueue - INFO - Removed existing htmd.queues.done sentinel from input/e3s1_e1s1p0f3
2018-03-19 11:54:57,198 - htmd.queues.localqueue - INFO - Queueing /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivemd/input/e3s2_e1s1p0f3
2018-03-19 11:54:57,198 - htmd.queues.localqueue - INFO - Running /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivemd/input/e3s1_e1s1p0f3 on device 3
2018-03-19 11:54:57,200 - htmd.queues.simqueue - INFO - Removed existing htmd.queues.done sentinel from input/e3s2_e1s1p0f3
2018-03-19 11:54:57,203 - htmd.queues.localqueue - INFO - Running /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivemd/input/e3s2_e1s1p0f3 on device 1
2018-03-19 11:54:57,204 - htmd.queues.localqueue - INFO - Queueing /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivemd/input/e3s3_e1s1p0f8
2018-03-19 11:54:57,217 - htmd.queues.simqueue - INFO - Removed existing htmd.queues.done sentinel from input/e3s3_e1s1p0f8
2018-03-19 11:54:57,229 - htmd.adaptive.adaptive - INFO - Finished submitting simulations.
2018-03-19 11:54:57,230 - htmd.queues.localqueue - INFO - Running /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivemd/input/e3s3_e1s1p0f8 on device 0
2018-03-19 11:54:57,231 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.
2018-03-19 11:56:57,260 - htmd.adaptive.adaptive - INFO - Processing epoch 3
2018-03-19 11:56:57,262 - htmd.adaptive.adaptive - INFO - Retrieving simulations.
2018-03-19 11:56:57,263 - htmd.adaptive.adaptive - INFO - 3 simulations in progress
2018-03-19 11:56:57,264 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.
2018-03-19 11:57:24,990 - htmd.queues.localqueue - INFO - Completed /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivemd/input/e3s1_e1s1p0f3
2018-03-19 11:57:25,265 - htmd.queues.localqueue - INFO - Completed /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivemd/input/e3s2_e1s1p0f3
2018-03-19 11:57:32,962 - htmd.queues.localqueue - INFO - Completed /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivemd/input/e3s3_e1s1p0f8
2018-03-19 11:58:57,356 - htmd.adaptive.adaptive - INFO - Processing epoch 3
2018-03-19 11:58:57,358 - htmd.adaptive.adaptive - INFO - Retrieving simulations.
2018-03-19 11:58:57,359 - htmd.adaptive.adaptive - INFO - 0 simulations in progress
2018-03-19 11:58:57,360 - htmd.adaptive.adaptive - INFO - Reached maximum number of epochs 3
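
The simulation names in the log encode where each simulation came from. The sketch below parses them under an assumed reading of the pattern: `e<epoch>s<index>_` followed either by a generator name (first epoch) or by `<parent>p<piece>f<frame>` (respawned simulations). The function `parse_sim_name` and the field names are hypothetical, and the interpretation of `p` and `f` as trajectory piece and frame is an assumption.

```python
import re

# e<epoch>s<index>_<rest>
_NAME = re.compile(r"^e(?P<epoch>\d+)s(?P<sim>\d+)_(?P<rest>.+)$")
# <parent>p<piece>f<frame>, where the parent is itself e<epoch>s<index>
_SPAWN = re.compile(r"^(?P<parent>e\d+s\d+)p(?P<piece>\d+)f(?P<frame>\d+)$")


def parse_sim_name(name):
    """Split a simulation folder name into its components."""
    m = _NAME.match(name)
    if m is None:
        raise ValueError(f"unrecognised simulation name: {name!r}")
    info = {"epoch": int(m["epoch"]), "sim": int(m["sim"])}
    spawn = _SPAWN.match(m["rest"])
    if spawn is None:
        # First-epoch simulations carry the generator folder name.
        info["generator"] = m["rest"]
    else:
        info["parent"] = spawn["parent"]
        info["piece"] = int(spawn["piece"])
        info["frame"] = int(spawn["frame"])
    return info


print(parse_sim_name("e1s1_ntl9_1ns_0"))
print(parse_sim_name("e2s1_e1s3p0f8"))
```

Under this reading, `e2s1_e1s3p0f8` is the first simulation of epoch 2, started from frame 8 of simulation `e1s3`.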

AdaptiveGoal#

Now let’s change to the adaptivegoal directory and work there instead:

os.chdir('../adaptivegoal')
  • Most of the class arguments are identical to those of AdaptiveMD

adg = AdaptiveGoal()
adg.app = queue
adg.nmin = 1
adg.nmax = 3
adg.nepochs = 2
adg.generatorspath = './generators'
adg.projection = MetricSelfDistance('protein and name CA')
adg.updateperiod = 120  # execute every 2 minutes
adg.goalfunction = None  # set to None just as an example
  • In addition, it requires the goalfunction argument, which defines the goal

  • We can define a variety of different goal functions

The goal function#

The goal function will:

  • take as input a Molecule object of a simulation and

  • produce as output a score for each frame of that simulation.

The higher the score, the more desirable that simulation frame is for being respawned.
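
The contract described above (one score per frame) can be checked before a long run. This is a minimal sketch, assuming the goal function returns a flat sequence of numbers: `checked_goal` is a hypothetical helper, and `n_frames` is a stand-in callable that reports how many frames the input holds (for an HTMD Molecule, something like `lambda mol: mol.numFrames` should play that role).

```python
import math


def checked_goal(goal, n_frames):
    """Wrap `goal` so that a malformed score vector fails early."""
    def wrapper(sim):
        scores = [float(s) for s in goal(sim)]
        expected = n_frames(sim)
        if len(scores) != expected:
            raise ValueError(f"expected {expected} scores, got {len(scores)}")
        if not all(math.isfinite(s) for s in scores):
            raise ValueError("goal function returned a non-finite score")
        return scores
    return wrapper


# Toy stand-in: here a "simulation" is just a list of per-frame values,
# and the goal negates them so that smaller values score higher.
toy_goal = checked_goal(lambda sim: [-v for v in sim], n_frames=len)
print(toy_goal([4.2, 1.5, 0.8]))
```

The wrapper only validates the shape and finiteness of the scores; it does not change how they are used.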

RMSD goal function#

For this goal function, we will use a crystal structure of NTL9.

Download the structure into the adaptivegoal directory, for example with wget:

assert os.system('wget -q http://pub.htmd.org/tutorials/adaptive-sampling/ntl9_crystal.pdb') == 0

We can define a simple goal function that uses the RMSD between the conformation sampled and a reference (in this case, the crystal structure), and returns a score to be evaluated by the AdaptiveGoal algorithm:

ref = Molecule('./ntl9_crystal.pdb')

def mygoalfunction(mol):
    rmsd = MetricRmsd(ref, 'protein and name CA').project(mol)
    return -rmsd  # or even 1/rmsd

adg.goalfunction = mygoalfunction

AdaptiveGoal ranks conformations from high to low score. Since we want a lower RMSD to give a higher score, the goal function returns the negative of the RMSD (the reciprocal, 1/rmsd, would also work).
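
To see why either transform works for ranking, the sketch below orders a few frames by both scores. The RMSD values are made up for illustration (they are not tutorial data), and `rank_frames` is a hypothetical helper, not an HTMD function.

```python
def rank_frames(scores):
    """Frame indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Made-up RMSD values for four frames; all strictly positive.
rmsd = [4.2, 1.5, 3.0, 0.8]

by_negation = rank_frames([-r for r in rmsd])
by_reciprocal = rank_frames([1.0 / r for r in rmsd])

print(by_negation)
print(by_reciprocal)
```

For strictly positive RMSD values both transforms give the same order. They differ in magnitude, though: the reciprocal grows without bound as the RMSD approaches zero, which may matter if the score is combined with other terms.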

Launch the AdaptiveGoal run:

adg.run()
2018-03-19 12:00:20,987 - htmd.adaptive.adaptive - INFO - Processing epoch 0
2018-03-19 12:00:20,989 - htmd.adaptive.adaptive - INFO - Epoch 0, generating first batch
2018-03-19 12:00:21,013 - htmd.queues.localqueue - INFO - Queueing /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e1s1_ntl9_1ns_0
2018-03-19 12:00:21,014 - htmd.queues.localqueue - INFO - Running /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e1s1_ntl9_1ns_0 on device 2
2018-03-19 12:00:21,015 - htmd.queues.localqueue - INFO - Queueing /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e1s2_ntl9_1ns_1
2018-03-19 12:00:21,018 - htmd.queues.localqueue - INFO - Running /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e1s2_ntl9_1ns_1 on device 0
2018-03-19 12:00:21,018 - htmd.queues.localqueue - INFO - Queueing /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e1s3_ntl9_1ns_2
2018-03-19 12:00:21,046 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.
2018-03-19 12:00:21,047 - htmd.queues.localqueue - INFO - Running /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e1s3_ntl9_1ns_2 on device 1
2018-03-19 12:02:21,149 - htmd.adaptive.adaptive - INFO - Processing epoch 1
2018-03-19 12:02:21,151 - htmd.adaptive.adaptive - INFO - Retrieving simulations.
2018-03-19 12:02:21,152 - htmd.adaptive.adaptive - INFO - 3 simulations in progress
2018-03-19 12:02:21,153 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.
2018-03-19 12:02:48,206 - htmd.queues.localqueue - INFO - Completed /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e1s1_ntl9_1ns_0
2018-03-19 12:02:49,199 - htmd.queues.localqueue - INFO - Completed /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e1s3_ntl9_1ns_2
2018-03-19 12:02:54,666 - htmd.queues.localqueue - INFO - Completed /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e1s2_ntl9_1ns_1
2018-03-19 12:04:21,221 - htmd.adaptive.adaptive - INFO - Processing epoch 1
2018-03-19 12:04:21,222 - htmd.adaptive.adaptive - INFO - Retrieving simulations.
2018-03-19 12:04:21,223 - htmd.adaptive.adaptive - INFO - 0 simulations in progress
2018-03-19 12:04:21,224 - htmd.adaptive.adaptiverun - INFO - Postprocessing new data
Creating simlist: 100%|██████████| 3/3 [00:00<00:00, 323.03it/s]
Filtering trajectories: 100%|██████████| 3/3 [00:00<00:00, 11.11it/s]
Projecting trajectories: 100%|██████████| 3/3 [00:00<00:00, 12.61it/s]
2018-03-19 12:04:24,491 - htmd.projections.metric - INFO - Frame step 0.10000000149011612ns was read from the trajectories. If it looks wrong, redefine it by manually setting the MetricData.fstep property.
2018-03-19 12:04:24,492 - htmd.metricdata - INFO - Dropped 0 trajectories from 3 resulting in 3
2018-03-19 12:04:25,262 - htmd.metricdata - INFO - Dropped 0 trajectories from 3 resulting in 3
Projecting trajectories: 100%|██████████| 3/3 [00:00<00:00, 11.17it/s]
2018-03-19 12:04:26,086 - htmd.projections.metric - INFO - Frame step 0.10000000149011612ns was read from the trajectories. If it looks wrong, redefine it by manually setting the MetricData.fstep property.
2018-03-19 12:04:26,294 - htmd.model - INFO - 100.0% of the data was used
2018-03-19 12:04:26,296 - htmd.model - INFO - Number of trajectories that visited each macrostate:
2018-03-19 12:04:26,297 - htmd.model - INFO - [2 3 2]
2018-03-19 12:04:26,299 - htmd.model - INFO - Take care! Macro 0 has been visited only in 2 trajectories:
2018-03-19 12:04:26,300 - htmd.model - INFO -
simid = 0
parent = 0
input = None
trajectory = ['filtered/e1s1_ntl9_1ns_0/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [None]

2018-03-19 12:04:26,300 - htmd.model - INFO -
simid = 2
parent = 2
input = None
trajectory = ['filtered/e1s3_ntl9_1ns_2/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [None]

2018-03-19 12:04:26,301 - htmd.model - INFO - Take care! Macro 1 has been visited only in 3 trajectories:
2018-03-19 12:04:26,302 - htmd.model - INFO -
simid = 0
parent = 0
input = None
trajectory = ['filtered/e1s1_ntl9_1ns_0/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [None]

2018-03-19 12:04:26,303 - htmd.model - INFO -
simid = 1
parent = 1
input = None
trajectory = ['filtered/e1s2_ntl9_1ns_1/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [None]

2018-03-19 12:04:26,304 - htmd.model - INFO -
simid = 2
parent = 2
input = None
trajectory = ['filtered/e1s3_ntl9_1ns_2/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [None]

2018-03-19 12:04:26,305 - htmd.model - INFO - Take care! Macro 2 has been visited only in 2 trajectories:
2018-03-19 12:04:26,305 - htmd.model - INFO -
simid = 0
parent = 0
input = None
trajectory = ['filtered/e1s1_ntl9_1ns_0/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [None]

2018-03-19 12:04:26,306 - htmd.model - INFO -
simid = 2
parent = 2
input = None
trajectory = ['filtered/e1s3_ntl9_1ns_2/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [None]

Writing inputs: 100%|██████████| 3/3 [00:00<00:00, 12.53it/s]
2018-03-19 12:04:26,672 - htmd.queues.localqueue - INFO - Queueing /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e2s1_e1s2p0f5
2018-03-19 12:04:26,674 - htmd.queues.simqueue - INFO - Removed existing htmd.queues.done sentinel from input/e2s1_e1s2p0f5
2018-03-19 12:04:26,675 - htmd.queues.localqueue - INFO - Running /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e2s1_e1s2p0f5 on device 0
2018-03-19 12:04:26,675 - htmd.queues.localqueue - INFO - Queueing /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e2s2_e1s2p0f8
2018-03-19 12:04:26,679 - htmd.queues.simqueue - INFO - Removed existing htmd.queues.done sentinel from input/e2s2_e1s2p0f8
2018-03-19 12:04:26,691 - htmd.queues.localqueue - INFO - Queueing /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e2s3_e1s2p0f8
2018-03-19 12:04:26,692 - htmd.queues.localqueue - INFO - Running /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e2s2_e1s2p0f8 on device 3
2018-03-19 12:04:26,695 - htmd.queues.simqueue - INFO - Removed existing htmd.queues.done sentinel from input/e2s3_e1s2p0f8
2018-03-19 12:04:26,697 - htmd.adaptive.adaptive - INFO - Finished submitting simulations.
2018-03-19 12:04:26,697 - htmd.queues.localqueue - INFO - Running /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e2s3_e1s2p0f8 on device 1
2018-03-19 12:04:26,698 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.
2018-03-19 12:06:26,789 - htmd.adaptive.adaptive - INFO - Processing epoch 2
2018-03-19 12:06:26,790 - htmd.adaptive.adaptive - INFO - Retrieving simulations.
2018-03-19 12:06:26,791 - htmd.adaptive.adaptive - INFO - 3 simulations in progress
2018-03-19 12:06:26,792 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.
2018-03-19 12:06:52,643 - htmd.queues.localqueue - INFO - Completed /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e2s2_e1s2p0f8
2018-03-19 12:06:56,571 - htmd.queues.localqueue - INFO - Completed /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e2s3_e1s2p0f8
2018-03-19 12:07:01,804 - htmd.queues.localqueue - INFO - Completed /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e2s1_e1s2p0f5
2018-03-19 12:08:26,890 - htmd.adaptive.adaptive - INFO - Processing epoch 2
2018-03-19 12:08:26,891 - htmd.adaptive.adaptive - INFO - Retrieving simulations.
2018-03-19 12:08:26,893 - htmd.adaptive.adaptive - INFO - 0 simulations in progress
2018-03-19 12:08:26,893 - htmd.adaptive.adaptive - INFO - Reached maximum number of epochs 2

Functions with multiple arguments#

The goal function can also take multiple arguments. This allows flexibility and on-the-fly comparisons to non-static conformations (i.e. compare with different references as the run progresses). Here, we redefine the previous goal function with multiple arguments:

def newgoalfunction(mol, crystal):
    rmsd = MetricRmsd(crystal, 'protein and name CA').project(mol)
    return -rmsd  # or even 1/rmsd
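
A goal function with extra arguments has to be stored together with those arguments. One way to picture this, as a sketch rather than HTMD's actual dispatch code, is a `(function, extra_args)` tuple whose function is called with the simulation first and the extra arguments after it. The names `call_goal` and `distance_to` are hypothetical.

```python
from functools import partial


def call_goal(goal, mol):
    """Call a goal given either as `func` or as `(func, extra_args)`."""
    if isinstance(goal, tuple):
        func, extra = goal
        return func(mol, *extra)
    return goal(mol)


def distance_to(value, target):
    """Toy goal: higher (less negative) when `value` is closer to `target`."""
    return -abs(value - target)


# The tuple form and functools.partial bind the extra argument equivalently.
as_tuple = call_goal((distance_to, (10,)), 7)
as_partial = call_goal(partial(distance_to, target=10), 7)
print(as_tuple, as_partial)
```

Keeping the extra arguments in a tuple makes it possible to swap them between runs, for instance to compare against a different reference, without redefining the function.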

Now we clean the previous AdaptiveGoal run, and start a new one with the new goal function:

# clean previous run
shutil.rmtree('./input')
shutil.rmtree('./data')
shutil.rmtree('./filtered')

# run with new goal
ref = Molecule('./ntl9_crystal.pdb')
adg.goalfunction = (newgoalfunction, (ref,))
adg.run()
2018-03-19 12:08:41,639 - htmd.adaptive.adaptive - INFO - Processing epoch 0
2018-03-19 12:08:41,640 - htmd.adaptive.adaptive - INFO - Epoch 0, generating first batch
2018-03-19 12:08:41,659 - htmd.queues.localqueue - INFO - Queueing /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e1s1_ntl9_1ns_0
2018-03-19 12:08:41,660 - htmd.queues.localqueue - INFO - Queueing /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e1s2_ntl9_1ns_1
2018-03-19 12:08:41,660 - htmd.queues.localqueue - INFO - Running /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e1s1_ntl9_1ns_0 on device 3
2018-03-19 12:08:41,662 - htmd.queues.localqueue - INFO - Running /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e1s2_ntl9_1ns_1 on device 0
2018-03-19 12:08:41,662 - htmd.queues.localqueue - INFO - Queueing /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e1s3_ntl9_1ns_2
2018-03-19 12:08:41,665 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.
2018-03-19 12:08:41,665 - htmd.queues.localqueue - INFO - Running /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e1s3_ntl9_1ns_2 on device 2
2018-03-19 12:10:41,714 - htmd.adaptive.adaptive - INFO - Processing epoch 1
2018-03-19 12:10:41,715 - htmd.adaptive.adaptive - INFO - Retrieving simulations.
2018-03-19 12:10:41,716 - htmd.adaptive.adaptive - INFO - 3 simulations in progress
2018-03-19 12:10:41,717 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.
2018-03-19 12:11:08,429 - htmd.queues.localqueue - INFO - Completed /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e1s1_ntl9_1ns_0
2018-03-19 12:11:08,628 - htmd.queues.localqueue - INFO - Completed /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e1s3_ntl9_1ns_2
2018-03-19 12:11:16,949 - htmd.queues.localqueue - INFO - Completed /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e1s2_ntl9_1ns_1
2018-03-19 12:12:41,790 - htmd.adaptive.adaptive - INFO - Processing epoch 1
2018-03-19 12:12:41,792 - htmd.adaptive.adaptive - INFO - Retrieving simulations.
2018-03-19 12:12:41,793 - htmd.adaptive.adaptive - INFO - 0 simulations in progress
2018-03-19 12:12:41,794 - htmd.adaptive.adaptiverun - INFO - Postprocessing new data
Creating simlist: 100%|██████████| 3/3 [00:00<00:00, 275.92it/s]
Filtering trajectories: 100%|██████████| 3/3 [00:00<00:00, 12.21it/s]
Projecting trajectories: 100%|██████████| 3/3 [00:00<00:00, 13.00it/s]
2018-03-19 12:12:45,002 - htmd.projections.metric - INFO - Frame step 0.10000000149011612ns was read from the trajectories. If it looks wrong, redefine it by manually setting the MetricData.fstep property.
2018-03-19 12:12:45,003 - htmd.metricdata - INFO - Dropped 0 trajectories from 3 resulting in 3
2018-03-19 12:12:45,684 - htmd.metricdata - INFO - Dropped 0 trajectories from 3 resulting in 3
Projecting trajectories: 100%|██████████| 3/3 [00:00<00:00, 12.06it/s]
2018-03-19 12:12:46,436 - htmd.projections.metric - INFO - Frame step 0.10000000149011612ns was read from the trajectories. If it looks wrong, redefine it by manually setting the MetricData.fstep property.
2018-03-19 12:12:46,669 - htmd.model - INFO - 96.7% of the data was used
2018-03-19 12:12:46,670 - htmd.model - INFO - Number of trajectories that visited each macrostate:
2018-03-19 12:12:46,671 - htmd.model - INFO - [1 1 3 3]
2018-03-19 12:12:46,673 - htmd.model - INFO - Take care! Macro 0 has been visited only in 1 trajectories:
2018-03-19 12:12:46,673 - htmd.model - INFO -
simid = 1
parent = 1
input = None
trajectory = ['filtered/e1s2_ntl9_1ns_1/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [None]

2018-03-19 12:12:46,674 - htmd.model - INFO - Take care! Macro 1 has been visited only in 1 trajectories:
2018-03-19 12:12:46,675 - htmd.model - INFO -
simid = 1
parent = 1
input = None
trajectory = ['filtered/e1s2_ntl9_1ns_1/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [None]

2018-03-19 12:12:46,676 - htmd.model - INFO - Take care! Macro 2 has been visited only in 3 trajectories:
2018-03-19 12:12:46,676 - htmd.model - INFO -
simid = 0
parent = 0
input = None
trajectory = ['filtered/e1s1_ntl9_1ns_0/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [None]

2018-03-19 12:12:46,677 - htmd.model - INFO -
simid = 1
parent = 1
input = None
trajectory = ['filtered/e1s2_ntl9_1ns_1/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [None]

2018-03-19 12:12:46,678 - htmd.model - INFO -
simid = 2
parent = 2
input = None
trajectory = ['filtered/e1s3_ntl9_1ns_2/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [None]

2018-03-19 12:12:46,679 - htmd.model - INFO - Take care! Macro 3 has been visited only in 3 trajectories:
2018-03-19 12:12:46,680 - htmd.model - INFO -
simid = 0
parent = 0
input = None
trajectory = ['filtered/e1s1_ntl9_1ns_0/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [None]

2018-03-19 12:12:46,681 - htmd.model - INFO -
simid = 1
parent = 1
input = None
trajectory = ['filtered/e1s2_ntl9_1ns_1/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [None]

2018-03-19 12:12:46,681 - htmd.model - INFO -
simid = 2
parent = 2
input = None
trajectory = ['filtered/e1s3_ntl9_1ns_2/output.filtered.xtc']
molfile = filtered/filtered.pdb
numframes = [None]

Writing inputs: 100%|██████████| 3/3 [00:00<00:00, 11.62it/s]
2018-03-19 12:12:47,073 - htmd.queues.localqueue - INFO - Queueing /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e2s1_e1s3p0f9
2018-03-19 12:12:47,075 - htmd.queues.simqueue - INFO - Removed existing htmd.queues.done sentinel from input/e2s1_e1s3p0f9
2018-03-19 12:12:47,076 - htmd.queues.localqueue - INFO - Running /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e2s1_e1s3p0f9 on device 0
2018-03-19 12:12:47,076 - htmd.queues.localqueue - INFO - Queueing /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e2s2_e1s1p0f5
2018-03-19 12:12:47,079 - htmd.queues.simqueue - INFO - Removed existing htmd.queues.done sentinel from input/e2s2_e1s1p0f5
2018-03-19 12:12:47,091 - htmd.queues.localqueue - INFO - Running /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e2s2_e1s1p0f5 on device 3
2018-03-19 12:12:47,092 - htmd.queues.localqueue - INFO - Queueing /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e2s3_e1s2p0f1
2018-03-19 12:12:47,097 - htmd.queues.simqueue - INFO - Removed existing htmd.queues.done sentinel from input/e2s3_e1s2p0f1
2018-03-19 12:12:47,109 - htmd.adaptive.adaptive - INFO - Finished submitting simulations.
2018-03-19 12:12:47,110 - htmd.queues.localqueue - INFO - Running /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e2s3_e1s2p0f1 on device 1
2018-03-19 12:12:47,111 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.
2018-03-19 12:14:47,185 - htmd.adaptive.adaptive - INFO - Processing epoch 2
2018-03-19 12:14:47,187 - htmd.adaptive.adaptive - INFO - Retrieving simulations.
2018-03-19 12:14:47,188 - htmd.adaptive.adaptive - INFO - 3 simulations in progress
2018-03-19 12:14:47,189 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.
2018-03-19 12:15:13,720 - htmd.queues.localqueue - INFO - Completed /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e2s2_e1s1p0f5
2018-03-19 12:15:14,962 - htmd.queues.localqueue - INFO - Completed /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e2s3_e1s2p0f1
2018-03-19 12:15:22,419 - htmd.queues.localqueue - INFO - Completed /data/joao/maindisk/software/repos/Acellera/htmd/tutorials/adaptivegoal/input/e2s1_e1s3p0f9
2018-03-19 12:16:47,290 - htmd.adaptive.adaptive - INFO - Processing epoch 2
2018-03-19 12:16:47,292 - htmd.adaptive.adaptive - INFO - Retrieving simulations.
2018-03-19 12:16:47,293 - htmd.adaptive.adaptive - INFO - 0 simulations in progress
2018-03-19 12:16:47,294 - htmd.adaptive.adaptive - INFO - Reached maximum number of epochs 2

Other goal function examples#

HTMD includes two other goal functions: the secondary structure goal function and the contacts goal function.

Secondary structure goal function#

ref = Molecule('./ntl9_crystal.pdb')

def ssGoal(mol, crystal):
    crystalSS = MetricSecondaryStructure().project(crystal)[0]
    proj = MetricSecondaryStructure().project(mol)
    # How many crystal SS match with simulation SS
    ss_score = np.sum(proj == crystalSS, axis=1) / proj.shape[1]
    return ss_score

adg.goalfunction = (ssGoal, (ref,))
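
To make the scoring step concrete, here is a minimal NumPy-only sketch of the same arithmetic. The arrays are made-up stand-ins for the output of MetricSecondaryStructure (one row per frame, one column per residue); the integer codes and variable names are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical secondary-structure codes for the crystal (one value per residue)
crystal_ss = np.array([0, 1, 1, 2, 2])

# Hypothetical projection of a 3-frame trajectory (frames x residues)
proj = np.array([
    [0, 1, 1, 2, 2],   # identical to the crystal
    [0, 0, 1, 2, 0],   # three of five residues match
    [1, 0, 0, 0, 0],   # no residue matches
])

# Fraction of residues per frame whose code equals the crystal code
ss_score = np.sum(proj == crystal_ss, axis=1) / proj.shape[1]
print(ss_score)
```

Each frame gets a score between 0 and 1, where 1 means every residue has the same secondary structure assignment as the crystal.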

Contacts goal function#

ref = Molecule('./ntl9_crystal.pdb')

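def contactGoal(mol, crystal):
    crystalCO = MetricSelfDistance('protein and name CA', pbc=False,
                                   metric='contacts',
                                   threshold=10).project(crystal)
    proj = MetricSelfDistance('protein and name CA',
                              metric='contacts',
                              threshold=10).project(mol)
    # Flatten the reference projection into a 1D boolean mask over contact
    # pairs (safe whether project() returns one row or a flat array)
    crystalCO = np.ravel(crystalCO).astype(bool)
    # Fraction of crystal contacts that are present in each simulation frame.
    # True division is used instead of an in-place "/=" on an integer array.
    co_score = np.sum(proj[:, crystalCO] == 1, axis=1) / np.sum(crystalCO)
    return co_score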

adg.goalfunction = (contactGoal, (ref,))
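
The contact score can be checked in isolation with a small NumPy-only sketch. The boolean arrays below are invented stand-ins for contact-map projections (one column per atom pair); they are not output from MetricSelfDistance, and the variable names are hypothetical.

```python
import numpy as np

# Hypothetical crystal contact map: True where a pair is in contact
crystal_contacts = np.array([True, False, True, True, False])

# Hypothetical projection of a 3-frame trajectory (frames x atom pairs)
proj = np.array([
    [True,  False, True,  True,  False],  # all 3 crystal contacts formed
    [True,  True,  False, False, False],  # 1 of 3 crystal contacts formed
    [False, True,  False, False, True],   # only non-crystal contacts formed
])

# Keep only the columns that are contacts in the crystal, then count per frame
formed = np.sum(proj[:, crystal_contacts] == 1, axis=1)

# Normalize by the number of crystal contacts to get a fraction per frame
co_score = formed / np.sum(crystal_contacts)
print(co_score)
```

Contacts that appear in the simulation but not in the crystal do not affect the score, because those columns are masked out before counting.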

Many more goal functions can be devised.
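
As one illustration, two per-frame scores could be blended into a single value. The sketch below is NumPy-only and uses a hypothetical helper, combine_scores, which is not part of HTMD; it assumes both inputs are per-frame scores in the range 0 to 1 where larger values mean closer to the reference, as with the two scores above.

```python
import numpy as np

def combine_scores(score_a, score_b, weight=0.5):
    """Hypothetical helper: weighted average of two per-frame scores."""
    score_a = np.asarray(score_a, dtype=float)
    score_b = np.asarray(score_b, dtype=float)
    if score_a.shape != score_b.shape:
        raise ValueError("Both scores must have one value per frame")
    return weight * score_a + (1.0 - weight) * score_b

# Made-up per-frame scores standing in for two different goal functions
combined = combine_scores([1.0, 0.6, 0.0], [1.0, 0.2, 0.0], weight=0.5)
print(combined)
```

Inside a real goal function, the two inputs would come from the projections of the simulation frames, and the weight would be a modelling choice to tune for the system at hand.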