Adaptive Sampling entire pipeline tutorial#

by Stefan Doerr


This tutorial aims to show how different PlayMolecule apps can be used in conjunction to create a pipeline. In this specific example we would like to set up and run an adaptive sampling simulation, starting from a protein and a ligand.

You can download the following files: ligand mol2 and analysis script

First we import our libraries and start a new session with the PlayMolecule server

from playmolecule import Session, JobStatus
import os
import time
import numpy as np

s = Session("my_token")  # Replace my_token with your PM token

System building#

If we already have a built system we can skip directly to the next session. Otherwise we can follow the next step to obtain a built system using PlayMolecule apps. First we prepare out protein using the ProteinPrepare app.

job = s.start_app("ProteinPrepare")
job.pdbid = "3PTB"
# job.pdbfile = "./3PTB.pdb"  # We can use this instead of the above for local files
assert job.get_status() == JobStatus.COMPLETED, "Job failed"
prepf = job.retrieve("test/prepared_protein")

Since we have a ligand we need to obtain parameters for our simulation. This is done through the Parameterize app and in this case we simply use the GAFF2 parameterization as it’s very quick. We could also opt for a QM parameterization by switching the mode or a neural network based method.

job = s.start_app("Parameterize")
job.molecule = "/home/sdoerr/benzamidine-3PTB-pH7.mol2"
job.mode = "GAFF2"
assert job.get_status() == JobStatus.COMPLETED, "Job failed"
ligf = job.retrieve("test/parameterized_ligand")
paramf = os.path.join(ligf, "parameters", "GAFF2")

Now that we have both a protein and a parameterized ligand we can build our system using SystemBuilder.

job = s.start_app("SystemBuilder")
job.inputdir = prepf
job.forcefield = "amber"
job.no_prepare = True
# The following two options are only needed if we have a ligand
job.ligand = os.path.join(aparamf, "mol.mol2")
job.ligfrcmod = os.path.join(paramf, "mol.frcmod")
assert job.get_status() == JobStatus.COMPLETED, "Job failed"
buildf = job.retrieve("test/build")


If we already have a built system for AMBER or CHARMM we can skip directly to this section. We simply need a folder (in this case called build_1) containing a “structure.pdb” and a “structure.prmtop” file.

First we will equilibrate our system and produce inputs for the adaptive run using the genadapt option.

job = s.start_app("SimpleRun")
job.inputdir = os.path.join(buildf, "build_1")
job.runtime = 1  # How long each adaptive run should be in ns
job.simtype = "globular"
job.equiltime = 3  # in ns
job.ligresname = "BEN"  # Benzamidine residue name. Only needed if we have a ligand
job.constraints = "protein-ligand"
job.genadapt = True
assert job.get_status() == JobStatus.COMPLETED, "Job failed"
equilf = job.retrieve("test/equil")

Now that we have an equilibrated system we can start with our adaptive sampling simulations.

# This function will print a nice report every few seconds informing us of the current status of our runs.
def report(job):
    status = job.get_status(_logger=False)
    progress = job.get_progress(_logger=False)
    _, childrenstatus = job.get_children()
    print(status, progress)
    print("Children status")
    print("  Completed: ", np.sum(np.array(childrenstatus) == JobStatus.COMPLETED))
    print("  Running:   ", np.sum(np.array(childrenstatus) == JobStatus.RUNNING))
    print("  Errored:   ", np.sum(np.array(childrenstatus) == JobStatus.ERROR))

job = s.start_app("AdaptiveSampling")
job.inputdir = os.path.join(equilf, "generators")
job.nmax = 2
job.nmin = 1
job.numepochs = 3
job.projection = "distances"
job.adapttype = "unbind"  # What type of adaptive run we want to do
job.ligname = "BEN"
while job.get_status(_logger=False) != JobStatus.COMPLETED:
    if job.get_status(_logger=False) == JobStatus.ERROR:
assert job.get_status() == JobStatus.COMPLETED, "Job failed"
resf = job.retrieve("test/adaptive")