PlayMolecule API Tutorial

by Stefan Doerr

Purpose

The PlayMolecule API is used to interact with the PlayMolecule server in order to:

  • Starting executions of apps

  • Checking the status of executing apps

  • Retrieving the results of apps

App developers also use it so that their apps can communicate with the server easily.

For instructions on how to install the PlayMolecule Python API, head over to the installation page.

Interacting with the PlayMolecule server

To use the API interactively, start ipython from the command line and import all the components of the playmolecule package you just installed with:

from playmolecule import *

This command reports where the API reads its current configuration from. You can modify that file directly, or create a local copy of it in your home directory and point the API to the copy by adding an environment variable such as export PM_SDK_CONFIG=/path/to/pm_sdk_config.ini to your ~/.bashrc file. Using a local copy is recommended, because the default config.ini file is overwritten whenever you uninstall or update the playmolecule package. Among other things, this file is where you configure the IP and ports of your PlayMolecule server.
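As a minimal sketch of the override described above, the snippet below writes a private config copy, points PM_SDK_CONFIG at it for the current process, and reads it back with the standard library. The section name `[server]`, the keys `ip` and `port`, and their values are hypothetical placeholders, not the real layout of the PlayMolecule config file; take the actual keys from the default file reported on import.

```python
import configparser
import os
import tempfile

# Hypothetical contents: the real section and key names come from the
# default config file reported by `from playmolecule import *`.
EXAMPLE_CONFIG = """\
[server]
ip = 127.0.0.1
port = 8095
"""


def write_local_config(directory):
    """Write a private config copy into `directory` and return its path."""
    path = os.path.join(directory, "pm_sdk_config.ini")
    with open(path, "w") as fh:
        fh.write(EXAMPLE_CONFIG)
    return path


def read_config(path):
    """Parse an INI file into a plain dict of {section: {key: value}}."""
    parser = configparser.ConfigParser()
    parser.read(path)
    return {section: dict(parser[section]) for section in parser.sections()}


config_dir = tempfile.mkdtemp()
config_path = write_local_config(config_dir)

# Same effect as `export PM_SDK_CONFIG=...` in ~/.bashrc, but only for
# this Python process.
os.environ["PM_SDK_CONFIG"] = config_path

settings = read_config(os.environ["PM_SDK_CONFIG"])
print(settings)
```

Since the configuration location is reported at import time, the variable presumably has to be set before the playmolecule package is imported; that ordering is an assumption, not something this tutorial states.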

The above command will also import four main classes.

  1. The Session class which is used to connect to the server, list apps, start jobs and more. For a full list and description of the Session methods check the API documentation.

  2. The DataCenter class. This class is used to download, upload and tag datasets on the backend. All data handled by the backend is considered a dataset including job related user-submitted and generated data as well as data necessary for executing some apps.

  3. The Job class which represents a job on the server and can be used to check the inputs of a job, submit a new job, get its status and more. For a full list and description of the Job methods check the API documentation.

  4. The JobStatus enumeration. This enumeration is simply a convenient mapping of job status codes (numbers) to more descriptive names. For a full list of the JobStatus values check the API documentation.
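To illustrate what such an enumeration does, here is a small stand-in built with Python's standard enum module. ExampleStatus is a hypothetical class, and its numeric codes are invented for illustration; only the member names WAITING_DATA, QUEUED, COMPLETED and ERROR appear in this tutorial, and whether JobStatus itself is implemented this way is an assumption.

```python
from enum import IntEnum


class ExampleStatus(IntEnum):
    """Stand-in for JobStatus; the numeric codes are hypothetical."""

    WAITING_DATA = 0
    QUEUED = 1
    COMPLETED = 2
    ERROR = 3


# A raw code, as a server might report it, becomes a named member.
code = 2
status = ExampleStatus(code)

print(status.name)
print(status == ExampleStatus.COMPLETED)
print(status in (ExampleStatus.COMPLETED, ExampleStatus.ERROR))
```

Comparing against named members keeps scripts readable and avoids hard-coding numbers whose meaning is not obvious.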

Creating a session

To create a session with the PlayMolecule server from the Python API you need a PlayMolecule token from Acellera. To obtain one, please contact Acellera customer support or your Acellera representative.

sess = Session("MY_PM_TOKEN")

Listing available apps

You can list all available apps which are currently registered on your PlayMolecule server using the following command.

sess.get_apps()

Creating a job for an app

Once we have chosen an app that we want to use (here we will demonstrate with “ProteinPrepare”), we can create a new job for the app using the start_app method:

job = sess.start_app("ProteinPrepare")

To list all of the inputs of this specific app we can use the following command

job.describe()

Starting (submitting) a job

Now that we know what inputs the job requires we can supply those and submit the job for execution to the PlayMolecule server.

job = sess.start_app("ProteinPrepare")
job.pdbid = "3ptb"
job.submit()

The submit method will print out the job ID. It can be useful to keep this ID around in case the current ipython session dies so that you can get back your job from the server.

Get back a job object

If for some reason you lose the job object, for example if you close the ipython console or you overwrite the job variable, you can get back the object with the following command, replacing "EXECID" with the execution ID printed by the above job.submit() method.

job = sess.get_job(execid="EXECID")
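One way to keep execution IDs around between ipython sessions is to note them in a small file. The sketch below uses only the standard library; the helper names save_execid and load_execids, the file name pm_jobs.json, and the label format are all hypothetical and not part of the PlayMolecule API. The ID itself is the one printed by job.submit(), pasted in by hand.

```python
import json
import os
import tempfile


def load_execids(path):
    """Return the saved {label: execution ID} mapping, or {} if none exists."""
    if not os.path.exists(path):
        return {}
    with open(path) as fh:
        return json.load(fh)


def save_execid(path, label, execid):
    """Add or update one execution ID in the JSON file at `path`."""
    records = load_execids(path)
    records[label] = execid
    with open(path, "w") as fh:
        json.dump(records, fh, indent=2)


# Demo in a temporary directory; in practice pick a stable location.
store = os.path.join(tempfile.mkdtemp(), "pm_jobs.json")

# "EXECID" stands for the ID printed by job.submit().
save_execid(store, "ProteinPrepare-3ptb", "EXECID")

saved = load_execids(store)
print(saved)

# Later, in a new session:
# job = sess.get_job(execid=saved["ProteinPrepare-3ptb"])
```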

More complex example

Here you can see a short example of how to submit an Adaptive Sampling job to the PlayMolecule server which also demonstrates how to pass different types of inputs such as local directories and numbers as arguments.

job = sess.start_app("AdaptiveSampling")
job.describe()
job.inputdir = "test/generators/"
job.numepochs = 3
job.nmin = 1
job.nmax = 2
job.adapttype = "confexplore"
job.projection = "dihedrals"
job.submit()

Checking job status

While a job is running it is important to know its current status, for example whether it is queued, running or has completed. We can check this using the get_status method.

job.get_status()
# We can disable the printing using the _logger argument
status = job.get_status(_logger=False)

# We can compare the returned status to any status we want
print(status == JobStatus.WAITING_DATA)

# We can also check if the status belongs to any in a list
if status in (JobStatus.WAITING_DATA, JobStatus.QUEUED):
    print("Job has not started running yet.")

Checking job progress

Some apps, especially long-running ones, have progress reporting implemented. This means that they report their current state to the server whenever they reach a major milestone. For example, the AdaptiveSampling app might report the epoch it is currently at. This information is not the status of the app described above but rather its current progress. We can obtain the current progress info from the app using the following command.

info, per = job.get_progress()

Here info is a string describing the current progress of the execution and per is an estimate of the percentage of completion of the execution. The information returned by get_progress is meant for the user to check the current progress. It should not be relied on programmatically in scripts, as its accuracy depends on the app implementation.

Retrieving the job results

Once a job has completed (or before, if we are interested in the intermediate results of the calculations), we can retrieve the results of the job to a local directory using the retrieve method.

job.retrieve(path="./results", on_status=JobStatus.COMPLETED)

The on_status argument tells the retrieve method to only retrieve the job if it has reached that specific status. You can replace this with any other status or list of statuses.

Waiting for the job to complete

Often we want our code to wait for a job to complete before proceeding to other operations. We can do this using the wait method.

job.wait()

We can also pass an on_status argument to the wait method, which tells it to wait until the job has reached one of the given statuses.

job.wait(on_status=(JobStatus.COMPLETED, JobStatus.ERROR))
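This tutorial does not say whether wait accepts a time limit. If you need one, a polling loop built on get_status is one option. The sketch below is not the library's implementation: wait_with_timeout is a hypothetical helper, and FakeJob and FakeStatus are stand-ins so that the example runs without a server. With the real API you would pass a Job object and JobStatus members instead.

```python
import time
from enum import Enum


class FakeStatus(Enum):
    """Stand-in for JobStatus, used only to make this sketch runnable."""

    QUEUED = "queued"
    COMPLETED = "completed"
    ERROR = "error"


class FakeJob:
    """Stand-in job that reports QUEUED twice and then COMPLETED."""

    def __init__(self):
        self._statuses = [FakeStatus.QUEUED, FakeStatus.QUEUED, FakeStatus.COMPLETED]
        self._calls = 0

    def get_status(self, _logger=True):
        index = min(self._calls, len(self._statuses) - 1)
        self._calls += 1
        return self._statuses[index]


def wait_with_timeout(job, on_status, timeout, interval=1.0):
    """Poll job.get_status until it is in `on_status` or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        status = job.get_status(_logger=False)
        if status in on_status:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job still in status {status} after {timeout} s")
        time.sleep(interval)


final = wait_with_timeout(
    FakeJob(),
    on_status=(FakeStatus.COMPLETED, FakeStatus.ERROR),
    timeout=5,
    interval=0.01,
)
print(final)
```

For real jobs a polling interval of several seconds is more appropriate than the short one used in this demo, to avoid putting unnecessary load on the server.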

Secondary executions (children) - Mostly for app devs

Some apps spawn secondary executions of other apps. One example is the SimpleRun app, which spawns multiple MDRun apps to perform the necessary equilibration and production simulations. These secondary executions are sometimes called children. To submit a job as the child of another job, pass the execution ID of the parent job to the submit method as follows.

job.submit(child_of=parentExecId)

App developers whose apps spawn other jobs may also find the wait_children method useful. It waits for all the child jobs spawned by the current job to reach one of the given statuses.

job.wait_children(on_status=(JobStatus.COMPLETED, JobStatus.ERROR))

You can also get all the children of a job as follows

# For old behaviour use return_dict=True
childDict = job.get_children(return_dict=True)
# New behaviour: Get all the children of the job
children, status = job.get_children()
# New behaviour: Get only the completed children
children, status = job.get_children(status=JobStatus.COMPLETED)


for child in children:
    child.retrieve()
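When a job has many children it can help to group them by status before deciding what to retrieve. The sketch below assumes that the two values returned by get_children are parallel sequences, with one status per child; that is an inference from the unpacking shown above, not something this tutorial states. group_by_status is a hypothetical helper, and plain strings stand in for Job objects and JobStatus members so that the example runs without a server.

```python
from collections import defaultdict


def group_by_status(children, statuses):
    """Group children by their status, assuming the two sequences are parallel."""
    groups = defaultdict(list)
    for child, status in zip(children, statuses):
        groups[status].append(child)
    return dict(groups)


# Stand-in data; with the real API these would come from job.get_children().
children = ["child-a", "child-b", "child-c"]
statuses = ["COMPLETED", "ERROR", "COMPLETED"]

groups = group_by_status(children, statuses)
for status, members in groups.items():
    print(status, len(members), members)
```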

That’s it!

If you have any suggestions for improvements or additional features which you would like to see in the PlayMolecule Python API, feel free to write to us.