PlayMolecule API Tutorial#
by Stefan Doerr
Purpose#
The PlayMolecule API is used to interact with the PlayMolecule server in order to:
Start executions of apps
Check the status of executing apps
Retrieve the results of apps
App developers also use it so that their apps can communicate with the server easily.
For instructions on how to install the PlayMolecule python API head over to the installation page.
Interacting with the PlayMolecule server#
To use the API interactively, you can start ipython from the command line and import all the components of the playmolecule package you just installed with
from playmolecule import *
This command will report where it reads the current configuration of the API from. You can modify that file or create a local copy of it in your home directory and add an environment variable like
export PM_SDK_CONFIG=/path/to/pm_sdk_config.ini
to your ~/.bashrc file to use that file instead. This is recommended, as the default config.ini file will be overwritten if you ever uninstall or update the playmolecule package. In this file you will configure the IP and ports of your PlayMolecule server, among other things.
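Before importing the package, it can help to confirm that the environment variable points to a file that actually exists. The following is a minimal sketch using only the Python standard library. The helper name resolve_pm_config is hypothetical and is not part of the playmolecule package; only the PM_SDK_CONFIG variable name comes from the text above.

```python
import os
from pathlib import Path


def resolve_pm_config(env=None):
    """Return the path set in PM_SDK_CONFIG, or None if the variable is unset.

    Hypothetical helper for illustration; not part of the playmolecule package.
    """
    env = os.environ if env is None else env
    value = env.get("PM_SDK_CONFIG")
    if not value:
        return None
    path = Path(value).expanduser()
    if not path.is_file():
        raise FileNotFoundError(f"PM_SDK_CONFIG points to a missing file: {path}")
    return path
```

Calling resolve_pm_config() with no arguments reads the real environment. Passing a dictionary instead makes the check easy to try out without touching your shell configuration.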
The above command will also import four main classes:
The Session class, which is used to connect to the server, list apps, start jobs and more. For a full list and description of the Session methods check the API documentation.
The DataCenter class, which is used to download, upload and tag datasets on the backend. All data handled by the backend is considered a dataset, including job-related user-submitted and generated data as well as data necessary for executing some apps.
The Job class, which represents a job on the server and can be used to check the inputs of a job, submit a new job, get its status and more. For a full list and description of the Job methods check the API documentation.
The JobStatus enumeration, which is a convenient mapping of job status codes (numbers) to more useful descriptions. For a full list of the JobStatus values check the API documentation.
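To illustrate what such an enumeration does, here is a small stand-alone sketch built with Python's enum module. ExampleStatus is a hypothetical stand-in for JobStatus: the member names are the ones used in this tutorial, but the numeric codes are made up for the example, and whether the real JobStatus is an IntEnum is an assumption.

```python
from enum import IntEnum


class ExampleStatus(IntEnum):
    """Illustrative stand-in for JobStatus; the numeric codes are made up."""

    WAITING_DATA = 0
    QUEUED = 1
    COMPLETED = 2
    ERROR = 3


def describe(code):
    """Translate a raw status code into its readable name."""
    return ExampleStatus(code).name


print(describe(1))  # QUEUED
print(ExampleStatus.COMPLETED == 2)  # True
```

The point of the mapping is that code can compare against names such as COMPLETED instead of remembering which number means what.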
Creating a session#
To create a session with the PlayMolecule server from the Python API you will need a PlayMolecule token from Acellera. To obtain one, please contact Acellera customer support or your representative.
sess = Session("MY_PM_TOKEN")
Listing available apps#
You can list all available apps which are currently registered on your PlayMolecule server using the following command.
sess.get_apps()
Creating a job for an app#
Once we have chosen an app that we want to use (here we will demonstrate with “ProteinPrepare”) we can create a new job for the app using the start_app method.
job = sess.start_app("ProteinPrepare")
To list all of the inputs of this specific app we can use the following command
job.describe()
Starting (submitting) a job#
Now that we know what inputs the job requires we can supply those and submit the job for execution to the PlayMolecule server.
job = sess.start_app("ProteinPrepare")
job.pdbid = "3ptb"
job.submit()
The submit method will print out the job ID. It can be useful to keep this ID around in case the current ipython session dies so that you can get back your job from the server.
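One simple way to keep the ID around is to write it to a small file as soon as it has been printed. The sketch below uses only the standard library. The helper names save_execid and load_execid and the file name last_job.json are hypothetical, and the sketch assumes you copy the ID by hand from the printed output.

```python
import json
from pathlib import Path


def save_execid(execid, path="last_job.json"):
    """Store the execution ID in a small JSON file."""
    Path(path).write_text(json.dumps({"execid": execid}))


def load_execid(path="last_job.json"):
    """Read back the execution ID stored by save_execid."""
    return json.loads(Path(path).read_text())["execid"]
```

In a later session, the value returned by load_execid() can be used wherever the execution ID is required.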
Get back a job object#
If for some reason you lose the job object, for example if you close the ipython console or you overwrite the job variable, you can get back the object with the following command, replacing "EXECID" with the execution ID printed by the above job.submit() method.
job = sess.get_job(execid="EXECID")
More complex example#
Here you can see a short example of how to submit an Adaptive Sampling job to the PlayMolecule server which also demonstrates how to pass different types of inputs such as local directories and numbers as arguments.
job = sess.start_app("AdaptiveSampling")
job.describe()
job.inputdir = "test/generators/"
job.numepochs = 3
job.nmin = 1
job.nmax = 2
job.adapttype = "confexplore"
job.projection = "dihedrals"
job.submit()
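When a job has many inputs, it can be convenient to keep them in a dictionary and assign them in a loop. This is a minimal sketch, assuming inputs are set by plain attribute assignment as in the example above. apply_inputs is a hypothetical helper, and SimpleNamespace stands in for a real job object so that the snippet runs without a server. Whether the real Job class validates input names at assignment time is not covered here.

```python
from types import SimpleNamespace


def apply_inputs(job, inputs):
    """Set each key of `inputs` as an attribute on `job` and return the job."""
    for name, value in inputs.items():
        setattr(job, name, value)
    return job


# SimpleNamespace stands in for a real job so the sketch runs without a server.
fake_job = SimpleNamespace()
apply_inputs(
    fake_job,
    {"numepochs": 3, "nmin": 1, "nmax": 2, "projection": "dihedrals"},
)
print(fake_job.numepochs, fake_job.projection)  # 3 dihedrals
```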
Checking job status#
When running a job it’s important to know what its current status is: whether it is running, whether it has completed, etc. We can do this using the get_status method.
job.get_status()
# We can disable the printing using the _logger argument
status = job.get_status(_logger=False)
# We can compare the returned status to any status we want
print(status == JobStatus.WAITING_DATA)
# We can also check if the status belongs to any in a list
if status in (JobStatus.WAITING_DATA, JobStatus.QUEUED):
    print("Job has not started running yet.")
Checking job progress#
Some apps, especially long-running ones, have progress reporting implemented. This means that they report their current state whenever they reach a major milestone. For example, the AdaptiveSampling app might report to the server which epoch it is currently at. This information is not considered the status of the app as above, but rather its current progress. We can obtain the current progress info from the app using the following command.
info, per = job.get_progress()
Here info is a string describing the current progress of the execution and per is an estimate of the percentage of completion of the execution. The information returned by get_progress should only be used as a way for the user to check the current progress and not used programmatically to write scripts, as it might not be very accurate, depending on the app implementation.
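Since the progress information is meant for people rather than scripts, a natural use is to turn it into a short message for display. The following sketch assumes that per is either a number between 0 and 100 or None when the app reports no estimate; both are assumptions, and format_progress is a hypothetical helper, not part of the API.

```python
def format_progress(info, per):
    """Build a one-line, human-readable progress message."""
    if per is None:
        return f"{info} (no percentage reported)"
    return f"{info} ({float(per):.0f}% done)"


print(format_progress("Epoch 2 of 3", 66.7))  # Epoch 2 of 3 (67% done)
```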
Retrieving the job results#
Once a job has completed (or before, if we are interested in the intermediate results of the calculations), we can retrieve the results of the job to a local directory using the retrieve method.
job.retrieve(path="./results", on_status=JobStatus.COMPLETED)
The on_status argument tells the retrieve method to only retrieve the job if it has reached that specific status. You can replace this with any other status or list of statuses.
Waiting for the job to complete#
Often we want our code to wait for a job to complete before proceeding to other operations. We can do this using the wait method.
job.wait()
We can also pass an on_status argument to the wait method, which tells it to wait until the job has reached one of the given statuses.
job.wait(on_status=(JobStatus.COMPLETED, JobStatus.ERROR))
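This tutorial does not say whether wait accepts a time limit. If you need an upper bound on how long your code blocks, one option is a small polling loop built around a status callable. The sketch below is generic and runs without a server. wait_for_status and its parameters are hypothetical names, not part of the playmolecule API.

```python
import time


def wait_for_status(get_status, targets, timeout=60.0, interval=5.0, sleep=time.sleep):
    """Poll get_status() until it returns one of `targets` or `timeout` seconds pass.

    Returns the matching status, or None if the timeout expires first.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in targets:
            return status
        if time.monotonic() >= deadline:
            return None
        sleep(interval)
```

With a real job, the callable could be lambda: job.get_status(_logger=False) and targets could be (JobStatus.COMPLETED, JobStatus.ERROR), using only calls shown earlier in this tutorial.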
Secondary executions (children) - Mostly for app devs#
Some apps will spawn secondary executions of other apps. An example is the SimpleRun app, which will spawn multiple MDRun apps to perform the necessary equilibration and production simulations. These secondary executions are sometimes called children. To submit a job as the child of another job, pass the execution ID of the parent job to the submit method.
job.submit(child_of=parentExecId)
For app developers who create apps which spawn other jobs it can also be useful to use the wait_children method, which waits for all the child jobs spawned by the current job.
job.wait_children(on_status=(JobStatus.COMPLETED, JobStatus.ERROR))
You can also get all the children of a job as follows
# For old behaviour use return_dict=True
childDict = job.get_children(return_dict=True)
# New behaviour: Get all the children of the job
children, status = job.get_children()
# New behaviour: Get only the completed children
children, status = job.get_children(status=JobStatus.COMPLETED)
for child in children:
    child.retrieve()
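If several children are retrieved at once, it may be worth giving each its own directory so their outputs are kept apart. This is a minimal sketch, assuming each child accepts the same path argument that retrieve takes earlier in this tutorial. retrieve_children is a hypothetical helper, and FakeChild is a stand-in that records the path instead of downloading anything, so the snippet runs without a server.

```python
from pathlib import Path


class FakeChild:
    """Stand-in for a child job; records the path instead of downloading."""

    def __init__(self):
        self.retrieved_to = None

    def retrieve(self, path):
        self.retrieved_to = path


def retrieve_children(children, base_dir="./results"):
    """Call retrieve() on every child, giving each a numbered subdirectory."""
    paths = []
    for index, child in enumerate(children):
        target = Path(base_dir) / f"child_{index}"
        child.retrieve(path=str(target))
        paths.append(target)
    return paths


kids = [FakeChild(), FakeChild()]
print([p.name for p in retrieve_children(kids)])  # ['child_0', 'child_1']
```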
That’s it!
If you have any suggestions for improvements or additional features which you would like to see in the PlayMolecule Python API, feel free to write to us.