by Stefan Doerr
The PlayMolecule API is used for interacting with the PlayMolecule server as a means of:
- Starting executions of apps
- Checking the status of executing apps
- Retrieving the results of apps
App developers also use it so that their apps can communicate with the server easily.
For instructions on how to install the PlayMolecule Python API, head over to the installation page.
To use the API interactively you can start ipython from the command line and import all the components of the playmolecule package you just installed with:
from playmolecule import *
This command will report where it reads the current configuration of the API from. You can modify that file, or create a local copy of it in your home directory and add an environment variable like
export PM_SDK_CONFIG=/path/to/pm_sdk_config.ini
to your ~/.bashrc file to use that copy instead. This is recommended, as the default config.ini file will be overwritten if you ever uninstall or update the playmolecule package. In this file you configure the IP and ports of your PlayMolecule server, among other things.
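As a rough illustration of the override described above, the sketch below picks the file named in PM_SDK_CONFIG when that variable is set and falls back to a default path otherwise, then reads it with the standard-library configparser. The helper name resolve_config_path, the [server] section, its ip and port keys and all values are assumptions made for this example; the real configuration file may be organised differently, and this is not how the playmolecule package itself necessarily loads its settings.

```python
import configparser
import os
import tempfile


def resolve_config_path(default_path, env=None):
    """Return the path from PM_SDK_CONFIG if it is set, else the default path."""
    env = os.environ if env is None else env
    return env.get("PM_SDK_CONFIG") or default_path


# Hypothetical file contents: section and key names are invented for illustration.
example_contents = "[server]\nip = 127.0.0.1\nport = 8000\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    # Write a local copy of the config, as you would in your home directory.
    local_copy = os.path.join(tmp_dir, "pm_sdk_config.ini")
    with open(local_copy, "w") as handle:
        handle.write(example_contents)

    # Simulate `export PM_SDK_CONFIG=...` without touching the real environment.
    chosen_path = resolve_config_path(
        "/path/to/default/config.ini", env={"PM_SDK_CONFIG": local_copy}
    )

    parser = configparser.ConfigParser()
    parser.read(chosen_path)
    server_ip = parser["server"]["ip"]
    server_port = parser.getint("server", "port")
```

Keeping the override in an environment variable means the local copy survives package updates, which is the reason the text above recommends it.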
The above command will also import four main classes:
- The Session class, which is used to connect to the server, list apps, start jobs and more. For a full list and description of the Session methods check the API documentation.
- The DataCenter class, which is used to download, upload and tag datasets on the backend. All data handled by the backend is considered a dataset, including job-related user-submitted and generated data as well as data necessary for executing some apps.
- The Job class, which represents a job on the server and can be used to check the inputs of a job, submit a new job, get its status and more. For a full list and description of the Job methods check the API documentation.
- The JobStatus enumeration. This so-called enumeration is just a convenient mapping of job status codes (numbers) to more useful descriptions. For a full list of the JobStatus values check the API documentation.
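To make the idea of an enumeration concrete, here is a minimal sketch using Python's standard enum module. The class ExampleStatus is a stand-in written for this example, not the real JobStatus, and the numeric codes are invented; only the member names WAITING_DATA, QUEUED, COMPLETED and ERROR appear in this tutorial.

```python
from enum import IntEnum


class ExampleStatus(IntEnum):
    """Stand-in enumeration; the numeric codes are invented for illustration."""

    WAITING_DATA = 0
    QUEUED = 1
    COMPLETED = 2
    ERROR = 3


# A raw numeric code, as a server might report it, becomes a readable name.
raw_code = 2
status = ExampleStatus(raw_code)
label = status.name

# Members can be compared directly or tested against a group of statuses.
is_finished = status in (ExampleStatus.COMPLETED, ExampleStatus.ERROR)
```

Comparing against named members rather than bare numbers keeps scripts readable and avoids depending on the specific codes.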
To create a session with the PlayMolecule server from the Python API you will need to obtain a PlayMolecule token from Acellera. To get one, please contact Acellera customer support or your Acellera representative.
sess = Session("MY_PM_TOKEN")
You can list all available apps which are currently registered on your PlayMolecule server using the following command.
sess.get_apps()
Once we have chosen an app that we want to use (here we will demonstrate with “ProteinPrepare”) we can create a new job for the app using the start_app method:
job = sess.start_app("ProteinPrepare")
To list all of the inputs of this specific app we can use the following command
job.describe()
Now that we know what inputs the job requires we can supply those and submit the job for execution to the PlayMolecule server.
job = sess.start_app("ProteinPrepare")
job.pdbid = "3ptb"
job.submit()
The submit method will print out the job ID. It can be useful to keep this ID around in case the current ipython session dies, so that you can get back your job from the server.
If for some reason you lose the job object, for example if you close the ipython console or overwrite the job variable, you can get back the object with the following command, replacing "EXECID" with the execution ID printed by the above job.submit() call.
job = sess.get_job(execid="EXECID")
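One simple way to keep the execution ID around is to write it to a small file as soon as the job is submitted. The sketch below uses only the standard library; the helper names save_execid and load_execid and the file layout are hypothetical, and the ID is passed in as a plain string copied from the output of job.submit().

```python
import json
import os
import tempfile


def save_execid(path, app_name, execid):
    """Store the app name and execution ID in a small JSON file."""
    with open(path, "w") as handle:
        json.dump({"app": app_name, "execid": execid}, handle)


def load_execid(path):
    """Read back the execution ID stored by save_execid."""
    with open(path) as handle:
        return json.load(handle)["execid"]


# A temporary directory is used here so the example does not clutter your disk;
# in practice you would pick a location inside your project directory.
record_path = os.path.join(tempfile.mkdtemp(), "last_job.json")
save_execid(record_path, "ProteinPrepare", "EXECID")

restored_execid = load_execid(record_path)
# In a fresh ipython session you could then call:
#   job = sess.get_job(execid=restored_execid)
```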
The following short example submits an Adaptive Sampling job to the PlayMolecule server. It also demonstrates how to pass different types of inputs, such as local directories and numbers, as arguments.
job = sess.start_app("AdaptiveSampling")
job.describe()
job.inputdir = "test/generators/"
job.numepochs = 3
job.nmin = 1
job.nmax = 2
job.adapttype = "confexplore"
job.projection = "dihedrals"
job.submit()
When running a job it’s important to know its current status: whether it’s running, whether it has completed, etc. We can check this using the get_status method.
job.get_status()
# We can disable the printing using the _logger argument
status = job.get_status(_logger=False)
# We can compare the returned status to any status we want
print(status == JobStatus.WAITING_DATA)
# We can also check if the status belongs to any in a list
if status in (JobStatus.WAITING_DATA, JobStatus.QUEUED):
    print("Job has not started running yet.")
Some apps, especially long-running ones, have progress reporting implemented. This means that they report their current state whenever they reach a major milestone. For example, the AdaptiveSampling app might report the epoch it’s currently at to the server. This information is not considered the status of the app as above, but rather its current progress. We can obtain the current progress info from the app using the following command.
info, per = job.get_progress()
Here info is a string describing the current progress of the execution and per is an estimate of the percentage of completion of the execution. The information returned by get_progress should only be used as a way for the user to check the current progress. It should not be relied on programmatically in scripts, as it might not be very accurate, depending on the app implementation.
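Since the progress information is meant for a human reader, a small display helper is about as far as it makes sense to go. The following sketch turns an info string and a percentage into a one-line text progress bar. The function format_progress is hypothetical, and it assumes that the percentage is a number on a 0 to 100 scale, which the app you are running may not guarantee.

```python
def format_progress(info, per, width=20):
    """Render a text progress bar; assumes `per` is a number between 0 and 100."""
    per = max(0.0, min(100.0, float(per)))  # clamp inaccurate estimates
    filled = int(round(width * per / 100))
    bar = "#" * filled + "." * (width - filled)
    return f"[{bar}] {per:5.1f}% {info}"


# Values like these would come from `info, per = job.get_progress()`.
line = format_progress("Epoch 2 of 3", 66.7)
print(line)
```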
Once a job has completed (or before, if we are interested in the intermediate results of the calculations), we can retrieve the results of the job to a local directory using the retrieve method.
job.retrieve(path="./results", on_status=JobStatus.COMPLETED)
The on_status argument tells the retrieve method to only retrieve the job if it has reached that specific status. You can replace this with any other status or list of statuses.
Often we want our code to wait for a job to complete before proceeding to other operations. We can do this using the wait method.
job.wait()
We can also pass an on_status argument to the wait method, which tells it to wait until the job has reached that status, or any one of several statuses.
job.wait(on_status=(JobStatus.COMPLETED, JobStatus.ERROR))
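Conceptually, waiting on a status amounts to polling the job until its status is one of the requested ones. The sketch below illustrates that idea with stand-ins so that it runs without a server: the Status enumeration, the FakeJob class and the wait_for function are all invented for this example and are not the library's implementation of wait. The max_polls guard is likewise an assumption added here so that the loop cannot run forever.

```python
import time
from enum import Enum


class Status(Enum):
    """Stand-in for JobStatus, defined only for this example."""

    QUEUED = "queued"
    RUNNING = "running"
    COMPLETED = "completed"
    ERROR = "error"


class FakeJob:
    """Stand-in job that steps through a fixed sequence of statuses."""

    def __init__(self, statuses):
        self._statuses = list(statuses)
        self.calls = 0

    def get_status(self, _logger=True):
        # Return the next status, repeating the last one once exhausted.
        index = min(self.calls, len(self._statuses) - 1)
        self.calls += 1
        return self._statuses[index]


def wait_for(job, on_status, poll_seconds=0.0, max_polls=100):
    """Poll the job until it reaches one of the requested statuses."""
    targets = on_status if isinstance(on_status, (tuple, list)) else (on_status,)
    for _ in range(max_polls):
        status = job.get_status(_logger=False)
        if status in targets:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("job did not reach any of the requested statuses")


fake_job = FakeJob([Status.QUEUED, Status.RUNNING, Status.COMPLETED])
final_status = wait_for(fake_job, on_status=(Status.COMPLETED, Status.ERROR))
```

Including an error status among the targets, as in the job.wait call above, keeps a script from waiting indefinitely on a job that has already failed.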
Some apps will spawn secondary executions of other apps. An example of that is the SimpleRun app, which will spawn multiple MDRun apps to perform the necessary equilibration and production simulations. These secondary executions are sometimes called children. To submit a job as the child of another job, pass the execution ID of the parent job to the submit method as follows.
job.submit(child_of=parentExecId)
For app developers who create apps that spawn other jobs, it can also be useful to use the wait_children method, which waits for all the child jobs spawned by the current job.
job.wait_children(on_status=(JobStatus.COMPLETED, JobStatus.ERROR))
You can also get all the children of a job as follows
# For old behaviour use return_dict=True
childDict = job.get_children(return_dict=True)
# New behaviour: Get all the children of the job
children, status = job.get_children()
# New behaviour: Get only the completed children
children, status = job.get_children(status=JobStatus.COMPLETED)
for child in children:
    child.retrieve()
That’s it!
If you have any suggestions for improvements or additional features which you would like to see in the PlayMolecule Python API, feel free to write to us.