by Stefan Doerr
The PlayMolecule API is used to interact with the PlayMolecule server in order to:
Start executions of apps
Check the status of executing apps
Retrieve the results of apps
App developers also use it so that their apps can communicate with the server easily.
For instructions on how to install the PlayMolecule Python API, head over to the installation page.
To use the API interactively you can start ipython from the command line and import all the components of the playmolecule package you just installed with
from playmolecule import *
This command will report where it reads the current configuration of the API from. You can modify that file, or create a local copy of it in your home directory and add an environment variable like export PM_SDK_CONFIG=/path/to/pm_sdk_config.ini to your ~/.bashrc file to use that copy instead. Using a local copy is recommended, as the config.ini file will be overwritten if you ever uninstall or update the playmolecule package. In this file you configure the IP and ports of your PlayMolecule server, amongst other things.
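Creating the local copy can be scripted. The following is a minimal sketch, not part of the PlayMolecule API: make_local_config_copy is a hypothetical helper, and the path of the packaged configuration file is assumed to be whatever the import command reported on your system.

```python
import shutil
from pathlib import Path


def make_local_config_copy(packaged_config, target_dir=None):
    """Copy the packaged config file into target_dir (default: home directory).

    Hypothetical helper, not part of the playmolecule package.
    Returns the path of the copy and the line to add to ~/.bashrc.
    """
    target_dir = Path(target_dir) if target_dir is not None else Path.home()
    target = target_dir / "pm_sdk_config.ini"
    if not target.exists():  # never clobber a copy you have already edited
        shutil.copyfile(packaged_config, target)
    return target, f"export PM_SDK_CONFIG={target}"
```

The returned line can be appended to ~/.bashrc by hand. The helper deliberately leaves an existing copy untouched so that local edits survive repeated calls.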
The above command will also import four main classes:
The Session class, which is used to connect to the server, list apps, start jobs and more. For a full list and description of the Session methods check the API documentation.
The DataCenter class, which is used to download, upload and tag datasets on the backend. All data handled by the backend is considered a dataset, including job-related user-submitted and generated data as well as data necessary for executing some apps.
The Job class, which represents a job on the server and can be used to check the inputs of a job, submit a new job, get its status and more. For a full list and description of the Job methods check the API documentation.
The JobStatus enumeration. This so-called enumeration is just a convenient mapping of job status codes (numbers) to more useful descriptions. For a full list of the JobStatus values check the API documentation.
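For readers unfamiliar with enumerations, the idea can be illustrated with Python's standard enum module. This is only a stand-in: the member names are borrowed from statuses used later in this tutorial, but the numeric codes are made up, and whether JobStatus is implemented this way is an assumption.

```python
from enum import IntEnum


class ExampleStatus(IntEnum):
    """Illustrative stand-in; the numeric codes here are invented."""
    WAITING_DATA = 0
    QUEUED = 1
    COMPLETED = 2


code = 1  # a raw status code, as a server might report it
status = ExampleStatus(code)  # map the number to a named member

print(status.name)                     # QUEUED
print(status == ExampleStatus.QUEUED)  # True
```

Comparing against named members keeps scripts readable and avoids hard-coding numbers whose meaning is not obvious.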
To create a session with the PlayMolecule server from the Python API you will need to obtain a PlayMolecule token from Acellera. For this, please contact Acellera customer support or your corresponding representative.
sess = Session("MY_PM_TOKEN")
You can list all available apps which are currently registered on your PlayMolecule server using the following command.
Once we have chosen an app that we want to use (here we will demonstrate with “ProteinPrepare”) we can create a new job for the app using the start_app method.
job = sess.start_app("ProteinPrepare")
To list all of the inputs of this specific app we can use the following command
job.describe()
Now that we know what inputs the job requires we can supply those and submit the job for execution to the PlayMolecule server.
job = sess.start_app("ProteinPrepare")
job.pdbid = "3ptb"
job.submit()
The submit method will print out the job ID. It can be useful to keep this ID around in case the current ipython session dies so that you can get back your job from the server.
If for some reason you lose the job object, for example if you close the ipython console or overwrite the job variable, you can get back the object with the following command, replacing EXECID with the execution ID printed by the submit method above.
job = sess.get_job(execid="EXECID")
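Since the execution ID is all that is needed to recover a job, one option is to write it to a small file as soon as it has been printed. A minimal sketch follows; save_execid, load_execid and the file name are hypothetical and not part of the playmolecule package.

```python
from pathlib import Path


def save_execid(execid, path="last_execid.txt"):
    """Store an execution ID (copied from the submit output) in a text file."""
    Path(path).write_text(f"{str(execid).strip()}\n")


def load_execid(path="last_execid.txt"):
    """Read back the execution ID stored by save_execid."""
    return Path(path).read_text().strip()
```

In a later session the job could then be recovered with sess.get_job(execid=load_execid()).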
Here you can see a short example of how to submit an Adaptive Sampling job to the PlayMolecule server which also demonstrates how to pass different types of inputs such as local directories and numbers as arguments.
job = sess.start_app("AdaptiveSampling")
job.describe()
job.inputdir = "test/generators/"
job.numepochs = 3
job.nmin = 1
job.nmax = 2
job.adapttype = "confexplore"
job.projection = "dihedrals"
job.submit()
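Because inputs are set as plain attributes on the job object, inputs kept in a dictionary can be applied in a loop, which is convenient when the same settings are reused across scripts. This is a minimal sketch, assuming the job accepts attribute assignment for every key as in the example above. apply_inputs is a hypothetical helper, and the stand-in object only imitates a job so that the snippet runs without a server.

```python
from types import SimpleNamespace


def apply_inputs(job, inputs):
    """Set each key of `inputs` as an attribute on `job`, then return `job`."""
    for name, value in inputs.items():
        setattr(job, name, value)
    return job


# Stand-in for a real job; with the API you would use sess.start_app(...).
job = SimpleNamespace()
apply_inputs(job, {"numepochs": 3, "nmin": 1, "nmax": 2, "projection": "dihedrals"})
print(job.numepochs, job.projection)  # 3 dihedrals
```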
When running a job it’s important to know what the current status of the job is: whether it’s running, whether it has completed, and so on. We can do this using the get_status method.
# We can disable the printing using the _logger argument
status = job.get_status(_logger=False)
# We can compare the returned status to any status we want
print(status == JobStatus.WAITING_DATA)
# We can also check if the status belongs to any in a list
if status in (JobStatus.WAITING_DATA, JobStatus.QUEUED):
    print("Job has not started running yet.")
Some apps, especially long-running ones, have progress reporting implemented. This means that they report their current status whenever they have reached a major milestone. For example, the AdaptiveSampling app might report the current epoch it’s at to the server. This information is not considered the status of the app as above but instead its current progress. We can obtain the current progress info from the app using the following command.
info, per = job.get_progress()
info is a string describing the current progress, and per is an estimate of the percentage of completion of the execution. The information returned by get_progress should only be used as a way for the user to check the current progress. It should not be used programmatically in scripts, as it might not be very accurate, depending on the app implementation.
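In keeping with that advice, a script can simply display the progress to the user without branching on it. A minimal sketch follows; describe_progress is a hypothetical helper, and it assumes only that get_progress returns the (info, per) pair described above.

```python
def describe_progress(job):
    """Return a one-line progress message intended for display only."""
    info, per = job.get_progress()
    return f"{info} (about {per}% done)"
```

A call such as print(describe_progress(job)) shows the message. The wording "about" is a reminder that the percentage is only an estimate.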
Once a job has completed (or before, if we are interested in the intermediate results of the calculations), we can retrieve the results of the job to a local directory using the retrieve method. The on_status argument tells the retrieve method to only retrieve the job if it has reached that specific status. You can pass it any status or list of statuses.
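The effect of on_status can be spelled out by hand as a guard around retrieve. The sketch below illustrates the behaviour described above using only get_status and a plain retrieve call. It is not how the library implements on_status, and retrieve_if is a hypothetical helper.

```python
def retrieve_if(job, statuses):
    """Retrieve the job only if its current status is one of `statuses`.

    Returns True if retrieve was called, False otherwise.
    """
    if job.get_status(_logger=False) in statuses:
        job.retrieve()
        return True
    return False
```

For example, retrieve_if(job, (JobStatus.COMPLETED,)) would download results only for a completed job and report whether it did so.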
Often we want our code to wait for a job to complete before proceeding to other operations. We can do this using the wait method. We can also pass an on_status argument to the wait method, which tells it to wait until the job has reached that status.
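If custom behaviour is needed, such as giving up after a fixed time, a polling loop can be built on get_status alone. The following is a minimal sketch under that assumption. wait_for_status, its interval and timeout parameters are hypothetical and not part of the playmolecule package, and the built-in wait method remains the simpler choice when it is sufficient.

```python
import time


def wait_for_status(job, targets, interval=10.0, timeout=None):
    """Poll job.get_status until it returns one of `targets`.

    Raises TimeoutError if `timeout` seconds pass before that happens.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        status = job.get_status(_logger=False)  # silent status query
        if status in targets:
            return status
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError(f"job still in status {status!r} after {timeout} s")
        time.sleep(interval)
```

A call such as wait_for_status(job, (JobStatus.COMPLETED,), timeout=3600) would return once the job completes, or raise after an hour. The status is always checked once before the timeout is evaluated, so a job that is already in a target status returns immediately.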
Some apps will spawn secondary executions of other apps. An example of that is the SimpleRun app, which will spawn multiple secondary executions to perform the equilibration and production simulations necessary. These secondary executions are sometimes called children. A job can be submitted as the child of another job by passing it the execution ID of the parent job.
For app developers who create apps which spawn other jobs, it can also be useful to use the wait_children method, which waits for all the child jobs spawned by the current job.
You can also get all the children of a job as follows
# For old behaviour use return_dict=True
childDict = job.get_children(returnDict=True)
# New behaviour: Get all the children of the job
children, status = job.get_children()
# New behaviour: Get only the completed children
children, status = job.get_children(status=JobStatus.COMPLETED)
for child in children:
    child.retrieve()
If you have any suggestions for improvements or additional features which you would like to see in the PlayMolecule Python API, feel free to write to us.