Executable directory#
An ExecutableDirectory (ED) is the on-disk artefact you get back from every app call. It is the unit PlayMolecule moves around, runs, polls, and re-uses. This page explains what’s inside one, why the abstraction exists, and how it composes with SLURM and HTTP backends.
The two-phase model#
A PlayMolecule app call has two distinct phases:
Setup: proteinprepare(outdir="out", pdbid="3ptb") validates arguments against the manifest signature, stages input files into out/run_<timestamp>_<uuid>/, writes the input JSON, generates a run script, and returns an ExecutableDirectory. No container has started yet.
Run: ed.run() (optionally ed.run(queue="slurm", ...)) hands the prepared directory to an execution backend. Outputs land back in outdir.
The split exists because the two phases benefit from different environments:
Setup wants to be cheap and local — you might do it in a notebook on your laptop.
Run wants to be wherever the resources are — your laptop, a SLURM worker, a GPU node, the HTTP backend.
Decoupling them means you can set up hundreds of EDs in a script and then submit them in a batch, replay a single ED on a different cluster, or inspect prepared inputs before paying for compute.
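The split can be illustrated with a small, self-contained sketch. It does not use the PlayMolecule API: `ToyED` and `toy_setup` are hypothetical stand-ins, the timestamp format is a guess based on the directory names shown on this page, and the "run" step only writes a marker file instead of starting a container.

```python
import json
import tempfile
import uuid
from dataclasses import dataclass
from datetime import datetime
from pathlib import Path


@dataclass
class ToyED:
    """Hypothetical stand-in for ExecutableDirectory; not the real class."""
    outdir: Path
    inputs_dir: Path
    runsh: Path

    def run(self) -> Path:
        # A real run would start a container; this toy only writes a marker file.
        params = json.loads((self.inputs_dir / "inputs.json").read_text())
        result = self.outdir / "output.txt"
        result.write_text(f"ran with {params}\n")
        return result


def toy_setup(outdir: Path, **params) -> ToyED:
    """Setup phase: stage inputs and write a run script, but execute nothing."""
    stamp = datetime.now().strftime("%d_%m_%Y_%H_%M")  # assumed format
    run_id = f"run_{stamp}_{uuid.uuid4().hex[:8]}"
    inputs_dir = outdir / run_id
    inputs_dir.mkdir(parents=True)
    (inputs_dir / "inputs.json").write_text(json.dumps(params))
    runsh = outdir / f"{run_id}.sh"
    runsh.write_text("#!/bin/sh\necho 'placeholder run script'\n")
    return ToyED(outdir=outdir, inputs_dir=inputs_dir, runsh=runsh)


base = Path(tempfile.mkdtemp())

# Phase 1: cheap, local setup of several EDs.
eds = [toy_setup(base / f"out_{pdbid}", pdbid=pdbid) for pdbid in ("3ptb", "1abc", "2xyz")]

# Nothing has run yet, so the prepared inputs can be inspected first.
prepared = [json.loads((ed.inputs_dir / "inputs.json").read_text()) for ed in eds]

# Phase 2: run them as a batch.
results = [ed.run() for ed in eds]
```

Because each toy ED is fully described by files on disk, the second phase could equally be performed by a different process or machine.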
Layout on disk#
outdir/
├── output.pdb # produced by the run (later)
├── details.csv # produced by the run (later)
├── run_03_07_2026_14_22_a1b2c3d4.sh # the rendered run script
└── run_03_07_2026_14_22_a1b2c3d4/ # the inputs dir for this run
    ├── inputs.json # input JSON consumed by the container
    ├── input-files-staged-here/ # copies/symlinks of file params
    ├── .pm.alive # heartbeat; see Job lifecycle
    └── .pm.err # error sentinel (only if it failed)
Key properties:
The outdir is the user-chosen location.
The run directory has a fresh timestamp + UUID per call, so you can re-run the same ED and get parallel run_*/ siblings.
The run script lives next to the run directory; runsh = inputs_dir.basename + ".sh".
The directory is self-contained. If you tar it up, copy it to another machine, and reconstruct the ED there, ed.run() will work as long as the same registry/images are available.
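The naming rule and the portability claim can be checked with the standard library alone. The sketch below is illustrative: `run_script_for` is a hypothetical helper, not part of PlayMolecule, and the "other machine" is simulated by extracting the archive into a second temporary directory.

```python
import tarfile
import tempfile
from pathlib import Path


def run_script_for(inputs_dir: Path) -> Path:
    """Hypothetical helper: the run script sits beside the inputs dir, named <dir>.sh."""
    return inputs_dir.parent / (inputs_dir.name + ".sh")


base = Path(tempfile.mkdtemp())
outdir = base / "out"
inputs_dir = outdir / "run_03_07_2026_14_22_a1b2c3d4"
inputs_dir.mkdir(parents=True)
(inputs_dir / "inputs.json").write_text("{}")

runsh = run_script_for(inputs_dir)
runsh.write_text("#!/bin/sh\n")

# Pack the whole outdir, then unpack it elsewhere to mimic copying to another machine.
archive = base / "out.tar.gz"
with tarfile.open(archive, "w:gz") as tar:
    tar.add(outdir, arcname="out")

restored_root = base / "other_machine"
restored_root.mkdir()
with tarfile.open(archive, "r:gz") as tar:
    tar.extractall(restored_root)

# The same naming rule locates the run script in the restored copy.
restored_inputs = restored_root / "out" / inputs_dir.name
restored_runsh = run_script_for(restored_inputs)
```

Since the script path is derived from the inputs directory name rather than stored as an absolute path, the relationship survives the move in this toy model.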
Reconstructing an ED from disk#
from playmolecule import ExecutableDirectory
ed = ExecutableDirectory(dirname="/shared/scratch/me/run")
print(ed.status)
ed.run() # resume / re-run
The constructor finds the most recent run_<id>/ inside dirname and uses it as the inputs directory. This is what makes “submit on Monday, check status on Tuesday” work — there’s no in-memory state required.
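A minimal sketch of that lookup, under stated assumptions: `latest_run_dir` is a hypothetical helper, and "most recent" is taken to mean the latest modification time. The real constructor may instead rely on the timestamp embedded in the directory name.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def latest_run_dir(dirname: Path) -> Optional[Path]:
    """Return the most recently modified run_* directory, or None if there is none."""
    candidates = [p for p in dirname.glob("run_*") if p.is_dir()]
    if not candidates:
        return None
    # Assumption: recency is judged by mtime, not by parsing the name.
    return max(candidates, key=lambda p: p.stat().st_mtime)


base = Path(tempfile.mkdtemp())
old = base / "run_old_aaaa1111"
new = base / "run_new_bbbb2222"
old.mkdir()
new.mkdir()
# The sibling run script is a file, so the is_dir() filter skips it.
(base / "run_new_bbbb2222.sh").write_text("#!/bin/sh\n")

# Force distinct modification times so the ordering is deterministic.
os.utime(old, (1_000_000_000, 1_000_000_000))
os.utime(new, (2_000_000_000, 2_000_000_000))

chosen = latest_run_dir(base)
```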
Execution backend dispatch#
ed.run() dispatches to whichever execution backend was active when the ED was built. That means:
Setting up under PM_EXECUTOR=local and later changing PM_EXECUTOR=http://... does not move the job. The backend was captured at setup time.
To switch, set up a new ED in a new process.
ed.status follows the same dispatch — local EDs are queried by reading the heartbeat file and the SLURM queue; HTTP EDs are queried by HTTP.
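The capture-at-setup behaviour can be modelled in a few lines. Everything here is a toy: `LocalBackend`, `HTTPBackend`, `backend_from_env`, and `ToyED` are hypothetical names, the fallback to "local" when the variable is unset is an assumption, and the toy reads the variable when the object is constructed, which may not be the exact point at which the real library captures it.

```python
import os


class LocalBackend:
    """Toy local backend; the real one reads the heartbeat file and the SLURM queue."""
    name = "local"


class HTTPBackend:
    """Toy HTTP backend; the real one queries a server over HTTP."""
    name = "http"

    def __init__(self, url: str):
        self.url = url


def backend_from_env():
    # Assumption: an unset PM_EXECUTOR means "local".
    value = os.environ.get("PM_EXECUTOR", "local")
    if value.startswith(("http://", "https://")):
        return HTTPBackend(value)
    return LocalBackend()


class ToyED:
    def __init__(self):
        # The backend is resolved once and stored; later env changes are not re-read.
        self._backend = backend_from_env()

    @property
    def backend_name(self) -> str:
        return self._backend.name


os.environ["PM_EXECUTOR"] = "local"
ed = ToyED()

# Changing the variable afterwards does not move the existing ED.
os.environ["PM_EXECUTOR"] = "http://backend.example.invalid"
name_after_change = ed.backend_name
```

Storing the backend on the object is what lets both run and status dispatch consistently for the lifetime of that ED.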
The slurm shortcut#
ed.run(queue="slurm", ...) wraps the prepared run directory in a jobqueues SLURM submission. Resources default to the values captured from the app manifest at setup time. (ed.slurm(...) is a thin alias retained for backwards compatibility.)
The execution backend isn’t switched to “SLURM” — SLURM is a mode of the local execution backend (it ultimately invokes the same docker run or apptainer run, just on a worker node).
Batched MPS submission#
slurm_mps() takes a list of EDs and submits them as a single SLURM job that holds one GPU under NVIDIA MPS. The EDs are still independent on disk — each one writes to its own outdir — but the SLURM accounting collapses them. Resource defaults are taken from the first ED’s execution_resources, not the union.
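The "first ED, not the union" rule matters when the batched EDs request different resources. The sketch below contrasts the two policies on toy data; `ToyED`, `first_ed_defaults`, `union_defaults`, and the resource keys `ncpu` and `memory` are hypothetical, and "union" is interpreted here as a per-key maximum.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyED:
    """Hypothetical stand-in that carries only what this example needs."""
    outdir: str
    execution_resources: Dict[str, int] = field(default_factory=dict)


def first_ed_defaults(eds: List[ToyED]) -> Dict[str, int]:
    """Policy described above: copy the first ED's resources and ignore the rest."""
    if not eds:
        raise ValueError("need at least one ED")
    return dict(eds[0].execution_resources)


def union_defaults(eds: List[ToyED]) -> Dict[str, int]:
    """The policy that is NOT used: take the largest request per key."""
    merged: Dict[str, int] = {}
    for ed in eds:
        for key, value in ed.execution_resources.items():
            merged[key] = max(merged.get(key, value), value)
    return merged


eds = [
    ToyED("out_a", {"ncpu": 1, "memory": 4000}),
    ToyED("out_b", {"ncpu": 4, "memory": 8000}),
]

used = first_ed_defaults(eds)   # what the batch would default to
not_used = union_defaults(eds)  # what a union policy would have produced
```

If later EDs in the list need more than the first one, it is probably safest to order the list accordingly or to pass the resources explicitly.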
Why not just return a dict?#
You could imagine PlayMolecule returning {"runsh": "...", "inputs_dir": "...", ...} and dropping the class. The reasons it doesn’t:
.status needs to dispatch by execution backend. A dict can’t do that without a wrapper.
HTTP-backend jobs need to track their server-side job id between calls. A dict can’t carry that.
Polling code reads more naturally as ed.status than ed["status"].
The ED is intentionally thin: almost everything it knows is stored in fields, and run, slurm, and status are dispatch shims to the active backend.
See also#
ExecutableDirectory