Architecture
PlayMolecule’s design separates where apps come from (manifest discovery) from where jobs run (execution). Both are pluggable. Understanding the split is the key to picking the right environment variables and predicting what a given call will do.
Two orthogonal backend axes
graph LR
subgraph Registries [Manifest backend]
D[docker://<registry><br/>Docker registry]
H[http://<url><br/>PlayMolecule HTTP backend]
L[local:<path><br/>filesystem]
end
subgraph Executors [Execution backend]
EL[Local<br/>docker run / apptainer run<br/>--<br/>direct or via SLURM sbatch]
EH[HTTP<br/>POST to backend]
end
D --> EL
H --> EH
L --> EL
Left column: where apps are discovered (PM_REGISTRIES). Right column: where jobs run (PM_EXECUTOR).
- Manifest backend is chosen by PM_REGISTRIES. It answers “what apps exist and what are their parameters?”. One of: docker://, http://, or local:.
- Execution backend is chosen by PM_EXECUTOR. It answers “when I called .run(), where does the container actually start?”. One of local (default) or an http:// URL.
The two are independent. A common production setup mixes them — e.g., discover apps from a docker:// registry but execute remotely through an http:// backend; or browse an http:// backend’s catalogue locally without ever submitting through it.
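To make the independence of the two axes concrete, the sketch below classifies a PM_REGISTRIES value and a PM_EXECUTOR value by prefix. It is not PlayMolecule's own parsing code: the function names and the example registry and backend URLs are hypothetical, and it assumes each variable holds a single value.

```python
def manifest_backend(registry: str) -> str:
    """Classify a PM_REGISTRIES-style value by its prefix (illustrative only)."""
    if registry.startswith("docker://"):
        return "docker"
    if registry.startswith("http://"):
        return "http"
    if registry.startswith("local:"):
        return "local"
    raise ValueError(f"unrecognised registry: {registry!r}")


def execution_backend(executor: str = "local") -> str:
    """Classify a PM_EXECUTOR-style value: 'local' (default) or an http:// URL."""
    if executor == "local":
        return "local"
    if executor.startswith("http://"):
        return "http"
    raise ValueError(f"unrecognised executor: {executor!r}")


# Hypothetical values: discover apps from a Docker registry, execute remotely.
env = {
    "PM_REGISTRIES": "docker://registry.example.com",
    "PM_EXECUTOR": "http://backend.example.com",
}
selection = (
    manifest_backend(env["PM_REGISTRIES"]),
    execution_backend(env.get("PM_EXECUTOR", "local")),
)
print(selection)  # ('docker', 'http')
```

Each function looks at only one variable, which is the point: changing where apps are discovered does not change where jobs run, and vice versa.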
What each backend does
Manifest backends
| Backend | When you’d use it | How it discovers apps |
|---|---|---|
| docker:// | Default. Pulls Acellera’s released apps from a container registry. | Lists images in the configured Docker registry. Reads each image’s manifest label. |
| http:// | When jobs are dispatched to a remote PlayMolecule backend. | Hits the backend’s catalogue endpoint and decodes the JSON. |
| local: | Developer use only: when you’re writing your own app and want to iterate on its manifest without publishing. | Scans the filesystem path given after local:. |
The output of all three is the same shape — a {appname: {version: {manifest, files, run.sh}}} dict — so the rest of the system doesn’t care which one ran.
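As a rough illustration of that shared shape, the sketch below builds a small catalogue by hand and walks it. The app entry, the version label, and the contents of each leaf are placeholders chosen for the example, not real manifest data, and list_apps is a hypothetical helper.

```python
# Hypothetical catalogue in the {appname: {version: {manifest, files, run.sh}}} shape.
catalogue = {
    "proteinprepare": {
        "v1": {
            "manifest": {"params": [{"name": "pdbid", "type": "str"}]},  # placeholder
            "files": {},                                                  # placeholder
            "run.sh": "#!/bin/bash\n",                                    # placeholder
        },
    },
}


def list_apps(catalogue: dict) -> list[tuple[str, str]]:
    """Return sorted (appname, version) pairs, whichever backend built the dict."""
    return sorted(
        (app, version)
        for app, versions in catalogue.items()
        for version in versions
    )


print(list_apps(catalogue))  # [('proteinprepare', 'v1')]
```

Because consumers only rely on the nesting, code like list_apps never needs to know whether the dict came from a Docker registry, an HTTP backend, or a local directory.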
Execution backends
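| Backend | How it runs jobs |
|---|---|
| local | Runs the container with docker run or apptainer run, either directly or via SLURM sbatch. |
| http:// | POSTs the prepared input JSON to the backend, polls for status, downloads outputs. |
| (SLURM is a mode of local) | |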
SLURM isn’t a separate execution backend — it’s an sbatch wrapper around the local execution path. SLURM workers still use Docker or Apptainer to run the container; PlayMolecule just generates the submission script.
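The idea of wrapping the local path in a submission script can be sketched as follows. This is not the script PlayMolecule generates: build_sbatch_script, the chosen #SBATCH options, and the image name are assumptions made for illustration.

```python
import shlex


def build_sbatch_script(job_name: str, container_cmd: list[str]) -> str:
    """Wrap a container command in a minimal sbatch submission script."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --output={job_name}.%j.out",  # %j expands to the SLURM job ID
        "",
        shlex.join(container_cmd),  # the same command the direct local path would run
    ]
    return "\n".join(lines) + "\n"


# Hypothetical image name; the SLURM worker still runs the container itself.
script = build_sbatch_script(
    "proteinprepare", ["apptainer", "run", "proteinprepare.sif"]
)
print(script)
```

The container command is passed in unchanged, which mirrors the point above: SLURM only changes how the command is scheduled, not what runs.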
The flow of a single call
sequenceDiagram
actor User
participant App as proteinprepare(...)
participant ED as ExecutableDirectory
participant Exec as execution backend
participant Container
User->>App: proteinprepare(outdir, pdbid='3ptb')
App->>App: validate args against manifest signature
App->>ED: set up run_<id>/ on disk, write input JSON
App-->>User: return ed (nothing has executed yet)
User->>ED: ed.run()
ED->>Exec: dispatch
Exec->>Container: docker run / apptainer run / HTTP POST
Container-->>Exec: outputs to outdir, exit code
Exec-->>ED: status
ED-->>User: control returns
The two-phase split — setup then run — is deliberate. It lets you:
- Inspect or tweak inputs in outdir/run_<id>/ before launching.
- Save an ed reference, submit it to SLURM, and check status hours later from a fresh process.
- Batch many eds into a single SLURM submission with slurm_mps().
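The setup-then-run pattern can be imitated with a small stand-in class. ExecutableDirectorySketch below is hypothetical and does not reproduce PlayMolecule's ExecutableDirectory; the inputs.json file name, the status strings, and the replacement PDB ID are assumptions, and no container is started.

```python
import json
import tempfile
from pathlib import Path


class ExecutableDirectorySketch:
    """Hypothetical stand-in: setup writes inputs to disk, run() executes later."""

    def __init__(self, outdir: Path, run_id: str, inputs: dict):
        self.run_dir = outdir / f"run_{run_id}"
        self.run_dir.mkdir(parents=True)
        self.input_file = self.run_dir / "inputs.json"  # assumed file name
        self.input_file.write_text(json.dumps(inputs))
        self.status = "prepared"  # nothing has executed yet

    def run(self) -> dict:
        # Re-read inputs from disk so edits made between setup and run are honoured.
        inputs = json.loads(self.input_file.read_text())
        self.status = "done"  # a real backend would start a container here
        return inputs


outdir = Path(tempfile.mkdtemp())
ed = ExecutableDirectorySketch(outdir, "0", {"pdbid": "3ptb"})
status_before = ed.status

# Tweak the inputs on disk before launching (arbitrary replacement ID).
ed.input_file.write_text(json.dumps({"pdbid": "1abc"}))
used = ed.run()
print(status_before, ed.status, used)
```

Because the inputs live on disk rather than in the object, a fresh process could in principle reconstruct the same run directory and pick up where the first one left off.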
Where configuration lives
| Source | Reads / writes |
|---|---|
| Environment variables | Single source of truth at import time. Listed in Environment variables. |
| App manifest | Per-app parameters, default resources, expected outputs. |
| run_<id>/ | The exact inputs sent to a specific run. |
| | HTTP backend session. |
| | SIF cache (Docker images converted on first use). |
Everything user-tunable is in env vars; everything app-tunable is in the manifest; everything specific to a run is in run_<id>/. There is no global mutable state inside PlayMolecule itself.