# Executable directory

An {py:class}`~playmolecule.ExecutableDirectory` (ED) is the on-disk artefact you get back from every app call. It is the unit PlayMolecule moves around, runs, polls, and re-uses. This page explains what's inside one, why the abstraction exists, and how it composes with SLURM and HTTP backends.

## The two-phase model

A PlayMolecule app call has two distinct phases:

1. **Setup**: `proteinprepare(outdir="out", pdbid="3ptb")` validates arguments against the manifest signature, stages input files into `out/run__/`, writes the input JSON, generates a run script, and returns an {py:class}`~playmolecule.ExecutableDirectory`. **No container has started yet.**
2. **Run**: `ed.run()` (optionally `ed.run(queue="slurm", ...)`) hands the prepared directory to an execution backend. Outputs land back in `outdir`.

The split exists because the two phases benefit from different environments:

- Setup wants to be **cheap** and **local**: you might do it in a notebook on your laptop.
- Run wants to be **wherever the resources are**: your laptop, a SLURM worker, a GPU node, the HTTP backend.

Decoupling them means you can set up hundreds of EDs in a script and then submit them in a batch, replay a single ED on a different cluster, or inspect prepared inputs before paying for compute.

## Layout on disk

```text
outdir/
├── output.pdb                           # produced by the run (later)
├── details.csv                          # produced by the run (later)
├── run_03_07_2026_14_22_a1b2c3d4.sh     # the rendered run script
└── run_03_07_2026_14_22_a1b2c3d4/       # the inputs dir for this run
    ├── inputs.json                      # input JSON consumed by the container
    ├── input-files-staged-here/         # copies/symlinks of file params
    ├── .pm.alive                        # heartbeat (see Job lifecycle)
    └── .pm.err                          # error sentinel (only if it failed)
```

Key properties:

- The **outdir** is the user-chosen location.
- The **run directory** has a fresh timestamp + UUID per call, so you can re-run the same ED and get parallel `run_*/` siblings.
- The **run script** lives next to the run directory; `runsh = inputs_dir.basename + ".sh"`.
- The directory is **self-contained**. If you `tar` it up, copy it to another machine, and reconstruct the ED there, `ed.run()` will work as long as the same registry/images are available.

## Reconstructing an ED from disk

```python
from playmolecule import ExecutableDirectory

ed = ExecutableDirectory(dirname="/shared/scratch/me/run")
print(ed.status)
ed.run()  # resume / re-run
```

The constructor finds the most recent `run_/` inside `dirname` and uses it as the inputs directory. This is what makes "submit on Monday, check status on Tuesday" work: there's no in-memory state required.

## Execution backend dispatch

`ed.run()` dispatches to whichever execution backend was active **when the ED was built**. That means:

- Setting up under `PM_EXECUTOR=local` and later changing `PM_EXECUTOR=http://...` does not move the job. The backend was captured at setup time.
- To switch, set up a new ED in a new process.

`ed.status` follows the same dispatch: local EDs are queried by reading the heartbeat file and the SLURM queue; HTTP EDs are queried by HTTP.

## The `slurm` shortcut

`ed.run(queue="slurm", ...)` wraps the prepared run directory in a `jobqueues` SLURM submission. Resources default to the values captured from the app manifest at setup time. (`ed.slurm(...)` is a thin alias retained for backwards compatibility.)

The execution backend isn't switched to "SLURM". SLURM is a mode of the local execution backend (it ultimately invokes the same `docker run` or `apptainer run`, just on a worker node).

## Batched MPS submission

{py:func}`~playmolecule.slurm_mps` takes a list of EDs and submits them as a single SLURM job that holds one GPU under NVIDIA MPS. The EDs are still independent on disk (each one writes to its own `outdir`), but the SLURM accounting collapses them. Resource defaults are taken from the **first** ED's `execution_resources`, not the union.
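
The "first ED, not the union" rule for batched submission can be pictured with a small standalone sketch. It does not import `playmolecule` and is not the library's implementation: `SketchED`, `batch_defaults`, `union_of_maxima`, and the resource keys `ncpu` and `memory` are all hypothetical names chosen for illustration, and "union" is read here as a per-key maximum, which is only one possible interpretation.

```python
from dataclasses import dataclass, field


@dataclass
class SketchED:
    """Hypothetical stand-in for an ED; not the real playmolecule class."""

    outdir: str
    execution_resources: dict = field(default_factory=dict)


def batch_defaults(eds):
    """Mimic the documented rule: defaults come from the first ED only."""
    if not eds:
        raise ValueError("need at least one ED")
    # Copy so callers cannot mutate the first ED's resources by accident.
    return dict(eds[0].execution_resources)


def union_of_maxima(eds):
    """What the batch does NOT do: take the per-key maximum over all EDs."""
    merged = {}
    for ed in eds:
        for key, value in ed.execution_resources.items():
            merged[key] = max(merged.get(key, value), value)
    return merged


# Two EDs with different (hypothetical) resource requests.
eds = [
    SketchED("out/a", {"ncpu": 2, "memory": 4000}),
    SketchED("out/b", {"ncpu": 8, "memory": 16000}),
]

defaults = batch_defaults(eds)
merged = union_of_maxima(eds)
print(defaults)  # {'ncpu': 2, 'memory': 4000}
print(merged)    # {'ncpu': 8, 'memory': 16000}
```

Under this reading, the larger request of the second ED is ignored when defaults are picked, so the order of the list matters when the EDs in a batch have different requirements.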
## Why not just return a dict?

You could imagine PlayMolecule returning `{"runsh": "...", "inputs_dir": "...", ...}` and dropping the class. The reasons it doesn't:

- `.status` needs to dispatch by execution backend. A dict can't do that without a wrapper.
- HTTP-backend jobs need to track their server-side job id between calls. A dict can't carry that.
- Polling code reads more naturally as `ed.status` than `ed["status"]`.

The ED is intentionally thin: almost everything it knows is in fields, and its methods (`run`, `slurm`, `status`) are dispatch shims to the active backend.

## See also

- [Architecture](architecture.md)
- [Job lifecycle](job-lifecycle.md)
- [Check job status](../howto/check-job-status.md)
- {py:class}`~playmolecule.ExecutableDirectory`