# Run an app on SLURM

## Goal

Submit a PlayMolecule job to a SLURM cluster and let it run asynchronously.

## Minimal example

```python
from playmolecule.apps import proteinprepare

# Create the executable directory on shared storage
ed = proteinprepare(
    outdir="/shared/scratch/me/proteinprepare-3ptb",
    pdbid="3ptb",
)
# Submit to SLURM; the call returns immediately
ed.run(queue="slurm", partition="normalCPU", ncpu=1, ngpu=0)
```

`outdir` must be on a filesystem visible to all SLURM nodes. The call returns immediately; the job runs on a worker.

## Parameters that matter

Pass `queue="slurm"` to {py:meth}`~playmolecule.ExecutableDirectory.run`; every other keyword is forwarded to the SLURM submission:

| Parameter      | Type                 | What it does                                                                                   |
|----------------|----------------------|------------------------------------------------------------------------------------------------|
| `partition`    | `str` or `list[str]` | Partition (queue) to run on. Pass a list and the partition offering the earliest start is used. |
| `ncpu`         | `int`                | CPUs requested. Defaults to the app manifest's `resources.ncpu`.                               |
| `ngpu`         | `int`                | GPUs requested. Defaults to the app manifest's `resources.ngpu`.                               |
| `memory`       | `int`                | RAM in MiB.                                                                                    |
| `gpumemory`    | `int`                | Minimum GPU memory in MiB (requires `gpu_mem` SLURM feature).                                  |
| `walltime`     | `int`                | Timeout in seconds.                                                                            |
| `priority`     | `str`                | SLURM priority class.                                                                          |
| `jobname`      | `str`                | Job identifier shown in `squeue`.                                                              |
| `nodelist`     | `list[str]`          | Whitelist of nodes. **Jobs will be duplicated** across them, not load-balanced.                |
| `exclude`      | `list[str]`          | Blacklist of nodes.                                                                            |
| `envvars`      | `str`                | Comma-separated env vars to propagate from the submit node to the worker.                      |
| `prerun`       | `list[str]`          | Shell commands run on the worker before the container starts (e.g., `module load apptainer`).  |
| `mailtype`     | `str`                | Events that trigger an email: `BEGIN,END,FAIL,...`.                                            |
| `mailuser`     | `str`                | Email address for `mailtype`.                                                                  |
| `outputstream` | `str`                | SLURM stdout file path.                                                                        |
| `errorstream`  | `str`                | SLURM stderr file path.                                                                        |

When `ncpu` / `ngpu` aren't passed explicitly, PlayMolecule reads them from the app manifest's resource defaults. Override them only when you want to deviate from those defaults.

## Preset the queue from the environment

Set the queue config once and `ed.run()` with no arguments will route to SLURM automatically:

```bash
export PM_QUEUE_CONFIG='{"queue": "slurm", "cpu_partition": "normalCPU", "gpu_partition": "normalGPU"}'
```

```python
ed.run()  # picks gpu_partition if the manifest requests GPUs, cpu_partition otherwise
```

Other keys in the JSON pass through as kwargs (e.g., `memory`, `walltime`).

## Check on the job

```python
print(ed.status)  # JobStatus.WAITING_INFO / RUNNING / COMPLETED / ERROR
```

See [Check job status](check-job-status.md) for the polling pattern.

## Gotchas

- `/tmp/` is *not* shared. If you set `outdir=/tmp/...`, your job will start and immediately fail when the worker can't read the inputs. Use shared storage.
- Logs go to wherever SLURM was configured to write them (and to `outdir/run_/`). Use `--output` / `outputstream` to override.
- The submitting Python process does not need to stay alive; the job is owned by SLURM. Status queries work from any process by reconstructing the {py:class}`~playmolecule.ExecutableDirectory` from `dirname`.

## Side note: `ed.slurm(...)`

`ed.slurm(partition=..., ncpu=..., ...)` is a thin alias for `ed.run(queue="slurm", ...)` retained for backwards compatibility. New code should prefer `run(queue="slurm")` so the same call style works for local, SLURM, and HTTP backends, and so `PM_QUEUE_CONFIG` can drop the kwargs entirely.

## See also

- {py:meth}`~playmolecule.ExecutableDirectory.run`
- {py:meth}`~playmolecule.ExecutableDirectory.slurm`
- [Run many jobs on one GPU](run-many-jobs-on-one-gpu.md)
- [Check job status](check-job-status.md)
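
## Sketch: building `PM_QUEUE_CONFIG` from Python

The `PM_QUEUE_CONFIG` preset described under "Preset the queue from the environment" is a JSON object, so it can be assembled with `json.dumps` instead of being hand-quoted in a shell. What follows is a minimal sketch, not part of PlayMolecule: the helper name `build_queue_config` is hypothetical, the `memory` and `walltime` values are placeholders, and it assumes the variable only needs to be set in the process environment before `ed.run()` is called (this page does not state exactly when it is read).

```python
import json
import os


def build_queue_config(cpu_partition, gpu_partition, **extra):
    """Return a JSON string for PM_QUEUE_CONFIG (hypothetical helper).

    The three fixed keys are the ones shown on this page; anything passed
    in ``extra`` is serialized alongside them.
    """
    config = {
        "queue": "slurm",
        "cpu_partition": cpu_partition,
        "gpu_partition": gpu_partition,
    }
    config.update(extra)
    return json.dumps(config)


# Placeholder values: 8192 MiB of RAM and a one-hour walltime (in seconds).
os.environ["PM_QUEUE_CONFIG"] = build_queue_config(
    "normalCPU", "normalGPU", memory=8192, walltime=3600
)
```

Generating the value this way sidesteps shell-quoting mistakes in the JSON and keeps the extra keys (here `memory` and `walltime`) next to the code that submits the job.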