# Run an app on SLURM

## Goal

Submit a PlayMolecule job to a SLURM cluster and let it run asynchronously.

## Minimal example

```python
from playmolecule.apps import proteinprepare

ed = proteinprepare(
    outdir="/shared/scratch/me/proteinprepare-3ptb",
    pdbid="3ptb",
)
ed.run(queue="slurm", partition="normalCPU", ncpu=1, ngpu=0)
```

`outdir` must be on a filesystem visible to all SLURM nodes. The call returns immediately; the job runs on a worker.

## Parameters that matter

Pass `queue="slurm"` to {py:meth}`~playmolecule.ExecutableDirectory.run`; every other keyword is forwarded to the SLURM submission:

| Parameter      | Type                | What it does                                                                                                          |
|----------------|---------------------|-----------------------------------------------------------------------------------------------------------------------|
| `partition`    | `str` or `list[str]`| SLURM partition to run on. Given a list, the partition offering the earliest start is used.                           |
| `ncpu`         | `int`               | CPUs requested. Defaults to the app manifest's `resources.ncpu`.                                                      |
| `ngpu`         | `int`               | GPUs requested. Defaults to the app manifest's `resources.ngpu`.                                                      |
| `memory`       | `int`               | RAM in MiB.                                                                                                           |
| `gpumemory`    | `int`               | Minimum GPU memory in MiB (requires `gpu_mem` SLURM feature).                                                         |
| `walltime`     | `int`               | Timeout in seconds.                                                                                                   |
| `priority`     | `str`               | SLURM priority class.                                                                                                 |
| `jobname`      | `str`               | Job identifier shown in `squeue`.                                                                                     |
| `nodelist`     | `list[str]`         | Whitelist of nodes — **jobs will be duplicated** across them, not load-balanced.                                      |
| `exclude`      | `list[str]`         | Blacklist of nodes.                                                                                                   |
| `envvars`      | `str`               | Comma-separated env vars to propagate from the submit node to the worker.                                             |
| `prerun`       | `list[str]`         | Shell commands run on the worker before the container starts (e.g., `module load apptainer`).                         |
| `mailtype`     | `str`               | Comma-separated SLURM events that trigger an email, e.g. `BEGIN,END,FAIL`.                                            |
| `mailuser`     | `str`               | Address that receives the `mailtype` notifications.                                                                   |
| `outputstream` | `str`               | SLURM stdout file path.                                                                                               |
| `errorstream`  | `str`               | SLURM stderr file path.                                                                                               |

When `ncpu` / `ngpu` aren't passed explicitly, PlayMolecule reads them from the app manifest's resource defaults. Override only when you want to deviate from them.
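
Because `memory` is given in MiB and `walltime` in seconds, it can help to convert from friendlier units before submitting. The following is a minimal sketch, assuming a hypothetical helper `slurm_resources` that is not part of PlayMolecule. It only builds the keyword dictionary; the submission itself is left as a comment because it needs a configured cluster.

```python
def slurm_resources(memory_gib, walltime_hours, **extra):
    """Build keyword arguments for ed.run(queue="slurm", ...).

    Hypothetical helper, not part of PlayMolecule: it only converts
    units to the ones listed in the table above (MiB and seconds).
    """
    kwargs = {
        "queue": "slurm",
        "memory": int(memory_gib * 1024),        # GiB -> MiB
        "walltime": int(walltime_hours * 3600),  # hours -> seconds
    }
    kwargs.update(extra)  # e.g. partition, jobname, prerun
    return kwargs


kwargs = slurm_resources(
    memory_gib=8,
    walltime_hours=2,
    partition="normalCPU",
    jobname="proteinprepare-3ptb",
    prerun=["module load apptainer"],
)
print(kwargs)

# With an ExecutableDirectory `ed` from the minimal example:
# ed.run(**kwargs)
```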

## Preset the queue from the environment

Set the queue config once and `ed.run()` with no arguments will route to SLURM automatically:

```bash
export PM_QUEUE_CONFIG='{"queue": "slurm", "cpu_partition": "normalCPU", "gpu_partition": "normalGPU"}'
```

```python
ed.run()    # picks gpu_partition if the manifest requests GPUs, cpu_partition otherwise
```

Other keys in the JSON pass through as kwargs (e.g., `memory`, `walltime`).
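
The same configuration can be built from Python, which avoids hand-quoting JSON in the shell. This is a sketch under one assumption: this page does not say when `PM_QUEUE_CONFIG` is read, so the variable is set before `playmolecule` is imported to be safe. The `memory` and `walltime` values are illustrative.

```python
import json
import os

queue_config = {
    "queue": "slurm",
    "cpu_partition": "normalCPU",
    "gpu_partition": "normalGPU",
    # Extra keys are passed through as keyword arguments (illustrative values).
    "memory": 8192,    # MiB
    "walltime": 7200,  # seconds
}

# Set this before importing playmolecule so it is visible whenever it is read.
os.environ["PM_QUEUE_CONFIG"] = json.dumps(queue_config)
print(os.environ["PM_QUEUE_CONFIG"])
```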

## Check on the job

```python
print(ed.status)        # JobStatus.WAITING_INFO / RUNNING / COMPLETED / ERROR
```

See [Check job status](check-job-status.md) for the polling pattern.

## Gotchas

- `/tmp/` is *not* shared. If you set `outdir=/tmp/...` your job will start and immediately fail when the worker can't read the inputs. Use shared storage.
- Logs go to wherever SLURM was configured to write them, and also to `outdir/run_<id>/`. To override the locations, pass `outputstream` and `errorstream` (the counterparts of `sbatch --output` and `--error`).
- The submitting Python process does not need to stay alive, because SLURM owns the job. Status queries work from any process by reconstructing the {py:class}`~playmolecule.ExecutableDirectory` from `dirname`.
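
The `/tmp/` pitfall can be caught before submission with a small pre-flight check. This is a sketch, assuming a hypothetical helper `is_node_local` and an illustrative list of node-local prefixes; which paths are actually shared depends on your cluster.

```python
from pathlib import PurePosixPath

# Illustrative prefixes that are typically local to each node; adjust for your cluster.
NODE_LOCAL_ROOTS = ("/tmp", "/var/tmp", "/dev/shm")


def is_node_local(outdir, roots=NODE_LOCAL_ROOTS):
    """Return True if outdir is equal to, or nested under, a node-local root."""
    path = PurePosixPath(outdir)
    for root in roots:
        root_path = PurePosixPath(root)
        if path == root_path or root_path in path.parents:
            return True
    return False


print(is_node_local("/tmp/proteinprepare-3ptb"))
print(is_node_local("/shared/scratch/me/proteinprepare-3ptb"))
```

Calling this on `outdir` before `ed.run(...)` and raising an error on `True` turns a failure on the worker into an immediate one on the submit node.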

## Side note: `ed.slurm(...)`

`ed.slurm(partition=..., ncpu=..., ...)` is a thin alias for `ed.run(queue="slurm", ...)` retained for backwards compatibility. New code should prefer `run(queue="slurm")` so the same call style works for local, SLURM, and HTTP backends, and so `PM_QUEUE_CONFIG` can drop the kwargs entirely.

## See also

- {py:meth}`~playmolecule.ExecutableDirectory.run`
- {py:meth}`~playmolecule.ExecutableDirectory.slurm`
- [Run many jobs on one GPU](run-many-jobs-on-one-gpu.md)
- [Check job status](check-job-status.md)