Check job status#
Goal#
Find out whether a PlayMolecule job is queued, running, finished, or failed — and wait for it to finish if needed.
Minimal example#
from playmolecule import JobStatus
from playmolecule.apps import proteinprepare
ed = proteinprepare(outdir="/shared/scratch/me/run", pdbid="3ptb")
ed.run(queue="slurm", partition="normalCPU", ncpu=1, ngpu=0)
print(ed.status) # JobStatus.WAITING_INFO immediately after submission
The four states#
JobStatus is an IntEnum with four members:
State |
Value |
Meaning |
|---|---|---|
|
0 |
Submitted but not yet running. Default for fresh jobs. |
|
1 |
Container has started and is making progress. |
|
2 |
Exited cleanly. Outputs are ready. |
|
3 |
Exited with a non-zero status or stalled past the heartbeat timeout. |
For the lifecycle and the heartbeat mechanism, see Job lifecycle.
Poll until done#
import time
while ed.status not in (JobStatus.COMPLETED, JobStatus.ERROR):
time.sleep(30)
if ed.status == JobStatus.ERROR:
raise RuntimeError("Job failed — check outdir/run_*/")
Use time.sleep(30) or longer for SLURM jobs; one second of polling per minute of runtime is plenty and keeps the controller node sane.
Query status from a different Python process#
A common pattern is to submit a job from a notebook and later check on it from a fresh shell. Rebuild the ExecutableDirectory by pointing it at the existing directory:
from playmolecule import ExecutableDirectory
ed = ExecutableDirectory(dirname="/shared/scratch/me/run")
print(ed.status)
The constructor finds the latest run_<id>/ subdirectory inside dirname and reattaches. This works for local, SLURM, and HTTP-backend jobs — the status query is dispatched by the active execution backend.
Compare with state directly#
JobStatus is an IntEnum, so:
if ed.status == JobStatus.RUNNING:
...
works as expected. str(ed.status) returns a human label ("Running", etc.) via describe().
Gotchas#
After SLURM marks a job
COMPLETED, the worker may still be flushing buffered output. Wait a second or two before reading the result files if your script raced through.The
WAITING_INFOstate is also returned when the controller hasn’t received any heartbeat yet — there’s no way to distinguish “queued” from “I haven’t heard from the worker yet” from this enum. SLURM’ssqueueis the source of truth for queue state.The HTTP backend derives a job ID from the
outdirpath. If you move or rename the directory after submission, status queries will fail to find the job.
See also#
JobStatus