Check job status#

Goal#

Find out whether a PlayMolecule job is queued, running, finished, or failed — and wait for it to finish if needed.

Minimal example#

from playmolecule import JobStatus
from playmolecule.apps import proteinprepare

ed = proteinprepare(outdir="/shared/scratch/me/run", pdbid="3ptb")
ed.run(queue="slurm", partition="normalCPU", ncpu=1, ngpu=0)

print(ed.status)         # JobStatus.WAITING_INFO immediately after submission

The four states#

JobStatus is an IntEnum with four members:

State

Value

Meaning

WAITING_INFO

0

Submitted but not yet running. Default for fresh jobs.

RUNNING

1

Container has started and is making progress.

COMPLETED

2

Exited cleanly. Outputs are ready.

ERROR

3

Exited with a non-zero status or stalled past the heartbeat timeout.

For the lifecycle and the heartbeat mechanism, see Job lifecycle.

Poll until done#

import time

while ed.status not in (JobStatus.COMPLETED, JobStatus.ERROR):
    time.sleep(30)

if ed.status == JobStatus.ERROR:
    raise RuntimeError("Job failed — check outdir/run_*/")

Use time.sleep(30) or longer for SLURM jobs; one second of polling per minute of runtime is plenty and keeps the controller node sane.

Query status from a different Python process#

A common pattern is to submit a job from a notebook and later check on it from a fresh shell. Rebuild the ExecutableDirectory by pointing it at the existing directory:

from playmolecule import ExecutableDirectory

ed = ExecutableDirectory(dirname="/shared/scratch/me/run")
print(ed.status)

The constructor finds the latest run_<id>/ subdirectory inside dirname and reattaches. This works for local, SLURM, and HTTP-backend jobs — the status query is dispatched by the active execution backend.

Compare with state directly#

JobStatus is an IntEnum, so:

if ed.status == JobStatus.RUNNING:
    ...

works as expected. str(ed.status) returns a human label ("Running", etc.) via describe().

Gotchas#

  • After SLURM marks a job COMPLETED, the worker may still be flushing buffered output. Wait a second or two before reading the result files if your script raced through.

  • The WAITING_INFO state is also returned when the controller hasn’t received any heartbeat yet — there’s no way to distinguish “queued” from “I haven’t heard from the worker yet” from this enum. SLURM’s squeue is the source of truth for queue state.

  • The HTTP backend derives a job ID from the outdir path. If you move or rename the directory after submission, status queries will fail to find the job.

See also#