Pass input files to an app#
Goal#
Give an app one of your own files (a PDB, an SDF, a directory of trajectories, …) as a parameter.
Minimal example#
from playmolecule.apps import proteinprepare
ed = proteinprepare(
    outdir="out",
    pdbfile="./inputs/3ptb.pdb",
)
ed.run()
How file parameters are handled#
When you pass a string or Path for a parameter typed Path in the manifest, PlayMolecule:
Resolves it to an absolute path on your host.
Copies (or symlinks; see below) it into outdir/run_<id>/ under the same basename.
Rewrites the input JSON so the in-container path points to the staged copy.
Result: the container sees the file at a predictable path, and you keep the original. Passing a path that doesn’t exist raises an error immediately, before the container starts.
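The three steps above can be sketched in plain Python. This is only an illustration of the idea, not PlayMolecule's actual implementation: stage_input, rewrite_params, and the /data mount point are hypothetical names chosen for the example.

```python
import shutil
from pathlib import Path, PurePosixPath


def stage_input(value, run_dir, symlink=False):
    """Stage one file or directory into run_dir and return the staged path.

    Hypothetical helper: it mirrors the documented behaviour, not the real code.
    """
    src = Path(value).resolve()  # step 1: absolute path on the host
    if not src.exists():
        # Fail early, before any container would be started.
        raise FileNotFoundError(f"input does not exist: {src}")
    run_dir = Path(run_dir)
    run_dir.mkdir(parents=True, exist_ok=True)
    dst = run_dir / src.name  # step 2: same basename inside the run directory
    if symlink:
        dst.symlink_to(src, target_is_directory=src.is_dir())
    elif src.is_dir():
        shutil.copytree(src, dst)
    else:
        shutil.copy2(src, dst)
    return dst


def rewrite_params(params, staged, container_root="/data"):
    """Return a copy of params whose staged entries use in-container paths.

    container_root is an assumed mount point for the run directory.
    """
    rewritten = dict(params)
    for name, dst in staged.items():
        # step 3: the container sees the staged copy, not the host original
        rewritten[name] = str(PurePosixPath(container_root) / dst.name)
    return rewritten
```

Because the existence check runs before anything is copied, a typo in a path fails in your Python session rather than inside the container.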
Copy vs symlink#
By default PlayMolecule copies inputs into the run directory. The copy is what makes a run directory reproducible: once outdir/run_<id>/ is built, it contains every byte the container will ever read, so you can tar it up, archive it alongside published results, or replay it months later — even if the originals on your host have moved, changed, or been deleted.
The trade-off is speed: copying multi-GB trajectories is slow. Set PM_SYMLINK=1 to symlink instead:
export PM_SYMLINK=1
With symlinks the run directory is not self-contained: it depends on the originals staying where they were when you called the app. If you delete or move the source files, the run directory’s inputs become dangling links. Use symlinks for fast iteration on large inputs; keep the default (copy) when reproducibility matters more than I/O.
Also don’t use symlinks when outdir is on a different filesystem than the source — some container runtimes won’t follow cross-mount symlinks.
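Both symlink caveats can be checked with the standard library before and after a run. The sketch below is not part of PlayMolecule; same_filesystem and dangling_links are hypothetical helper names. Comparing st_dev is a heuristic: it tells you whether two paths sit on the same device on the host, not how a particular container runtime will mount them.

```python
import os
from pathlib import Path


def same_filesystem(a, b):
    """True if both existing paths report the same device id (st_dev)."""
    return os.stat(a).st_dev == os.stat(b).st_dev


def dangling_links(run_dir):
    """Return the symlinks under run_dir whose targets no longer exist."""
    broken = []
    for root, dirs, files in os.walk(run_dir):
        for name in dirs + files:
            path = Path(root) / name
            # exists() follows the link, so it is False for a dangling one
            if path.is_symlink() and not path.exists():
                broken.append(path)
    return sorted(broken)
```

Running dangling_links on an old run directory before replaying it shows whether any of its inputs have since been moved or deleted.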
Pass a directory#
If the app parameter is typed Path and you give it a directory, the whole tree is staged the same way:
ed = some_app(outdir="out", trajdir="./trajectories/run42")
Pair with PM_SYMLINK=1 if the directory is large.
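Whether a directory counts as "large" is your call. One way to decide is to measure the tree before staging it. The helper below is a hypothetical convenience, not a PlayMolecule function; it sums regular file sizes and skips symlinks.

```python
import os


def tree_size_bytes(root):
    """Total size in bytes of the regular files under root (symlinks skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total
```

For example, tree_size_bytes("./trajectories/run42") / 2**30 gives the size in GiB, which you can weigh against how long a copy would take on your storage.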
Pass a file that’s already an app artifact#
When the file is bundled with the app (a trained model, a reference dataset), don’t copy it manually — use the artifact handle directly:
from playmolecule.apps import deepsite
deepsite(outdir="out", pdbid="3ptb", model=deepsite.artifacts.default).run()
See Use app artifacts.
Gotchas#
The string "." and relative paths are resolved against the current working directory at the time of the call, not at the time of run(). If you change directories between the two, you’ll surprise yourself.
For SLURM, the staged path must be readable by the compute node. That means outdir and (with symlinks) the original input must live on shared storage.
Some apps take parameters typed dict whose values reference files, for example proteinprepare’s residue_smiles. Those are not staged; the strings go into the input JSON as-is.
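To keep the working-directory gotcha out of your scripts, you can resolve relative paths yourself, once, and pass the absolute result to the app. The helper name absolute_input is hypothetical; it uses only pathlib and does not touch PlayMolecule.

```python
from pathlib import Path


def absolute_input(path):
    """Resolve path against the current working directory right now.

    The returned absolute path no longer depends on later os.chdir() calls.
    """
    return Path(path).resolve()
```

Calling absolute_input("./inputs/3ptb.pdb") at the top of a script and reusing the result makes it explicit which file is meant, even if the script changes directory later.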