# Artifacts and files

PlayMolecule has two related but distinct concepts for "things that live inside an app's container and can be used by it": **files** and **artifacts**. They share the same underlying class hierarchy but serve different purposes.

## Files: the raw inventory

Every app declares a `files` block in its manifest mapping logical paths to in-container paths:

```json
"files": {
  "tests/3ptb.pdb": "/app/files/tests/3ptb.pdb",
  "tests/web_content.pickle": "/app/files/tests/web_content.pickle"
}
```

These are exposed as `app.files`, a dict of file handles. You don't usually touch this directly. It's used internally to:

- Resolve test-config paths to actual file handles (`tests/3ptb.pdb` → the bundled PDB).
- Resolve `artifacts` entries (next section).

## Artifacts: the curated, callable surface

The `artifacts` block (also accepted as the older synonym `datasets`) declares which files are *meant to be used as inputs*:

```json
"artifacts": [
  {
    "name": "default",
    "path": "datasets/model_98acc.ckpt",
    "description": "DeepSite final model"
  }
]
```

These appear as attributes on `app.artifacts`:

```python
from playmolecule.apps import deepsite

deepsite.artifacts.default        # a callable file handle
deepsite.artifacts.default.path   # path inside the container
deepsite.artifacts.default.download("./local-copy")
```

You pass `deepsite.artifacts.default` directly as a function argument; PlayMolecule resolves it to the right path depending on the active execution backend (local mount, Docker bind, HTTP fetch).

## The summary

| Aspect           | `app.files`                         | `app.artifacts`                                        |
|------------------|-------------------------------------|--------------------------------------------------------|
| Source           | `files` block in manifest           | `artifacts` (or `datasets`) block in manifest          |
| Access           | dict keyed by logical path          | attribute access by curated name                       |
| Purpose          | wiring (tests, internal resolution) | user-facing: pass into app calls                       |
| Has `.name`?     | Logical path                        | Curated short name (no dots; must start with a letter) |
| Has description? | Usually no                          | Yes (from manifest)                                    |
| Downloadable?    | Yes (`.download()`)                 | Yes (`.download()`)                                    |

In short: `artifacts` is `files` filtered to the entries someone took the trouble to curate and name. Reach for `artifacts` unless you know you need the raw `files` dict.

## Backend-aware file handles

A file handle knows how to fetch its content for the active backend:

- **local registry**: plain filesystem path; `.download()` is a copy.
- **Docker registry** (Docker runtime): `.download()` shells out to `docker cp`.
- **Docker registry** (Apptainer runtime): `.download()` runs `apptainer exec` against the cached SIF and copies out.
- **HTTP backend**: `.download()` issues an authenticated GET to the backend.

You don't pick which one you got. The handle does the right thing for the current registry/runtime, which is why example code can use `app.artifacts.Foo` uniformly across installations.

## When you'd actually use `.download()`

The `download()` path is for **outside-the-app** consumers: say, you want to do an analysis in your own notebook with the same reference data the app uses. For arguments *to* the app, just pass the handle directly; don't download first.

## Gotchas

- Two artifacts in the same app cannot share a name. If they do, the loader overwrites silently.
- `download()` to a path that already exists will overwrite a file or wipe-and-recreate a directory. There is no "keep existing" mode.
- For Docker / Apptainer files, `download()` shells out, so it's slow for many small files. Prefer a single `download()` of a directory over a loop of per-file downloads.

## See also

- [Use app artifacts](../howto/use-app-artifacts.md)
- [Apps and manifests](apps-and-manifests.md)
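
## Example: checking artifact names before loading

Because duplicate artifact names are overwritten silently (see Gotchas), it can help to lint a manifest before publishing an app. The following is a minimal sketch, not part of PlayMolecule: `check_artifact_names` and `NAME_RE` are hypothetical names, and the regular expression is only an approximation of the naming rule from the summary table (no dots; must start with a letter). The loader's actual validation may be stricter or different.

```python
import re
from collections import Counter

# Assumed rule from the summary table: starts with a letter, contains no dots.
# This pattern is an illustration, not the loader's real check.
NAME_RE = re.compile(r"[A-Za-z][^.]*")


def check_artifact_names(manifest: dict) -> list:
    """Return human-readable problems found in a manifest's artifacts block."""
    # "datasets" is the older synonym for "artifacts".
    entries = manifest.get("artifacts", manifest.get("datasets", []))
    counts = Counter(entry.get("name", "") for entry in entries)

    problems = []
    for name, count in counts.items():
        if count > 1:
            # The loader would keep only one of these entries, silently.
            problems.append(f"duplicate artifact name {name!r} ({count} entries)")
        if not NAME_RE.fullmatch(name):
            problems.append(f"invalid artifact name {name!r}")
    return problems


# Usage with a hypothetical manifest fragment.
example_manifest = {
    "artifacts": [
        {"name": "default", "path": "datasets/model_98acc.ckpt"},
        {"name": "default", "path": "datasets/other.ckpt"},
    ]
}
for problem in check_artifact_names(example_manifest):
    print(problem)
```

Running a check like this in CI turns the silent overwrite into an explicit failure before the app reaches a registry.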