Debug a simulation crash#

You will learn: how to diagnose the most common ACEMD failure — openmm.OpenMMException: Particle coordinate is nan — and recover the trajectory frames immediately before the crash.

Prerequisites:

  • A simulation that crashes.

  • A visualizer (VMD, PyMOL, or similar) to inspect frames.

Step 1 — Increase minimization#

If the crash happens right after minimization, the most common cause is an unresolved atomic clash. Bump the minimization steps:

input.yaml#
minimize: 5000

If that fixes it, the original minimize value was too small to relax the starting geometry.
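To confirm that a clash is actually present before raising minimize further, you can scan the starting coordinates for atom pairs that sit unphysically close. The following is a minimal sketch and not part of ACEMD: it assumes the starting coordinates are already loaded into an (N, 3) NumPy array in Å with whatever reader you prefer. The function name find_clashes and the 0.8 Å cutoff (chosen to sit below typical covalent bond lengths to hydrogen) are illustrative, and periodic images are ignored.

```python
import numpy as np

def find_clashes(coords, cutoff=0.8):
    """Return (i, j, distance) for atom pairs closer than `cutoff` (in Å)."""
    coords = np.asarray(coords, dtype=float)
    # All pairwise difference vectors; O(N^2) memory, so use on small systems
    # or on a subset of atoms (this is an assumption of the sketch).
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Upper triangle only, so each pair is reported once and i != j
    i, j = np.triu_indices(len(coords), k=1)
    close = dist[i, j] < cutoff
    return [(int(a), int(b), float(dist[a, b])) for a, b in zip(i[close], j[close])]

# Toy coordinates: atoms 0 and 2 are nearly on top of each other
toy = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.1, 0.0, 0.0]])
clashes = find_clashes(toy)
print(clashes)
```

Any pair this reports is a candidate for the clash that minimization failed to relax.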

Step 2 — Capture every frame right before the crash#

If minimization doesn’t help — or if the crash happens during production — write the trajectory at every step so you can see the frames leading up to the failure:

input.yaml#
trajectoryperiod: 1
stepzero: true

stepzero: true writes the starting frame; trajectoryperiod: 1 writes one frame per integration step.

Warning

These settings collapse simulation throughput by orders of magnitude — disk I/O dominates. Use them only to debug; remove them once you’ve found the cause.

Re-run the simulation. The last frames of output.xtc now show the integration steps immediately before the crash.

Step 3 — Inspect the trajectory#

Open output.xtc in VMD or PyMOL. Look for:

  • A bond stretching to an unphysical length.

  • An atom flying away from its neighbours between frames.

  • Two residues overlapping.

The atoms or bonds with the largest unphysical motion point at the cause.
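When the system is too large to spot the offending atom by eye, ranking atoms by their largest frame-to-frame displacement can narrow the search. This is a hedged sketch rather than an ACEMD feature: it assumes the trajectory has already been read into a NumPy array of shape (n_frames, n_atoms, 3), and largest_movers is a hypothetical helper name.

```python
import numpy as np

def largest_movers(frames, top=3):
    """Rank atoms by their biggest single-step displacement.

    frames: array of shape (n_frames, n_atoms, 3).
    Returns (atom_index, frame_index, displacement) tuples, largest first,
    where frame_index is the frame in which the atom arrives after the jump.
    """
    frames = np.asarray(frames, dtype=float)
    # Displacement of every atom between consecutive frames: (n_frames - 1, n_atoms)
    step = np.linalg.norm(np.diff(frames, axis=0), axis=-1)
    # NaN coordinates would poison the ranking; treat them as infinite jumps
    step = np.where(np.isnan(step), np.inf, step)
    peak_frame = step.argmax(axis=0)   # transition where each atom moved most
    peak_value = step.max(axis=0)
    order = np.argsort(peak_value)[::-1][:top]
    return [(int(a), int(peak_frame[a]) + 1, float(peak_value[a])) for a in order]

# Toy trajectory: slow drift along x, then atom 1 jumps 5 Å in the last frame
frames = np.zeros((4, 3, 3))
frames[:, :, 0] = np.arange(4)[:, None] * 0.01
frames[3, 1, 2] += 5.0
movers = largest_movers(frames, top=1)
print(movers)
```

Start the visual inspection at the reported atom and frame, then look at its bonded neighbours.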

Note

Atoms “leaving” the box are not a sign of a crash — ACEMD writes unwrapped trajectories despite using periodic boundary conditions. See the moleculekit wrap tutorial: https://software.acellera.com/moleculekit/howto/wrap-trajectories.html.

Step 4 — If the cause isn’t visible, dump forces too#

Adding trajforcefile writes per-atom forces (kcal/mol/Å) every trajectoryperiod steps. The file reuses the XTC format as a force container: the coordinate fields hold force vectors instead of positions.

input.yaml#
trajforcefile: "forces.xtc"

To visualize the forces as arrows on top of the trajectory, use view_forces() — see Visualize forces for the recipe.
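A numeric ranking can complement the visual check. The sketch below assumes the force file has already been read into a NumPy array of shape (n_frames, n_atoms, 3). The loading step is left out because it depends on your reader, and top_forces is a hypothetical helper, not an ACEMD or moleculekit function. It picks the last frame in which every force is finite and lists the atoms with the largest force magnitudes.

```python
import numpy as np

def top_forces(forces, top=5):
    """Return (frame_index, [(atom_index, magnitude), ...]) for the last
    frame whose forces are all finite, largest magnitude first."""
    forces = np.asarray(forces, dtype=float)
    finite = np.isfinite(forces).all(axis=(1, 2))   # frames without NaN or inf
    if not finite.any():
        raise ValueError("no frame with finite forces")
    last = np.flatnonzero(finite)[-1]
    mag = np.linalg.norm(forces[last], axis=-1)
    order = np.argsort(mag)[::-1][:top]
    return int(last), [(int(a), float(mag[a])) for a in order]

# Toy data: frame 2 is all NaN (the crash); in frame 1, atom 2 feels a huge force
forces = np.ones((3, 4, 3))
forces[1, 2] = [300.0, 0.0, 0.0]
forces[2] = np.nan
frame, ranked = top_forces(forces, top=2)
print(frame, ranked)
```

The atoms at the top of the list in the last finite frame are the first candidates to inspect.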

Common causes#

The five most common causes of NaN crashes:

1. Starting-coordinate clashes#

System builders (CHARMM psfgen, AMBER tleap) can produce small atom clashes if you don’t watch their output. More minimization usually fixes these; if not, fix the build.

2. Mis-parameterised small molecule#

If the unphysical forces are concentrated on a small molecule (a non-canonical residue from the builder), the issue is its parameters. Re-parameterise the molecule.

3. Wrong box dimensions#

If the initial boxsize is far from the actual extent of the starting coordinates and the run is NVT, the simulation can’t recover. NPT runs adapt the box, but if the gap is large the barostat may not catch up before the crash. Check the box volume in output.csv — large fluctuations early on are a tell.

The standard workflow: an NPT equilibration sets the right box, and the production NVT (or NPT) uses the final box from equilibration.
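A quick way to spot a mismatched box before running is to compare the box edges with the extent of the starting coordinates. The sketch below is illustrative only: it assumes an orthorhombic box given as three edge lengths in Å and coordinates in an (N, 3) NumPy array, and box_gap is a hypothetical helper name.

```python
import numpy as np

def box_gap(coords, box):
    """Per-axis difference between the box edge and the coordinate extent (Å).

    Large positive values suggest empty space inside the box; negative values
    mean the atoms span more than the box edge along that axis.
    """
    coords = np.asarray(coords, dtype=float)
    extent = coords.max(axis=0) - coords.min(axis=0)
    return np.asarray(box, dtype=float) - extent

# Toy system spanning 50 Å on every axis, with a box that is too long in z
coords = np.array([[0.0, 0.0, 0.0], [50.0, 50.0, 50.0], [25.0, 10.0, 40.0]])
gap = box_gap(coords, box=[52.0, 52.0, 80.0])
print(gap)
```

What counts as an acceptable gap depends on how the system was built, so treat the numbers as a prompt to look closer rather than a pass/fail test.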

4. Strong external forces#

Typical culprits are restraints with high k values, narrow fbWidth boxes, or NNP forces in an extrapolation regime. Run the simulation without extforces to isolate the cause: if it runs, your restraint setup is too aggressive. Lower the force constant, widen the box, or fix the selection.

5. NNP instability#

Neural network potentials can extrapolate poorly outside their training data. Try:

  • Lowering timestep to 1 fs.

  • Switching to a different NNP — newer AceFF releases improve coverage.

See also#