Performance and benchmarks
ACEMD’s simulation throughput depends primarily on the GPU. The numbers below show steady-state nanoseconds-per-day for three reference systems on consumer NVIDIA cards under sustained load.
Speed benchmarks
| Device | Force field | DHFR | FactorIX | STMV |
|---|---|---|---|---|
| | AMBER | 500 | 120 | 6.86 |
| | CHARMM | 467 | 116 | 7.48 |
| | AMBER | 623 | 180 | 9.94 |
| | CHARMM | 600 | 163 | 10.7 |
| | AMBER | 1040 | 313 | 15.4 |
| | CHARMM | 979 | 296 | 17.0 |
| | AMBER | 1308 | 434 | 22.4 |
| | CHARMM | 1258 | 416 | 25.1 |
| | AMBER | 1810 | 777 | 59.2 |
| | CHARMM | 1772 | 711 | 58.4 |
Numbers are ns/day on a single GPU at typical production settings.
The input files for the three benchmark systems are in acemd_benchmarks.zip.
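As a reading aid, an ns/day figure converts directly into the wall-clock time needed for a target trajectory length. The sketch below is a minimal illustration, not part of ACEMD; the function name `days_to_simulate` is hypothetical, and the rates are the lowest and highest AMBER DHFR values from the table above.

```python
def days_to_simulate(target_ns: float, ns_per_day: float) -> float:
    """Wall-clock days needed to reach target_ns at a sustained rate of ns_per_day."""
    if ns_per_day <= 0:
        raise ValueError("ns_per_day must be positive")
    return target_ns / ns_per_day


if __name__ == "__main__":
    target_ns = 1000.0  # 1 microsecond of simulated time
    # Lowest and highest AMBER DHFR rates from the speed table (ns/day).
    for rate in (500.0, 1810.0):
        days = days_to_simulate(target_ns, rate)
        print(f"{rate:g} ns/day: {days:.2f} days per microsecond")
```

The estimate assumes the run holds the steady-state rate for its whole duration and ignores start-up and I/O overhead.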
Benchmark systems
| System | Atoms | Box (Å) |
|---|---|---|
| DHFR (Dihydrofolate reductase) | 23,558 | 62.23 × 62.23 × 62.23 |
| FactorIX (Factor IX) | 90,906 | 142.09 × 83.34 × 78.68 |
| STMV (Satellite tobacco mosaic virus) | 1,067,095 | 221.2 × 223.2 × 224.5 |
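Because the three systems differ in size by almost two orders of magnitude, one way to compare them is a size-normalized throughput: atoms multiplied by ns/day. The sketch below is illustrative only. It uses the first AMBER row of the speed table, assumes the 1,067,095-atom system is STMV, and the metric name `atom_ns_per_day` is a hypothetical label rather than an ACEMD quantity.

```python
# Atom counts from the benchmark systems table.
ATOMS = {"DHFR": 23_558, "FactorIX": 90_906, "STMV": 1_067_095}

# ns/day from the first AMBER row of the speed table.
NS_PER_DAY = {"DHFR": 500.0, "FactorIX": 120.0, "STMV": 6.86}


def atom_ns_per_day(system: str) -> float:
    """Size-normalized throughput: number of atoms times simulated ns per day."""
    return ATOMS[system] * NS_PER_DAY[system]


if __name__ == "__main__":
    for name in ATOMS:
        millions = atom_ns_per_day(name) / 1e6
        print(f"{name}: {millions:.2f} million atom-ns/day")
```

A roughly constant value across systems would indicate throughput scaling linearly with atom count; a lower value for the largest system suggests the per-atom cost rises with system size on that device.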
Benchmark conditions
Force field: AMBER ff99SB or CHARMM 36 with TIP3P water.
Electrostatics: Particle-Mesh Ewald, grid spacing < 1.0 Å, real-space cutoff 9.0 Å.
van der Waals: cutoff 9.0 Å; switching function disabled for AMBER, switching distance 7.5 Å for CHARMM.
Constraints: H-bond constraints + rigid water (tolerance 1 × 10⁻⁶).
Integrator and thermostat: 4 fs timestep, 298.15 K Langevin thermostat (friction 0.1 ps⁻¹), hydrogen mass repartitioning (HMR) with hydrogenmass = 4.0 amu.
Output: trajectory every 100 ps (25,000 steps).
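The output interval and the throughput figures both follow from the 4 fs timestep. The sketch below checks that 100 ps corresponds to 25,000 steps and converts an ns/day figure into integration steps per second. The helper names are hypothetical and the code is independent of ACEMD itself.

```python
FS_PER_PS = 1_000
FS_PER_NS = 1_000_000
SECONDS_PER_DAY = 86_400


def steps_for(duration_ps: float, timestep_fs: float) -> int:
    """Number of integration steps that cover duration_ps at the given timestep."""
    return round(duration_ps * FS_PER_PS / timestep_fs)


def steps_per_second(ns_per_day: float, timestep_fs: float) -> float:
    """Integration steps per wall-clock second implied by an ns/day throughput."""
    steps_per_day = ns_per_day * FS_PER_NS / timestep_fs
    return steps_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    # Trajectory output interval from the benchmark conditions.
    print(steps_for(100, 4.0))  # 25000
    # 500 ns/day at a 4 fs timestep.
    print(round(steps_per_second(500, 4.0)))  # 1447
```

The same conversion shows why the timestep matters when comparing benchmarks: at 2 fs, the same number of steps per second would yield half the ns/day.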
How to read these numbers
These are single-GPU numbers. ACEMD scales across multiple GPUs in a single host, but the speed-up is typically sublinear because of inter-GPU communication.
Sustained throughput depends on cooling. Short, bursty benchmarks can run faster, but a long production run settles at the steady-state values above.
NNP and NNP/MM runs are slower than pure-MM runs at the same atom count. The exact slowdown depends on the size of the NNP-handled subsystem and the model variant.