# Performance and benchmarks

ACEMD's simulation throughput depends primarily on the GPU. The numbers below show steady-state throughput in nanoseconds per day (ns/day) for three reference systems on consumer NVIDIA cards under sustained load.

## Speed benchmarks

| Device                          | Force field | DHFR  | FactorIX | STMV |
|---------------------------------|-------------|------:|---------:|-----:|
| [NVIDIA GeForce GTX 1080][1]    | AMBER       | 500   | 120      | 6.86 |
|                                 | CHARMM      | 467   | 116      | 7.48 |
| [NVIDIA GeForce GTX 1080 Ti][2] | AMBER       | 623   | 180      | 9.94 |
|                                 | CHARMM      | 600   | 163      | 10.7 |
| [NVIDIA GeForce RTX 2080 Ti][3] | AMBER       | 1040  | 313      | 15.4 |
|                                 | CHARMM      | 979   | 296      | 17.0 |
| [NVIDIA GeForce RTX 3090][4]    | AMBER       | 1308  | 434      | 22.4 |
|                                 | CHARMM      | 1258  | 416      | 25.1 |
| [NVIDIA GeForce RTX 4090][5]    | AMBER       | 1810  | 777      | 59.2 |
|                                 | CHARMM      | 1772  | 711      | 58.4 |

All values are in ns/day, measured on a single GPU with the settings listed under "Benchmark conditions" below.
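
To compare GPU generations, the table can be reduced to speed-ups relative to a baseline card. The following is a minimal sketch that copies the AMBER rows from the table; the names `AMBER_NS_PER_DAY` and `speedup` are illustrative and not part of ACEMD.

```python
# ns/day values copied from the AMBER rows of the table above.
AMBER_NS_PER_DAY = {
    "GTX 1080":    {"DHFR": 500,  "FactorIX": 120, "STMV": 6.86},
    "GTX 1080 Ti": {"DHFR": 623,  "FactorIX": 180, "STMV": 9.94},
    "RTX 2080 Ti": {"DHFR": 1040, "FactorIX": 313, "STMV": 15.4},
    "RTX 3090":    {"DHFR": 1308, "FactorIX": 434, "STMV": 22.4},
    "RTX 4090":    {"DHFR": 1810, "FactorIX": 777, "STMV": 59.2},
}


def speedup(device, baseline="GTX 1080"):
    """Return the per-system throughput ratio of `device` over `baseline`."""
    base = AMBER_NS_PER_DAY[baseline]
    return {
        system: ns_per_day / base[system]
        for system, ns_per_day in AMBER_NS_PER_DAY[device].items()
    }


for device in AMBER_NS_PER_DAY:
    ratios = speedup(device)
    print(device, {system: round(r, 2) for system, r in ratios.items()})
```

With these values, the RTX 4090 is about 3.6× faster than the GTX 1080 on DHFR but about 8.6× faster on STMV, so the gain from newer cards grows with system size.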

[1]: https://www.nvidia.com/en-us/geforce/products/10series/geforce-gtx-1080/
[2]: https://www.nvidia.com/en-us/geforce/products/10series/geforce-gtx-1080-ti/
[3]: https://www.nvidia.com/en-us/geforce/graphics-cards/rtx-2080-ti/
[4]: https://www.nvidia.com/en-us/geforce/graphics-cards/30-series/rtx-3090/
[5]: https://www.nvidia.com/en-us/geforce/graphics-cards/40-series/rtx-4090/

The input files for the three benchmark systems are in {download}`acemd_benchmarks.zip <../acemd_benchmarks.zip>`.

## Benchmark systems

| System                                        |     Atoms | Box (Å)                |
|-----------------------------------------------|----------:|------------------------|
| DHFR ([Dihydrofolate reductase][dhfr])        |    23,558 | 62.23 × 62.23 × 62.23  |
| FactorIX ([Factor IX][f9])                    |    90,906 | 142.09 × 83.34 × 78.68 |
| STMV ([Satellite tobacco mosaic virus][stmv]) | 1,067,095 | 221.2 × 223.2 × 224.5  |

[dhfr]: https://en.wikipedia.org/wiki/Dihydrofolate_reductase
[f9]: https://en.wikipedia.org/wiki/Factor_IX
[stmv]: https://en.wikipedia.org/wiki/Satellite_tobacco_mosaic_virus

## Benchmark conditions

- **Force field:** [AMBER ff99SB](http://ambermd.org/AmberModels.php) or [CHARMM 36](https://onlinelibrary.wiley.com/doi/abs/10.1002/jcc.23354) with [TIP3P](https://aip.scitation.org/doi/10.1063/1.445869) water.
- **Electrostatics:** Particle-Mesh Ewald, grid spacing < 1.0 Å, real-space cutoff 9.0 Å.
- **van der Waals:** cutoff 9.0 Å; switching function disabled for AMBER, enabled from 7.5 Å for CHARMM.
- **Constraints:** bonds involving hydrogen constrained, plus rigid water (tolerance 1 × 10⁻⁶).
- **Integrator and thermostat:** 4 fs timestep with hydrogen mass repartitioning (HMR, `hydrogenmass = 4.0` amu); Langevin thermostat at 298.15 K (friction 0.1 ps⁻¹).
- **Output:** trajectory every 100 ps (25,000 steps).
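
The timestep and output period above fix how ns/day translates into integration steps and trajectory frames. The sketch below works through that arithmetic; the function names are illustrative helpers, not ACEMD options.

```python
TIMESTEP_FS = 4.0           # integration timestep from the conditions above
TRAJ_PERIOD_STEPS = 25_000  # trajectory output period from the conditions above
SECONDS_PER_DAY = 86_400


def steps_per_ns(timestep_fs):
    """Number of integration steps in 1 ns (1 ns = 1e6 fs)."""
    return 1e6 / timestep_fs


def frame_interval_ps(period_steps, timestep_fs):
    """Simulated time between trajectory frames, in ps."""
    return period_steps * timestep_fs / 1000.0


def steps_per_second(ns_per_day, timestep_fs):
    """Integration steps per wall-clock second at a given throughput."""
    return ns_per_day * steps_per_ns(timestep_fs) / SECONDS_PER_DAY


print(steps_per_ns(TIMESTEP_FS))                          # 250000.0
print(frame_interval_ps(TRAJ_PERIOD_STEPS, TIMESTEP_FS))  # 100.0
print(round(steps_per_second(1810, TIMESTEP_FS)))         # DHFR, RTX 4090, AMBER
```

At 4 fs per step, 25,000 steps correspond to the stated 100 ps between frames, so each simulated nanosecond writes 10 frames.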

## How to read these numbers

- They are **single-GPU** numbers. ACEMD scales across multiple GPUs in a single host, but the speed-up is typically sublinear because of inter-GPU communication.
- Sustained throughput depends on cooling. Short, bursty benchmarks can run faster, but a long production run settles at the steady-state values above.
- NNP and NNP/MM runs are slower than pure-MM runs at the same atom count. The exact slowdown depends on the size of the NNP-handled subsystem and the model variant.
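
For planning, a table entry converts directly into a wall-clock estimate for a target simulation length. This is a rough sketch: `wall_clock_days` is an illustrative helper, and the `slowdown` factor is a hypothetical placeholder for cases such as NNP runs, whose actual slowdown has to be measured.

```python
def wall_clock_days(target_ns, ns_per_day, slowdown=1.0):
    """Estimate wall-clock days needed to simulate `target_ns` nanoseconds.

    `ns_per_day` is a steady-state throughput from the table above.
    `slowdown` is a hypothetical multiplier (>= 1) for runs slower than
    the pure-MM benchmark; it is not a measured value.
    """
    if ns_per_day <= 0:
        raise ValueError("ns_per_day must be positive")
    if slowdown < 1.0:
        raise ValueError("slowdown must be >= 1")
    return target_ns * slowdown / ns_per_day


# 1 µs (1000 ns) on an RTX 4090 with AMBER, using the table values.
for system, ns_per_day in {"DHFR": 1810, "FactorIX": 777, "STMV": 59.2}.items():
    print(f"{system}: {wall_clock_days(1000, ns_per_day):.2f} days")
```

Because the table values are single-GPU, pure-MM, steady-state numbers, the result is best read as a lower bound on the time a real production run will take.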

## See also

- [Select GPU devices](../how-to/select-gpu-devices.md)
- [Integrator and constraints](integrator-and-constraints.md)