htmd.builder.nonstandard module
End-to-end parameterization pipeline for non-canonical residues under
AMBER, driven by the spec list returned from
moleculekit.tools.nonstandard_residues.detectNonStandardResidues().
parameterizeFromSpecs() is the user-facing entry point. It walks
mol.bonds to recover cluster grouping (residues sharing non-peptide
inter-residue bonds), builds a combined antechamber model compound per
cluster (full residues + ACE/NME-style backbone caps), runs antechamber +
parmchk2 once per cluster, and splits the output into per-residue
CIF / frcmod pairs. Free residues (no cluster bonds) are parameterized
standalone via htmd.builder._ambertools._fftype_antechamber().
The resulting ClusterOutputs carries the topology paths, frcmod
paths, and custombonds list in the shape that
htmd.builder.amber.build() expects.
For canonical residues that the detector renamed (CYS bonded to a
scaffold, ASN glycosylated by a sugar, …), the per-residue CIF carries
ff14SB atom types pulled from the right AMBER residue template (mid-chain
CYX / N-terminal NCYX / C-terminal CCYX and the analogous
forms for LYS/HIS/ASN/…) so that backbone bonds resolve against
ff14SB. Per-atom charges come from the antechamber calculation on the
combined model, except for the backbone atoms of chain-resident residues,
which are pinned to ff14SB by default: the whole backbone from the ff14SB
libraries for canonical residues, and the charge-class amide charges for
NCAAs (see _backbone_charge_map()). The frcmod carries cross-FF
junction terms (bond / angle / dihedral entries spanning a
canonical-residue atom and a non-canonical one), with the canonical-side
atom types rewritten from antechamber’s GAFF2 to ff14SB.
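The cluster-grouping step described above amounts to a connected-components search over residues. The following is a minimal, self-contained sketch of that idea and not the module's implementation: residues are plain hashable keys, `cross_bonds` is a hypothetical list of residue pairs assumed to be already filtered down to non-peptide inter-residue bonds, and `group_clusters` is an illustrative name.

```python
from collections import defaultdict


def group_clusters(residues, cross_bonds):
    """Group residues into clusters with a union-find structure.

    residues    -- iterable of hashable residue keys (hypothetical format)
    cross_bonds -- iterable of (residue_a, residue_b) pairs, assumed to be
                   already filtered to non-peptide inter-residue bonds
    Returns (clusters, free): clusters is a list of sorted residue lists
    with two or more members, free is a sorted list of unbonded residues.
    """
    parent = {r: r for r in residues}

    def find(r):
        # Follow parent links to the root, halving the path as we go.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    for a, b in cross_bonds:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # union the two components

    members = defaultdict(list)
    for r in parent:
        members[find(r)].append(r)

    clusters = sorted(sorted(m) for m in members.values() if len(m) > 1)
    free = sorted(m[0] for m in members.values() if len(m) == 1)
    return clusters, free


# Three cysteines bonded to one scaffold form a single cluster;
# an unrelated non-canonical residue stays free.
residues = [("P", 2), ("P", 7), ("P", 12), ("L", 1), ("P", 20)]
bonds = [(("P", 2), ("L", 1)), (("P", 7), ("L", 1)), (("P", 12), ("L", 1))]
clusters, free = group_clusters(residues, bonds)
```

In this toy input the scaffold and its three anchors end up in one cluster, which would be parameterized as a single model compound, while the remaining residue would take the standalone path.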
- class htmd.builder.nonstandard.ClusterBond(atom_a, atom_b)
Bases: object
One non-peptide covalent bond between two atoms in a cluster. Symmetric (no canonical-side / scaffold-side asymmetry), so it works uniformly for NCAA-NCAA crosslinks, canonical-AA-anchored scaffolds, and everything in between.
- atom_a: UniqueAtomID
- atom_b: UniqueAtomID
- class htmd.builder.nonstandard.ClusterModel(spec, cif_path, atom_map, atom_to_residue, atom_to_orig_name, canonical_renames)
Bases: object
Result of buildClusterModel(). Carries everything prepareClusterResidues() needs to split antechamber’s output back into per-residue topology files.
- spec: ClusterSpec
- class htmd.builder.nonstandard.ClusterOutputs(topo_paths=<factory>, frcmod_paths=<factory>, custombonds=<factory>, xml_paths=<factory>)
Bases: object
Aggregated result of parameterizeFromSpecs() / prepareClusterResidues(). Carries the topology, parameter and custombond inputs that the user feeds back into htmd.builder.amber.build().
- class htmd.builder.nonstandard.ClusterSpec(subtype, residues, is_chain_resident, is_canonical, roles, bonds, canonical_resnames=<factory>, canonical_terminus=<factory>, is_n_term=<factory>, is_c_term=<factory>)
Bases: object
A connected covalent cluster of residues that share non-peptide bonds and need combined parameterization.
residues lists every cluster member; the four parallel lists carry per-residue metadata (chain residency, canonical/non-canonical, role tag, original canonical resname for renamed anchors).
- class htmd.builder.nonstandard.ModelAtom(role, ff_type=None)
Bases: object
Per-atom record in a cluster model compound.
role is one of "residue" (atom is part of a cluster residue) or "cap" (an ACE/NME-style backbone cap atom that is dropped at split time). ff_type is unused in the current pipeline and kept for forward compatibility.
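The role tag is what lets the split step discard cap atoms. As an illustration only, the sketch below uses a hypothetical stand-in dataclass (`ModelAtomSketch`) rather than the real ModelAtom, and a made-up name-to-record mapping; the actual layout of `atom_map` is not documented here and is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelAtomSketch:
    """Hypothetical stand-in mirroring the documented ModelAtom fields."""
    role: str                      # "residue" or "cap"
    ff_type: Optional[str] = None  # unused, kept for forward compatibility


def residue_atoms(atom_map):
    """Return the names of atoms that survive the split (caps removed)."""
    return [name for name, atom in atom_map.items() if atom.role == "residue"]


# Made-up model compound: one cap methyl carbon plus three residue atoms.
atom_map = {
    "CH3": ModelAtomSketch("cap"),
    "N": ModelAtomSketch("residue"),
    "CA": ModelAtomSketch("residue"),
    "SG": ModelAtomSketch("residue"),
}
kept = residue_atoms(atom_map)
```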
- htmd.builder.nonstandard.buildClusterModel(mol, spec, outdir)
Build the combined model compound for a ClusterSpec.
Assembles full cluster residues plus ACE/NME-style backbone caps derived from the live molecule’s chain neighbours, then writes the result as a CIF ready to feed to antechamber.
- Parameters:
  - mol (Molecule) – The molecule containing the cluster residues and their chain neighbours. Must carry covalent bonds.
  - spec (ClusterSpec) – Specification of the cluster to model (residues, chain residency flags, canonical flags, inter-residue bonds).
  - outdir (str) – Output directory where the model CIF will be written.
- Returns:
model – Cluster model compound with atom-name maps needed by
prepareClusterResidues() to split antechamber output back into per-residue topology files.
- Return type:
ClusterModel
- htmd.builder.nonstandard.parameterizeFromSpecs(specs, mol, outdir, forcefield='gaff2', charge_method='am1-bcc', am1_path_length=15, pin_backbone_charges=True, normalize='cluster', use_pyodide=None)
Parameterize every non-canonical residue in specs and return paths plus custombonds ready to feed htmd.builder.amber.build().
The function recovers cluster grouping by walking mol.bonds for non-peptide inter-residue bonds and unioning the touching residues. Per cluster it builds a combined model compound (full residues + ACE/NME-style backbone caps), runs antechamber + parmchk2 once, and splits the output into per-residue CIF / frcmod pairs. Free residues (no cluster bonds) are parameterized standalone.
- Parameters:
  - specs (list) – Per-residue specs from moleculekit.tools.nonstandard_residues.detectNonStandardResidues().
  - mol (Molecule) – The molecule the specs describe. Must already carry covalent bonds (typically the post-systemPrepare molecule).
  - outdir (str) – Output directory for all generated CIF / frcmod / XML files.
  - forcefield (str | dict) – Force field for the non-canonical atoms. A name starting with "gaff" dispatches through antechamber + parmchk2 and emits prepi + frcmod (consumable by amber.build()) plus a combined OpenMM XML; any other string is treated as a SMIRNOFF offxml filename (e.g. "openff_unconstrained-2.3.0.offxml") and dispatches through OpenFF Interchange, emitting only per-cluster OpenMM XML (consumable by openmm.build()). A dict {resname: ff_name, "default": ff_name} lets different residues use different force fields; mixing within a single cluster is not supported (the cluster compound is parameterized as one molecule).
  - charge_method (str) – Charge model for the non-canonical atoms. Orthogonal to forcefield: every model works with both GAFF and SMIRNOFF typing. "gasteiger" is the recommended choice: fast, computed via RDKit, honours the net charge, and is the automatic fallback under Pyodide where AM1-BCC’s SQM backend is unavailable. "am1-bcc" is more expensive but highly accurate. "nagl" uses the OpenFF NAGL graph neural network as an AM1-BCC surrogate (requires PyTorch). "resp" / "resp-multiconf" fit RESP charges to a Psi4-computed QM ESP; requires the private Acellera parameterize package + Psi4. "abcg2" is AM1-BCC v2, only meaningful with GAFF.
  - am1_path_length (int | None) – Maximum path length for AM1-BCC charge equivalence determination, passed to antechamber’s -pl flag. Caps antechamber’s atom-equivalence search so it does not hang on cyclic or large molecules. Only used for charge_method="am1-bcc" / "abcg2"; ignored for Gasteiger. None keeps antechamber’s own default.
  - pin_backbone_charges (bool) – If True, the backbone partial charges of every chain-resident residue are pinned to ff14SB (residue-specific for canonical residues, charge-class fallback for NCAAs). Matches the Robin Betz / R.E.D. / Carlos Ramos tutorial convention. Set False to keep the cluster-computed backbone charges (the Forcefield_PTM / Khoury et al. 2014 convention).
  - normalize (str | None) – How to absorb the small per-residue drift left by slicing one residue out of a jointly-charged cluster and any shift the backbone pin introduces on the cluster total. "cluster" (default): only the cluster total is normalized to integer; per-residue totals are left at their natural (fractional) values. "per_residue": each emitted unit is integer-charged, which is AMBER’s tLeap convention and the safer choice if the same residue might recur in different bonding contexts. None: no rebalance at all.
  - use_pyodide (bool | None) – Force the AmberTools dispatch path. True dispatches via antechamber_pyodide.run; False uses a native subprocess. None (default) auto-detects Pyodide via sys.platform.
- Returns:
out – Aggregated topology / parameter files and custombonds for the whole system. The forcefield choice determines what is populated:
  - GAFF forcefield: out.topo_paths (prepi) + out.frcmod_paths (frcmod, for amber.build()) plus one combined OpenMM XML (gaff_combined.xml) appended to out.xml_paths.
  - SMIRNOFF forcefield: per-cluster XML fragments appended to out.xml_paths only; topo_paths / frcmod_paths stay empty.
  - Mixed: both above contribute. Synthetic atom-type names are globally unique by construction, so the XMLs load together via openmm.app.ForceField(*defaultFf(), *out.xml_paths).
out.custombonds is populated in every case and matches the custombonds= argument of amber.build() / openmm.build(). To tell whether GAFF was involved (and therefore whether amber.build() is also viable), check frcmod_paths: it is non-empty if and only if at least one residue went through the GAFF path.
- Return type:
ClusterOutputs
Examples
Build a scaffolded cyclic peptide (3 cysteines thioether-bonded to a triazinane scaffold LFI):

    from moleculekit.molecule import Molecule
    from moleculekit.tools.nonstandard_residues import detectNonStandardResidues
    from moleculekit.tools.preparation import systemPrepare
    from htmd.builder.nonstandard import parameterizeFromSpecs
    from htmd.builder import amber

    mol = Molecule("8QFZ.pdb")
    mol.filter("chain B")
    mol.segid[:] = "P"
    mol.segid[mol.resname == "LFI"] = "L"

    # 1. Inspect the molecule and decide what needs custom params.
    specs = detectNonStandardResidues(mol)

    # 2. Template each non-canonical residue from a SMILES string.
    mol.templateResidueFromSmiles(
        "resname LFI",
        "C1N(CN(CN1C(=O)CCBr)C(=O)CCBr)C(=O)CCBr",
        addHs=True,
    )

    # 3. Protonate the canonical part and apply the spec renames /
    #    displaced-H drops in one step.
    pmol, _ = systemPrepare(mol, detect_specs=specs)

    # 4. Run antechamber per cluster and split per-residue using
    #    Gasteiger charges (fast and reliable).
    out = parameterizeFromSpecs(
        specs, pmol, outdir="./params", charge_method="gasteiger"
    )

    # 5. Build.
    built = amber.build(
        pmol,
        outdir="./build",
        custombonds=out.custombonds,
        topo=out.topo_paths,
        param=out.frcmod_paths,
    )
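The str-or-dict form of the forcefield argument described in the parameter list can be pictured with the following sketch. It is not the library's code: the helper names are hypothetical, the dict is assumed to always contain a "default" key, and raising ValueError for a mixed cluster is an assumption about how "not supported" might surface.

```python
def resolve_forcefield(forcefield, resname):
    """Pick the force-field name for one residue from a str or dict spec."""
    if isinstance(forcefield, str):
        return forcefield
    if resname in forcefield:
        return forcefield[resname]
    return forcefield["default"]  # assumes a "default" entry is present


def backend_for(ff_name):
    """Names starting with 'gaff' take the antechamber route, others SMIRNOFF."""
    return "gaff" if ff_name.startswith("gaff") else "smirnoff"


def cluster_forcefield(forcefield, cluster_resnames):
    """Resolve one force field for a whole cluster, rejecting mixtures."""
    names = {resolve_forcefield(forcefield, rn) for rn in cluster_resnames}
    if len(names) != 1:
        # Assumed behaviour: a cluster is one molecule, so one force field.
        raise ValueError(f"cluster mixes force fields: {sorted(names)}")
    return names.pop()


ff_spec = {"LFI": "gaff2", "default": "openff_unconstrained-2.3.0.offxml"}
scaffold_backend = backend_for(resolve_forcefield(ff_spec, "LFI"))
```

With this spec, a cluster containing both LFI and a residue that falls back to the default would be rejected, which mirrors the documented restriction that a cluster compound is parameterized as one molecule.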
- htmd.builder.nonstandard.prepareClusterResidues(typed_path, frcmod_path, model, outdir=None, use_pyodide=None, residue_templates=None, parameter_sets=None, pin_backbone_charges=True, normalize='cluster')
Split antechamber output for a cluster model compound into per-residue topology files and emit the matching custombonds list.
For each non-canonical cluster residue the function writes a CIF using antechamber’s GAFF2 types and the cluster calculation’s per-atom charges. For each canonical anchor the CIF uses the appropriate AMBER residue template’s ff14SB atom types (mid-chain CYX / NLN / … or the matching N- or C-terminal variant NCYX / CCYX / … when the residue is at a chain terminus), with per-atom charges from the antechamber calculation on the combined model. For chain-resident residues the backbone charges are pinned to ff14SB by default (see _backbone_charge_map()); set pin_backbone_charges=False to skip the pin and keep the cluster-computed backbone charges. In both cases the charges are then rebalanced as selected by normalize: "per_residue" brings every per-residue file (chain-resident and scaffold) to its integer formal charge, so each emitted unit is integer-charged, whereas the default "cluster" only brings the cluster total to an integer. Each canonical residue’s bucket resname (assigned by detect, e.g. CY1) keeps the residue out of tLeap’s built-in libraries so that the generated topology file loads instead of the standard template.
- Parameters:
  - typed_path (str) – Antechamber-typed mol2 of the cluster model compound.
  - frcmod_path (str) – parmchk2 output for the same model compound.
  - model (ClusterModel) – Cluster model returned by buildClusterModel().
  - outdir (str | None) – Output directory; created if missing. If None, a fresh temporary directory is used.
  - use_pyodide (bool | None) – Force Pyodide dispatch (True), native subprocess (False), or auto-detect (None).
  - residue_templates (list | None) – If provided, every per-residue typed-mol slice this function writes is appended as a _ResidueTemplateData entry for the downstream OpenMM XML emitter.
  - parameter_sets (list | None) – If provided, the cluster’s final AmberParameterSet (post junction-term injection and backbone-rename duplication) is appended for the downstream XML emitter.
  - pin_backbone_charges (bool) – If True, pin backbone partial charges of chain-resident residues to ff14SB values before writing topology files. Set False to keep cluster-computed backbone charges.
  - normalize (str | None) – Charge normalization mode applied after backbone pinning. "cluster" distributes any residual across the whole cluster; "per_residue" normalizes each residue independently to its integer formal charge; None leaves charges as-is.
- Returns:
out – Per-residue topology paths, frcmod paths, and custombonds for this cluster.
- Return type:
ClusterOutputs
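The difference between the three normalize modes can be made concrete with a toy calculation. This is a sketch under stated assumptions and not the function's actual algorithm: the residual is spread uniformly over all atoms, the cluster target is taken to be the sum of the residues' formal charges, pinned backbone atoms are not treated specially, and `normalize_charges` and the charge values are invented for illustration.

```python
def normalize_charges(charges, formal, mode="cluster"):
    """Rebalance partial charges so totals match integer formal charges.

    charges -- {resname: [partial charge per atom]}
    formal  -- {resname: integer formal charge}
    mode    -- "cluster", "per_residue" or None
    The residual is spread uniformly over atoms, an assumption of this sketch.
    """
    if mode is None:
        # No rebalance: return an unchanged copy.
        return {res: list(q) for res, q in charges.items()}
    if mode == "per_residue":
        out = {}
        for res, q in charges.items():
            shift = (formal[res] - sum(q)) / len(q)
            out[res] = [x + shift for x in q]
        return out
    if mode == "cluster":
        n_atoms = sum(len(q) for q in charges.values())
        target = sum(formal.values())
        total = sum(sum(q) for q in charges.values())
        shift = (target - total) / n_atoms
        return {res: [x + shift for x in q] for res, q in charges.items()}
    raise ValueError(f"unknown normalize mode: {mode!r}")


# Invented charges for a two-residue cluster with a small residual drift.
charges = {"CY1": [-0.40, 0.30, 0.16], "LFI": [0.10, -0.20, 0.10, -0.02]}
formal = {"CY1": 0, "LFI": 0}
by_cluster = normalize_charges(charges, formal, "cluster")
by_residue = normalize_charges(charges, formal, "per_residue")
```

In "cluster" mode the seven charges sum to zero while each residue keeps a fractional total; in "per_residue" mode each residue sums to its own formal charge, so the unit stays integer-charged if it is reused in a different bonding context.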