htmd.builder.nonstandard module
End-to-end parameterization pipeline for non-canonical residues under
AMBER, driven by the spec list returned from
moleculekit.tools.nonstandard_residues.detectNonStandardResidues().
parameterizeFromSpecs() is the user-facing entry point. It walks
mol.bonds to recover cluster grouping (residues sharing non-peptide
inter-residue bonds), builds a combined antechamber model compound per
cluster (full residues + ACE/NME-style backbone caps), runs antechamber +
parmchk2 once per cluster, and splits the output into per-residue
CIF / frcmod pairs. Free residues (no cluster bonds) are parameterized
standalone via htmd.builder._ambertools._fftype_antechamber().
The result ClusterOutputs carries the topology paths, frcmod
paths, and custombonds list in the shape that
htmd.builder.amber.build() expects.
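The cluster-recovery step described above can be pictured as a union-find pass over inter-residue bonds. The following is a minimal sketch, not the module's actual code: the `group_clusters` helper, the `(res_a, res_b, is_peptide)` bond tuples, and the residue labels are hypothetical stand-ins for what would be read from `mol.bonds`.

```python
from collections import defaultdict


def group_clusters(residues, bonds):
    """Union residues joined by non-peptide inter-residue bonds.

    residues: iterable of hashable residue labels (hypothetical).
    bonds: iterable of (res_a, res_b, is_peptide) tuples (assumed shape).
    Returns (clusters, free): clusters is a list of sorted residue lists,
    free is a sorted list of residues that take part in no cluster bond.
    """
    parent = {r: r for r in residues}

    def find(r):
        # Find the set representative, halving the path as we go.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    touched = set()
    for res_a, res_b, is_peptide in bonds:
        if is_peptide or res_a == res_b:
            continue  # backbone and intra-residue bonds do not form clusters
        parent[find(res_a)] = find(res_b)
        touched.update((res_a, res_b))

    groups = defaultdict(list)
    for r in touched:
        groups[find(r)].append(r)

    clusters = sorted(sorted(members) for members in groups.values())
    free = sorted(r for r in parent if r not in touched)
    return clusters, free


residues = ["CYS2", "CYS5", "CYS9", "LFI20", "NLE12"]
bonds = [
    ("CYS2", "GLY3", True),    # peptide bond: skipped
    ("CYS2", "LFI20", False),  # thioether to the scaffold
    ("CYS5", "LFI20", False),
    ("CYS9", "LFI20", False),
]
clusters, free = group_clusters(residues, bonds)
print(clusters)  # one cluster: three cysteines plus the scaffold
print(free)      # residues that would be parameterized standalone
```

In this toy input the three cysteines and the scaffold end up in one cluster, while `NLE12` has no cluster bonds and would take the standalone path.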
For canonical residues that the detector renamed (CYS bonded to a
scaffold, ASN glycosylated by a sugar, …), the per-residue CIF carries
ff14SB atom types pulled from the right AMBER residue template (mid-chain
CYX / N-terminal NCYX / C-terminal CCYX and the analogous
forms for LYS/HIS/ASN/…) so that backbone bonds resolve against
ff14SB. Per-atom charges come from the antechamber compute on the
combined model, except the backbone atoms of chain-resident residues,
which are pinned to ff14SB: the whole backbone from the ff14SB libraries
for canonical residues, the charge-class amide charges for NCAAs
(see _backbone_charge_map()). The frcmod carries cross-FF
junction terms (bond / angle / dihedral entries spanning a
canonical-residue atom and a non-canonical
one) with the canonical-side atom types rewritten from antechamber’s
GAFF2 to ff14SB.
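The junction-term rewrite can be sketched as follows. This is an illustration under stated assumptions, not the module's implementation: `rewrite_junction_terms`, the atom labels, and the specific type assignments (`2C` / `S` on the canonical side, `c3` / `ss` from GAFF2) are hypothetical example data chosen to show the idea.

```python
def rewrite_junction_terms(terms, gaff_types, canonical_types):
    """Retype the canonical-side atoms of terms that span both force fields.

    terms: list of atom-label tuples (2 = bond, 3 = angle, 4 = dihedral).
    gaff_types: atom label -> GAFF2 type for every atom in the model.
    canonical_types: atom label -> ff14SB type, canonical-residue atoms only.
    Returns the type tuples of the junction terms only.
    """
    out = []
    for term in terms:
        n_canonical = sum(label in canonical_types for label in term)
        if n_canonical == 0 or n_canonical == len(term):
            continue  # term lies entirely inside one force field
        # Canonical atoms take their ff14SB type, the rest keep GAFF2.
        out.append(tuple(canonical_types.get(a, gaff_types[a]) for a in term))
    return out


# Hypothetical labels: a cysteine side chain bonded to a scaffold carbon.
gaff_types = {"CYS:CB": "c3", "CYS:SG": "ss", "LFI:C7": "c3", "LFI:C8": "c3"}
canonical_types = {"CYS:CB": "2C", "CYS:SG": "S"}  # illustrative values
terms = [
    ("CYS:CB", "CYS:SG"),                      # canonical only: skipped
    ("CYS:SG", "LFI:C7"),                      # junction bond
    ("CYS:CB", "CYS:SG", "LFI:C7"),            # junction angle
    ("LFI:C7", "LFI:C8"),                      # non-canonical only: skipped
    ("CYS:CB", "CYS:SG", "LFI:C7", "LFI:C8"),  # junction dihedral
]
junction = rewrite_junction_terms(terms, gaff_types, canonical_types)
print(junction)
```

Only terms that mix canonical and non-canonical atoms survive the filter, which matches the description of the frcmod carrying cross-FF junction terms.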
- class htmd.builder.nonstandard.ClusterBond(atom_a, atom_b)
Bases: object
One non-peptide covalent bond between two atoms in a cluster. Symmetric (no canonical-side / scaffold-side asymmetry), so it works uniformly for NCAA-NCAA crosslinks, canonical-AA-anchored scaffolds, and everything in between.
- atom_a: UniqueAtomID
- atom_b: UniqueAtomID
- class htmd.builder.nonstandard.ClusterModel(spec, cif_path, atom_map, atom_to_residue, atom_to_orig_name, canonical_renames)
Bases: object
Result of buildClusterModel(). Carries everything prepareClusterResidues() needs to split antechamber’s output back into per-residue topology files.
- atom_map: dict
- atom_to_orig_name: dict
- atom_to_residue: dict
- canonical_renames: dict
- cif_path: str
- spec: ClusterSpec
- class htmd.builder.nonstandard.ClusterOutputs(topo_paths=<factory>, frcmod_paths=<factory>, custombonds=<factory>, xml_paths=<factory>)
Bases: object
Aggregated result of parameterizeFromSpecs() / prepareClusterResidues(). Carries the topology, parameter and custombond inputs that the user feeds back into htmd.builder.amber.build().
- custombonds: list
- frcmod_paths: list
- topo_paths: list
- xml_paths: list
- class htmd.builder.nonstandard.ClusterSpec(subtype, residues, is_chain_resident, is_canonical, roles, bonds, canonical_resnames=<factory>, canonical_terminus=<factory>, is_n_term=<factory>, is_c_term=<factory>)
Bases: object
A connected covalent cluster of residues that share non-peptide bonds and need combined parameterization. residues lists every cluster member; the four parallel lists carry per-residue metadata (chain residency, canonical/non-canonical, role tag, original canonical resname for renamed anchors).
- bonds: list
- canonical_resnames: list
- canonical_terminus: list
- is_c_term: list
- is_canonical: list
- is_chain_resident: list
- is_n_term: list
- residues: list
- roles: list
- subtype: str
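To make the parallel-list layout concrete, here is a minimal sketch that zips the lists into one record per cluster member. `SpecSketch` is a hypothetical stand-in that mirrors only a few of the documented field names; the role tags (`"anchor"`, `"scaffold"`), the use of `None` for non-canonical entries, and the requirement that every list has one entry per residue are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SpecSketch:
    """Hypothetical stand-in mirroring a few documented ClusterSpec fields."""
    residues: list
    is_chain_resident: list
    is_canonical: list
    roles: list
    canonical_resnames: list


def per_residue_rows(spec):
    """Zip the parallel lists into one dict per cluster member."""
    columns = {
        "is_chain_resident": spec.is_chain_resident,
        "is_canonical": spec.is_canonical,
        "role": spec.roles,
        "canonical_resname": spec.canonical_resnames,
    }
    n = len(spec.residues)
    for name, values in columns.items():
        if len(values) != n:
            raise ValueError(f"{name} has {len(values)} entries, expected {n}")
    return [
        {"residue": res, **{name: vals[i] for name, vals in columns.items()}}
        for i, res in enumerate(spec.residues)
    ]


spec = SpecSketch(
    residues=["CY1", "LFI"],
    is_chain_resident=[True, False],
    is_canonical=[True, False],
    roles=["anchor", "scaffold"],      # hypothetical role tags
    canonical_resnames=["CYS", None],  # None for non-canonical (assumed)
)
rows = per_residue_rows(spec)
for row in rows:
    print(row)
```

Index `i` in each list refers to `residues[i]`, so any reordering has to be applied to all lists together.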
- class htmd.builder.nonstandard.ModelAtom(role, ff_type=None)
Bases: object
Per-atom record in a cluster model compound. role is one of "residue" (atom is part of a cluster residue) or "cap" (an ACE/NME-style backbone cap atom that is dropped at split time). ff_type is unused in the current pipeline and kept for forward compatibility.
- ff_type: str | None = None
- role: str
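The role tag is what lets the split step discard cap atoms. A minimal sketch of that filter follows; `ModelAtomSketch` is a stand-in with the documented `(role, ff_type=None)` shape, while the `split_by_residue` helper and the assumption that `atom_map` and `atom_to_residue` are both keyed by model atom name are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelAtomSketch:
    """Stand-in with the documented (role, ff_type=None) shape."""
    role: str
    ff_type: Optional[str] = None


def split_by_residue(atom_map, atom_to_residue):
    """Group residue atoms by owning residue and drop cap atoms.

    atom_map: model atom name -> ModelAtomSketch (assumed keying).
    atom_to_residue: model atom name -> residue label (assumed keying).
    """
    per_residue = defaultdict(list)
    for name, atom in atom_map.items():
        if atom.role == "cap":
            continue  # ACE/NME-style cap atoms exist only in the model compound
        per_residue[atom_to_residue[name]].append(name)
    return dict(per_residue)


atom_map = {
    "C1": ModelAtomSketch("cap"),      # cap carbon
    "N2": ModelAtomSketch("residue"),
    "S3": ModelAtomSketch("residue"),
    "C4": ModelAtomSketch("residue"),
    "N5": ModelAtomSketch("cap"),      # cap nitrogen
}
atom_to_residue = {"N2": "CY1", "S3": "CY1", "C4": "LFI"}
per_residue = split_by_residue(atom_map, atom_to_residue)
print(per_residue)
```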
- htmd.builder.nonstandard.buildClusterModel(mol, spec, outdir)
Build the combined model compound for a ClusterSpec: full residues + ACE/NME-style backbone caps derived from the live mol’s chain neighbours, written as a CIF ready to feed to antechamber.
- htmd.builder.nonstandard.parameterizeFromSpecs(specs, mol, outdir, forcefield='gaff2', charge_method='am1-bcc', am1_path_length=15, pin_backbone_charges=True, normalize='cluster', use_pyodide=None)
Parameterize every non-canonical residue in specs and return paths plus custombonds ready to feed htmd.builder.amber.build().
The function recovers cluster grouping by walking mol.bonds for non-peptide inter-residue bonds and unioning the touching residues. Per cluster it builds a combined model compound (full residues + ACE/NME-style backbone caps), runs antechamber + parmchk2 once, and splits the output into per-residue CIF / frcmod pairs. Free residues (no cluster bonds) are parameterized standalone.
- Parameters:
  - specs (list) – Per-residue specs from moleculekit.tools.nonstandard_residues.detectNonStandardResidues().
  - mol (moleculekit.molecule.Molecule) – The molecule the specs describe. Must already carry covalent bonds (typically the post-systemPrepare molecule).
  - outdir (str) – Output directory for all generated CIF / frcmod / XML files.
  - forcefield (str or dict, optional) – Force field for the non-canonical atoms. Default "gaff2". A name starting with "gaff" dispatches through antechamber + parmchk2 and emits prepi + frcmod (consumable by amber.build()) plus a combined OpenMM XML; any other string is treated as a SMIRNOFF offxml filename (e.g. "openff_unconstrained-2.3.0.offxml") and dispatches through OpenFF Interchange, emitting only per-cluster OpenMM XML (consumable by openmm.build()). A dict {resname: ff_name, "default": ff_name} lets different residues use different force fields; mixing within a single cluster is not supported (the cluster compound is parameterised as one molecule).
  - charge_method (str, optional) – Charge model for the non-canonical atoms. Orthogonal to forcefield: every model works with both GAFF and SMIRNOFF typing (the externally-fit methods pre-compute charges, then the engine only types).
    - "am1-bcc" (default) is the most accurate and honours the net charge.
    - "gasteiger" is faster, computed via RDKit so it also honours the net charge, and is the automatic fallback under Pyodide where AM1-BCC’s SQM backend is unavailable.
    - "nagl" uses the OpenFF NAGL graph neural network as an AM1-BCC surrogate, much faster on medium-to-large molecules. Requires PyTorch.
    - "resp" / "resp-multiconf" fit RESP charges to a Psi4-computed QM ESP. Most accurate option but requires the private Acellera parameterize package + Psi4. resp-multiconf averages over up to 10 conformers (free ligands only; cluster path downgrades to single-conformer RESP since RDKit’s ETKDG isn’t appropriate for clusters with ACE/NME caps).
    - "abcg2" is AM1-BCC v2, only meaningful with GAFF.
  - am1_path_length (int or None, optional) – Maximum path length for AM1-BCC charge equivalence determination, passed to antechamber’s -pl flag. Caps antechamber’s atom-equivalence search so it doesn’t hang on cyclic or large molecules. Only used for charge_method="am1-bcc" / "abcg2"; ignored for Gasteiger. None keeps antechamber’s own default.
  - pin_backbone_charges (bool, optional) – If True (default), the backbone partial charges of every chain-resident residue are pinned to ff14SB (residue-specific for canonical residues, charge-class fallback for NCAAs). Matches the Robin Betz / R.E.D. / Carlos Ramos tutorial convention. Set False to keep the cluster-computed backbone charges (the Forcefield_PTM / Khoury et al. 2014 convention, which argues backbone freezing can hurt fit quality).
  - normalize ({"cluster", "per_residue", None}, optional) – How to absorb the small per-residue drift left by slicing one residue out of a jointly-charged cluster (RDKit Gasteiger PEOE or antechamber AM1-BCC) and any shift the backbone pin introduces on the cluster total.
    - "cluster" (default): only the cluster total is normalised to integer; per-residue totals are left at their natural (fractional) values, preserving the per-atom charges the charge method computed.
    - "per_residue": each emitted unit is integer-charged. This is AMBER’s tLeap convention, used by Betz / R.E.D. / Ramos, and the safer choice if the same residue might recur in different bonding contexts.
    - None: no rebalance at all (charges are exactly what the charge method produced, modulo the backbone pin).
  - use_pyodide (bool or None, optional) – Force the AmberTools dispatch path (True -> dispatch via antechamber_pyodide.run; False -> native subprocess). None (default) auto-detects Pyodide via sys.platform.
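The three normalize modes can be contrasted with a small numeric sketch. This is not the library's code: the `rebalance` helper is hypothetical, the target total is assumed to be the sum of the residues' integer formal charges, and spreading the correction uniformly over atoms is an assumption made for simplicity.

```python
def rebalance(charges, formal, mode="cluster"):
    """Illustrate the three normalize modes on toy charges.

    charges: residue label -> list of per-atom partial charges.
    formal: residue label -> integer formal charge (assumed known).
    The residual is spread uniformly over atoms (an assumption).
    """
    if mode is None:
        # No rebalance: return the charges unchanged.
        return {res: list(q) for res, q in charges.items()}
    if mode == "per_residue":
        out = {}
        for res, q in charges.items():
            shift = (formal[res] - sum(q)) / len(q)
            out[res] = [x + shift for x in q]
        return out
    if mode == "cluster":
        n_atoms = sum(len(q) for q in charges.values())
        total = sum(sum(q) for q in charges.values())
        target = sum(formal[res] for res in charges)
        shift = (target - total) / n_atoms
        return {res: [x + shift for x in q] for res, q in charges.items()}
    raise ValueError(f"unknown normalize mode: {mode!r}")


# Toy slices of a jointly charged cluster; both residues are formally neutral.
charges = {"CY1": [-0.30, 0.10, 0.25], "LFI": [0.02, -0.05, 0.01, -0.01]}
formal = {"CY1": 0, "LFI": 0}

by_cluster = rebalance(charges, formal, "cluster")
by_residue = rebalance(charges, formal, "per_residue")
untouched = rebalance(charges, formal, None)

for label, result in [("cluster", by_cluster), ("per_residue", by_residue)]:
    print(label, {res: round(sum(q), 6) for res, q in result.items()})
```

With "cluster" the two residue totals stay fractional but cancel across the cluster; with "per_residue" each residue total is driven to its own integer.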
- Returns:
Aggregated topology / parameter files and custombonds for the whole system. The forcefield choice determines what is populated:
  - GAFF forcefield: out.topo_paths (prepi) + out.frcmod_paths (frcmod, for amber.build()) plus one combined OpenMM XML (gaff_combined.xml) appended to out.xml_paths.
  - SMIRNOFF forcefield: per-cluster XML fragments appended to out.xml_paths only; topo_paths / frcmod_paths stay empty.
  - Mixed: both above contribute. Synthetic atom-type names are globally unique by construction, so the XMLs load together via openmm.app.ForceField(*defaultFf(), *out.xml_paths).
out.custombonds is populated in every case and matches the custombonds= argument of amber.build() / openmm.build(). To tell whether GAFF was involved (and therefore whether amber.build() is also viable), check frcmod_paths: it is non-empty iff at least one residue went through the GAFF path.
- Return type:
Examples
Build a scaffolded cyclic peptide (3 cysteines thioether-bonded to a triazinane scaffold LFI):

```python
from moleculekit.molecule import Molecule
from moleculekit.tools.nonstandard_residues import detectNonStandardResidues
from moleculekit.tools.preparation import systemPrepare
from htmd.builder.nonstandard import parameterizeFromSpecs
from htmd.builder import amber

mol = Molecule("8QFZ.pdb")
mol.filter("chain B")
mol.segid[:] = "P"
mol.segid[mol.resname == "LFI"] = "L"

# 1. Inspect the molecule and decide what needs custom params.
specs = detectNonStandardResidues(mol)

# 2. Template each non-canonical residue from a SMILES string.
mol.templateResidueFromSmiles(
    "resname LFI",
    "C1N(CN(CN1C(=O)CCBr)C(=O)CCBr)C(=O)CCBr",
    addHs=True,
)

# 3. Protonate the canonical part and apply the spec renames /
#    displaced-H drops in one step.
pmol, _ = systemPrepare(mol, detect_specs=specs)

# 4. Run antechamber per cluster and split per-residue.
out = parameterizeFromSpecs(specs, pmol, outdir="./params")

# 5. Build.
built = amber.build(
    pmol,
    outdir="./build",
    custombonds=out.custombonds,
    topo=out.topo_paths,
    param=out.frcmod_paths,
)
```
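Before the build step, the returned lists indicate which builder can consume the output, following the rule stated under Returns. The sketch below uses `SimpleNamespace` objects as stand-ins for a ClusterOutputs instance; the `viable_builders` helper and all file paths other than the `gaff_combined.xml` name are hypothetical.

```python
from types import SimpleNamespace


def viable_builders(out):
    """Report which builders the populated output lists can feed.

    frcmod_paths is non-empty only if some residue went through the GAFF
    path, so it gates amber.build(); xml_paths gates openmm.build().
    """
    builders = []
    if out.frcmod_paths:
        builders.append("amber")
    if out.xml_paths:
        builders.append("openmm")
    return builders


# Stand-ins for ClusterOutputs; the paths are made up for illustration.
gaff_out = SimpleNamespace(
    topo_paths=["params/LFI.prepi"],
    frcmod_paths=["params/LFI.frcmod"],
    xml_paths=["params/gaff_combined.xml"],
    custombonds=[],
)
smirnoff_out = SimpleNamespace(
    topo_paths=[],
    frcmod_paths=[],
    xml_paths=["params/cluster_0.xml"],
    custombonds=[],
)

print(viable_builders(gaff_out))
print(viable_builders(smirnoff_out))
```

A GAFF run can feed either builder, whereas a SMIRNOFF-only run leaves `frcmod_paths` empty and is limited to the OpenMM route.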
- htmd.builder.nonstandard.prepareClusterResidues(typed_path, frcmod_path, model, outdir=None, use_pyodide=None, residue_templates=None, parameter_sets=None, pin_backbone_charges=True, normalize='cluster')
Split antechamber output for a cluster model compound into per-residue topology files and emit the matching custombonds list.
For each non-canonical cluster residue the function writes a CIF using antechamber’s GAFF2 types and the cluster compute’s per-atom charges. For each canonical anchor the CIF uses the appropriate AMBER residue template’s ff14SB atom types (mid-chain CYX / NLN / … or the matching N- or C-terminal variant NCYX / CCYX / … when the residue is at a chain terminus), with per-atom charges from the antechamber compute on the combined model. For chain-resident residues the backbone charges are pinned to ff14SB by default and the residue rebalanced to its integer formal charge (see _backbone_charge_map()); set pin_backbone_charges=False to skip the pin and keep the cluster-computed backbone charges. In both cases every per-residue file (chain-resident and scaffold) is rebalanced to its integer formal charge, so each emitted unit is integer-charged. Each canonical residue’s bucket resname (assigned by detect, e.g. CY1) keeps the residue out of tLeap’s built-in libraries so our prepi loads instead of the standard template.
- Parameters:
  - typed_path (str) – Antechamber-typed mol2 of the cluster model compound.
  - frcmod_path (str) – parmchk2 output for the same model compound.
  - model (ClusterModel) – Cluster model returned by buildClusterModel.
  - outdir (str or None) – Output directory; created if missing. If None, a fresh tempdir is used.
  - residue_templates (list or None) – If provided, every per-residue typed-mol slice this function writes is appended as a _ResidueTemplateData for the downstream OpenMM XML emitter.
  - parameter_sets (list or None) – If provided, the cluster’s final AmberParameterSet (post junction-term injection and backbone-rename duplication, pre clean-up) is appended for the downstream XML emitter.
- Return type: