moleculekit.tools.nonstandard_residues module

Discovery helper for non-standard residues that need user-driven AMBER parameterization before building.

A “non-standard residue” is any residue whose resname is not in moleculekit’s canonical amino-acid / nucleic / water / ion sets. detectNonStandardResidues() inspects the molecule (without mutating it) and returns one spec per non-standard residue, plus one per canonical residue covalently bonded to a non-canonical one:

  • NCAASpec - chain-resident NCAA with no sidechain crosslink (e.g. selenomethionine, norleucine, a D-amino acid).

  • CrosslinkedNCAASpec - chain-resident NCAA whose sidechain is covalently bonded to one or more other residues (a stapled-peptide residue, a glycosylated NCAA).

  • ScaffoldSpec - free non-canonical residue with two or more non-peptide bonds to other residues (the central scaffold of a bicyclic peptide, a multi-anchor covalent inhibitor).

  • CovalentLigandSpec - free non-canonical residue with exactly one non-peptide bond to another residue (single-anchor covalent inhibitor, NAG-Asn glycan stem, single-Cys heme).

  • LigandSpec - free non-canonical residue with no covalent bonds (small-molecule binding-pocket ligand, fatty acid).

  • CanonicalRenamedSpec - one per canonical amino-acid residue that the detector identified as bonded to a non-canonical residue. The spec proposes a custom 3-char resname (e.g. CY1 for the CYS-SG-bonded-to-LFI bucket) and lists the displaced sidechain hydrogen names (["HG"] for a CYS-SG anchor, ["HD22"] for an ASN-ND2 anchor).
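
The five non-canonical categories above reduce to two questions: is the residue peptide-bonded into a chain, and how many non-peptide bonds does it make to other residues? The sketch below encodes that decision table as a standalone function. The name classify_nonstandard and its two arguments are hypothetical and exist only for illustration; this is not moleculekit's implementation, which works on the molecule's bond graph.

```python
def classify_nonstandard(peptide_bonded: bool, n_nonpeptide_bonds: int) -> str:
    """Return the spec class name for one non-canonical residue.

    Hypothetical helper: mirrors the category definitions in the text,
    not the library's actual detection code.
    """
    if n_nonpeptide_bonds < 0:
        raise ValueError("bond count cannot be negative")
    if peptide_bonded:
        # Chain-resident residue: only the sidechain crosslink matters.
        return "CrosslinkedNCAASpec" if n_nonpeptide_bonds >= 1 else "NCAASpec"
    # Free residue: classify by the number of covalent anchors.
    if n_nonpeptide_bonds == 0:
        return "LigandSpec"
    if n_nonpeptide_bonds == 1:
        return "CovalentLigandSpec"
    return "ScaffoldSpec"
```

For example, a selenomethionine in the middle of a chain corresponds to classify_nonstandard(True, 0), and a NAG attached to a single ASN corresponds to classify_nonstandard(False, 1).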

Pass the spec list to moleculekit.tools.preparation.systemPrepare() via detect_specs=specs to apply the proposed renames and hydrogen drops to the prepared molecule.

class moleculekit.tools.nonstandard_residues.CanonicalRenamedSpec(residue: UniqueResidueID, new_resname: str, drop_h: list[str], is_n_term: bool = False, is_c_term: bool = False)

Bases: object

A canonical amino-acid residue that the detector identified as covalently bonded to a non-canonical residue. residue.resname is the original canonical resname ("CYS", "ASN", …) and new_resname carries the proposed custom 3-char resname that parameterization will use. drop_h lists the displaced sidechain hydrogen atom names to remove when the rename is applied (e.g. ["HG"] for a CYS-SG anchor, ["HD22"] for an ASN-ND2 anchor); empty when no displacement is expected.

is_n_term / is_c_term flag whether this residue sits at a chain terminus; terminal forms need different parameters than mid-chain ones, so they end up in their own bucket.

drop_h: list[str]
is_c_term: bool = False
is_n_term: bool = False
new_resname: str
residue: UniqueResidueID
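
To make the effect of new_resname and drop_h concrete, the sketch below applies a proposed rename and hydrogen drop to a toy residue represented as (atom name, resname) tuples. RenameProposal and apply_rename are hypothetical stand-ins invented for this illustration; in practice the rename is applied by systemPrepare() on a real Molecule, and its internals may differ.

```python
from dataclasses import dataclass, field


@dataclass
class RenameProposal:
    """Hypothetical stand-in carrying only the two fields used here."""
    new_resname: str
    drop_h: list[str] = field(default_factory=list)


def apply_rename(atoms, proposal):
    """Return a new atom list for one residue with the rename and H-drop applied.

    atoms is a list of (atom_name, resname) tuples; the input is not modified.
    """
    drop = set(proposal.drop_h)
    return [(name, proposal.new_resname) for name, _ in atoms if name not in drop]


# Toy CYS residue whose SG atom anchors a covalent partner.
cys = [("N", "CYS"), ("CA", "CYS"), ("CB", "CYS"), ("SG", "CYS"), ("HG", "CYS")]
renamed = apply_rename(cys, RenameProposal(new_resname="CY1", drop_h=["HG"]))
```

After the call, every remaining atom carries the proposed resname and the displaced HG hydrogen is gone, which is the state the parameterizer expects for a CYS-SG anchor.
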
class moleculekit.tools.nonstandard_residues.CovalentLigandSpec(resname: str, residue: UniqueResidueID)

Bases: object

A non-canonical residue that is not peptide-bonded into a chain and has exactly one non-peptide bond going out to another residue. Examples: a single-anchor covalent inhibitor, a NAG-Asn glycan stem, a single-Cys heme.

residue: UniqueResidueID
resname: str
class moleculekit.tools.nonstandard_residues.CrosslinkedNCAASpec(resname: str, residue: UniqueResidueID, is_n_term: bool, is_c_term: bool)

Bases: object

A non-canonical amino acid embedded in a polypeptide chain (peptide-bonded into the backbone) that also has one or more non-peptide sidechain bonds to other residues - a stapled-peptide residue, an NCAA whose sidechain is glycosylated, etc. The parameterizer combines this residue with its crosslink partners in a single antechamber run.

is_c_term: bool
is_n_term: bool
residue: UniqueResidueID
resname: str
class moleculekit.tools.nonstandard_residues.LigandSpec(resname: str, residue: UniqueResidueID)

Bases: object

A non-canonical residue with no covalent bonds to any other residue (a free, non-covalently bound ligand). Examples: small-molecule drug ligands in binding pockets, fatty acids, lipid head-groups. The parameterizer treats it standalone with no caps.

residue: UniqueResidueID
resname: str
class moleculekit.tools.nonstandard_residues.NCAASpec(resname: str, residue: UniqueResidueID, is_n_term: bool, is_c_term: bool)

Bases: object

A non-canonical amino acid embedded in a polypeptide chain via standard peptide (N-C) bonds to canonical amino acids, with no other inter-residue covalent bonds. Examples: selenomethionine (MSE), norleucine (NLE), D-amino acids, backbone-modified residues. The parameterizer treats it as a free residue with ACE/NME caps.

is_c_term: bool
is_n_term: bool
residue: UniqueResidueID
resname: str
class moleculekit.tools.nonstandard_residues.ScaffoldSpec(resname: str, residue: UniqueResidueID)

Bases: object

A non-canonical residue that is not peptide-bonded into a chain and has two or more non-peptide bonds going out to other residues. Examples: the central scaffold of a bicyclic / tricyclic peptide, a multi-anchor covalent inhibitor.

residue: UniqueResidueID
resname: str
moleculekit.tools.nonstandard_residues.detectNonStandardResidues(mol)

Walk mol and return one spec per non-standard residue.

For each canonical amino-acid residue whose sidechain is covalently bonded to a non-canonical residue, the detector emits a CanonicalRenamedSpec that proposes a custom 3-char resname (e.g. CY1 for the CYS-SG-bonded-to-LFI bucket, NL1 for ASN-ND2-bonded-to-NAG, …) plus the displaced sidechain hydrogen names looked up via moleculekit.tools._anchor_variants.lookup_anchor_variant(). The proposed rename and the H-drop are applied later by moleculekit.tools.preparation.systemPrepare() (passed via detect_specs=specs).

Residues that share the same (canonical_resname, anchor_atom_name, partner_resname, n_term, c_term) key are assigned the same resname so the parameterizer emits one prepi shared across them (e.g. all mid-chain ASN-ND2-bonded-to-NAG residues become NL1). Chain-terminal residues end up in their own bucket because terminal forms genuinely have different atoms (OXT on C-terminal carboxylate, extra H1/H2/H3 on N-terminal amine) and different charges. Residues whose anchor has no entry in the anchor-variants table (e.g. SER OG) use their original resname’s first two chars as the prefix.
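
The bucketing rule above can be sketched as a small dictionary-based grouping. Everything here is an assumption made for illustration: ANCHOR_PREFIX is a hypothetical two-entry excerpt chosen to match the CY1 and NL1 examples, and numbering buckets per prefix starting at 1 is a guess at the scheme, not something the documentation specifies.

```python
# Hypothetical excerpt of an anchor-variants table:
# (canonical resname, anchor atom) -> two-character resname prefix.
ANCHOR_PREFIX = {("CYS", "SG"): "CY", ("ASN", "ND2"): "NL"}


def assign_bucket_resnames(anchor_keys):
    """Return one proposed resname per key; identical keys share a name.

    Each key is (canonical_resname, anchor_atom_name, partner_resname,
    n_term, c_term). The per-prefix counter is an assumed numbering scheme
    and would exceed three characters past index 9.
    """
    names = {}     # bucket key -> proposed resname
    counters = {}  # prefix -> last index handed out
    proposed = []
    for key in anchor_keys:
        if key not in names:
            resname, atom = key[0], key[1]
            # Fall back to the first two chars of the original resname.
            prefix = ANCHOR_PREFIX.get((resname, atom), resname[:2])
            counters[prefix] = counters.get(prefix, 0) + 1
            names[key] = f"{prefix}{counters[prefix]}"
        proposed.append(names[key])
    return proposed


keys = [
    ("ASN", "ND2", "NAG", False, False),  # mid-chain
    ("ASN", "ND2", "NAG", False, False),  # same bucket as above
    ("ASN", "ND2", "NAG", False, True),   # C-terminal: separate bucket
    ("SER", "OG", "LIG", False, False),   # no table entry: "SE" prefix
]
resnames = assign_bucket_resnames(keys)
```

The two mid-chain ASN residues share one name while the C-terminal one gets its own, which is what lets the parameterizer emit a single prepi per bucket.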

Parameters:

mol (moleculekit.molecule.Molecule) – Input molecule. Should already carry covalent bonds (read from a PDB CONECT block or a CIF _struct_conn block, or set up via mol.templateResidueFromSmiles); if mol.bonds is empty, the detector falls back to distance-based bond guessing via mol._guessBonds().

Returns:

A flat list mixing NCAASpec, CrosslinkedNCAASpec, ScaffoldSpec, CovalentLigandSpec, LigandSpec, and CanonicalRenamedSpec entries.

Return type:

list[PerResidueSpec]
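
Because the returned list mixes six spec types, callers will usually want to split it by type before acting on it. The helper below is a minimal sketch of one way to do that; group_specs_by_type is a hypothetical name, and the two empty classes are stand-ins so the example runs without moleculekit installed.

```python
from collections import defaultdict


def group_specs_by_type(specs):
    """Group a flat spec list into {class name: [specs, ...]}, keeping order."""
    groups = defaultdict(list)
    for spec in specs:
        groups[type(spec).__name__].append(spec)
    return dict(groups)


# Stand-in classes (not the real moleculekit ones) for demonstration only.
class LigandSpec:
    pass


class CanonicalRenamedSpec:
    pass


specs = [LigandSpec(), CanonicalRenamedSpec(), LigandSpec()]
groups = group_specs_by_type(specs)
counts = {name: len(items) for name, items in groups.items()}
```

With moleculekit available, the same helper would take the output of detectNonStandardResidues(mol) directly, and the unmodified list can still be passed to systemPrepare() via detect_specs=specs.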