moleculekit.tools.preparation module#
- exception moleculekit.tools.preparation.MissingTopologyError#
Bases:
Exception
- moleculekit.tools.preparation.systemPrepare(mol_in, titration=True, pH=7.4, force_protonation=None, no_opt=None, no_prot=None, no_titr=None, hold_nonpeptidic_bonds=True, verbose=True, return_details=False, hydrophobic_thickness=None, plot_pka=None, _logger_level='ERROR', titrate=None, detect_specs=None, restore_missing_sidechains=False)#
Prepare a molecular system by adding hydrogens, assigning protonation states, and optimizing the H-bond network.
Wraps PDB2PQR + PROPKA. Protein and nucleic residues are protonated and optimized; non-protein, non-nucleic residues contribute to the pKa calculation but are not themselves protonated or optimized here. The input molecule is not mutated — a new
Moleculeis returned.The returned molecule uses the following protonation-aware resnames:
ASH
Neutral ASP
CYX
SS-bonded CYS
CYM
Negative CYS
GLH
Neutral GLU
HIP
Positive HIS
HID
Neutral HIS, proton HD1 present
HIE
Neutral HIS, proton HE2 present
LYN
Neutral LYS
TYM
Negative TYR
AR0
Neutral ARG
Charge +1
Neutral
Charge -1
ASH
ASP
CYS
CYM
GLH
GLU
HIP
HID/HIE
LYS
LYN
TYR
TYM
ARG
AR0
Notes
What this function does:
Assigns protonation states via PROPKA.
Flips Asn/Gln/His sidechains to optimize the H-bond network.
Debumps clashes introduced by added hydrogens.
Adds missing heavy atoms and hydrogens via PDB2PQR’s force-field templates.
Detects non-standard residues and sidechain crosslinks (disulfides, metal-coordinating Cys/His/Tyr, isopeptides, glycosylations, …) and renames the affected residues to canonical variants (
CYX,XX#buckets) so PDB2PQR’s templates apply and the bonds are preserved. Seedetect_specsbelow.
If
hydrophobic_thicknessis set to a positive value2*h, a warning is produced for titratable residues with-h < z < hthat are buried by less than 75%. The heuristic is crude (it assumes the protein is aligned with the membrane centered atz=0) and the “buried fraction” estimate (from PROPKA) is approximate, so cavity- facing residues may appear solvent-exposed regardless ofz.- Parameters:
mol_in (
Molecule) – Input molecule. Not mutated — an internal copy is taken on the first frame only.titration (bool) – If True, run PROPKA to assign titration states. If False, just add and optimize hydrogens at the input resnames’ default protonation.
pH (float) – Solution pH used by PROPKA to pick titration states. Default 7.4.
force_protonation (list of tuple[str, str], optional) – Force specific protonation states on individual residues. Each entry is
(atomselection, resname), e.g.[("protein and resid 40", "HID")]forces residue 40 to HID. Atom selections use VMD syntax.no_opt (list of str, optional) – Atom selections of residues to exclude from H-bond / sidechain flip optimization. Use this when an optimizer flip would degrade a known-good structure (e.g. a residue in a metal site).
no_prot (list of str, optional) – Atom selections of residues to exclude from hydrogen addition.
no_titr (list of str, optional) – Atom selections of residues to exclude from titration.
hold_nonpeptidic_bonds (bool) – When True (default), protein residues that are covalently bonded to non-protein partners are automatically added to
no_opt,no_protandno_titrso their bonds aren’t broken and their boundary atoms aren’t over-protonated. Set to False to let PDB2PQR/PROPKA process them ignoring the non-peptide bond.verbose (bool) – If False, demote this module’s logger to WARNING for the call.
return_details (bool) – If True, additionally return a pandas DataFrame with per-residue protonation / pKa info.
hydrophobic_thickness (float, optional) – Membrane thickness
2*hin Angstrom (None for globular proteins). Triggers a warning for titratable residues that fall inside the bilayer and are not well-buried (see Notes).plot_pka (str, optional) – Path to a
.pngfile. When set, writes a titration diagram for the system’s titratable residues.titrate (list[str], optional) – Restrict titration to a subset of the canonical AAs in the table above (e.g.
["HIS", "ARG"]). All other AAs in the table are added tono_titr. When None (default), every AA in the table is titratable.detect_specs (list[PerResidueSpec], optional) – Per-residue specs from
moleculekit.tools.nonstandard_residues.detectNonStandardResidues(). When left asNone(the default),systemPreparerunsdetectNonStandardResidues()itself so canonical residues at non-peptide junctions are renamed (e.g. CYS-Cys disulfide ->CYX, a TYR coordinating a heme Fe -> aTY#bucket) and their sidechains re-templated before PDB2PQR runs. Pass an explicit list to bypass the auto-detection; passdetect_specs=[]to skip non-standard residue handling entirely. The applied specs are always returned so they can be forwarded to the downstream parameterizer / builder.restore_missing_sidechains (bool, optional) – If True, sidechain atoms removed by PDB2PQR are templated back in after preparation. Specifically, canonical protein residues whose entire heavy-atom sidechain is absent (only backbone + CB remain) are reconstructed using the Dunbrack rotamer library before PDB2PQR runs, so PDB2PQR’s 10 %-missing-atom threshold does not reject the structure on a partial input. Default False.
- Returns:
mol_out (moleculekit.molecule.Molecule) – The protonated and optimized molecule.
detect_specs (list[PerResidueSpec]) – The non-standard-residue specs that were applied (the caller’s
detect_specsargument if supplied, otherwise the auto-detected list). Empty whendetect_specs=[]was passed in.details (pandas.DataFrame) – Per-residue protonation states, pKas and buried fractions. Returned only when
return_details=True.
- Raises:
ImportError – If
pdb2pqris not installed.
Examples
>>> tryp = Molecule('3PTB') >>> tryp_op, specs, df = systemPrepare(tryp, return_details=True) >>> tryp_op.write('/tmp/3PTB_prepared.pdb') >>> df.to_excel("/tmp/tryp-report.csv") >>> df resname protonation resid insertion chain segid pKa buried 0 ILE ILE 16 A 0 7.413075 0.839286 1 VAL VAL 17 A 0 NaN NaN 2 GLY GLY 18 A 0 NaN NaN 3 GLY GLY 19 A 0 NaN NaN 4 TYR TYR 20 A 0 9.590845 0.146429 .. ... ... ... ... ... ... ... ... 282 HOH WAT 804 A 1 NaN NaN 283 HOH WAT 805 A 1 NaN NaN 284 HOH WAT 807 A 1 NaN NaN 285 HOH WAT 808 A 1 NaN NaN 286 HOH WAT 809 A 1 NaN NaN
[287 rows x 8 columns]
Acidic pH:
>>> tryp_op, specs = systemPrepare(tryp, pH=1.0) >>> tryp_op.write('/tmp/3PTB_pH1.pdb')
Freeze residues 36 and 49 in place:
>>> tryp_op, specs = systemPrepare(tryp, no_opt=["protein and resid 36", "chain A and resid 49"])
Disable protonation on residue 32:
>>> tryp_op, specs = systemPrepare(tryp, no_prot=["protein and resid 32",])
Disable titration and protonation on residue 32:
>>> tryp_op, specs = systemPrepare(tryp, no_titr=["protein and resid 32",], no_prot=["protein and resid 32",])
Force residue 40 to HIE and 57 to HIP:
>>> tryp_op, specs = systemPrepare(tryp, force_protonation=[("protein and resid 40", "HIE"), ("protein and resid 57", "HIP")])
Skip non-standard residue detection entirely (legacy behavior):
>>> tryp_op, specs = systemPrepare(tryp, detect_specs=[])
Forward
specsto a downstream builder for non-canonical residues (NCAAs, sidechain crosslinks, metal-coordinating residues):>>> # from htmd.builder.nonstandard import parameterizeFromSpecs >>> # parameterizeFromSpecs(specs, tryp_op, outdir="./params")