moleculekit.tools.preparation module#

moleculekit.tools.preparation.proteinPrepare(mol_in, pH=7.0, verbose=0, returnDetails=False, hydrophobicThickness=None, holdSelection=None, _loggerLevel=None)#
moleculekit.tools.preparation.systemPrepare(mol_in, titration=True, pH=7.4, force_protonation=None, no_opt=None, no_prot=None, no_titr=None, hold_nonpeptidic_bonds=True, verbose=True, return_details=False, hydrophobic_thickness=None, plot_pka=None, ignore_ns_errors=False, _logger_level='ERROR', _molkit_ff=True, outdir=None, residue_smiles=None)#

Prepare molecular systems through protonation and h-bond optimization.

The preparation routine protonates and optimizes protein and nucleic residues. It will also take into account any non-protein, non-nucleic molecules for the pKa calculation but will not attempt to protonate or optimize those.

Returns a Molecule object, where residues have been renamed to follow internal conventions on protonation (below). Coordinates are changed to optimize the H-bonding network.

The following residue names are used in the returned molecule:

ASH

Neutral ASP

CYX

SS-bonded CYS

CYM

Negative CYS

GLH

Neutral GLU

HIP

Positive HIS

HID

Neutral HIS, proton HD1 present

HIE

Neutral HIS, proton HE2 present

LYN

Neutral LYS

TYM

Negative TYR

AR0

Neutral ARG

Charge +1

Neutral

Charge -1

ASH

ASP

CYS

CYM

GLH

GLU

HIP

HID/HIE

LYS

LYN

TYR

TYM

ARG

AR0

A detailed table about the residues modified is returned (as a second return value) when return_details is True .

If hydrophobic_thickness is set to a positive value 2*h, a warning is produced for titratable residues having -h<z<h and are buried in the protein by less than 75%. Note that the heuristic for the detection of membrane-exposed residues is very crude; the “buried fraction” computation (from propKa) is approximate; also, in the presence of cavities, residues may be solvent-exposed independently from their z location.

Notes

Features:
  • assigns protonation states via propKa

  • flips residues to optimize H-bonding network

  • debumps collisions

  • fills in missing atoms, e.g. hydrogen atoms

Parameters:
  • mol_in (moleculekit.molecule.Molecule) – the object to be optimized

  • titration (bool) – If set to True it will use propka to set the titration state of residues. Otherwise it will just add and optimize the hydrogens.

  • pH (float) – pH to decide titration

  • verbose (bool) – verbosity

  • return_details (bool) – whether to return just the prepared Molecule (False, default) or a molecule and a pandas DataFrame object including computed properties

  • hydrophobic_thickness (float) – the thickness of the membrane in which the protein is embedded, or None if globular protein. Used to provide a warning about membrane-exposed residues.

  • force_protonation (list of tuples) – Allows the user to force specific protonations on residues. This can be done by providing a list of tuples, one for each residue we want to force. i.e. [(“protein and resid 40”, “HID”)] will force the protonation of the first atomselection to the second resname. Atomselections should be valid VMD atomselections.

  • no_opt (list of str) – Allows the user to disable optimization for specific residues. For example if the user determines that a residue flip or change of coordinates performed by this method causes issues in the structure, they can disable optimization on that residue by passing an atomselection for the residue to hold. i.e. [“protein and resid 23”].

  • no_prot (list of str) – Same as no_opt but disables the addition of hydrogens to specific residues.

  • no_titr (list of str) – Same as no_opt but disables the titration of specific residues.

  • hold_nonpeptidic_bonds (bool) – When set to True, systemPrepare will automatically not optimize, protonate or titrate protein residues which are covalently bound to non-protein molecules. When set to False, systemPrepare will optimize them ignoring the covalent bond, meaning it may break the bonds or add hydrogen atoms between the bonds.

  • plot_pka (str) – Provide a file path with .png extension to draw the titration diagram for the system residues.

  • ignore_ns_errors (bool) – If False systemPrepare will issue an error when it fails to protonate non-canonical residues in the protein. If True it will ignore errors on non-canonical residues leaving them unprotonated.

  • outdir (str) – A path where to save custom residue cif files used for building

  • residue_smiles (dict) – A dictionary with keys being residue names and values being the SMILES string of the residue. This is used to create protonated versions of non-canonical residues with the help of the aceprep library.

Returns:

  • mol_out (moleculekit.molecule.Molecule) – the molecule titrated and optimized. The molecule object contains an additional attribute,

  • details (pandas.DataFrame) – A table of residues with the corresponding protonation states, pKas, and other information

Examples

>>> tryp = Molecule('3PTB')
>>> tryp_op, df = systemPrepare(tryp, return_details=True)
>>> tryp_op.write('/tmp/3PTB_prepared.pdb')
>>> df.to_excel("/tmp/tryp-report.csv")
>>> df                                                        
resname protonation  resid insertion chain segid       pKa    buried
0       ILE         ILE     16               A     0  7.413075  0.839286
1       VAL         VAL     17               A     0       NaN       NaN
2       GLY         GLY     18               A     0       NaN       NaN
3       GLY         GLY     19               A     0       NaN       NaN
4       TYR         TYR     20               A     0  9.590845  0.146429
..      ...         ...    ...       ...   ...   ...       ...       ...
282     HOH         WAT    804               A     1       NaN       NaN
283     HOH         WAT    805               A     1       NaN       NaN
284     HOH         WAT    807               A     1       NaN       NaN
285     HOH         WAT    808               A     1       NaN       NaN
286     HOH         WAT    809               A     1       NaN       NaN

[287 rows x 8 columns]

>>> tryp_op = systemPrepare(tryp, pH=1.0)
>>> tryp_op.write('/tmp/3PTB_pH1.pdb')

The following will force the preparation to freeze residues 36 and 49 in place >>> tryp_op = systemPrepare(tryp, no_opt=[“protein and resid 36”, “chain A and resid 49”])

The following will disable protonation on residue 32 of the protein >>> tryp_op = systemPrepare(tryp, no_prot=[“protein and resid 32”,])

The following will disable titration and protonation on residue 32 >>> tryp_op = systemPrepare(tryp, no_titr=[“protein and resid 32”,], no_prot=[“protein and resid 32”,])

The following will force residue 40 protonation to HIE and 57 to HIP >>> tryp_op = systemPrepare(tryp, force_protonation=[(“protein and resid 40”, “HIE”), (“protein and resid 57”, “HIP”)])