--- jupytext: text_representation: extension: .md format_name: myst format_version: 0.13 kernelspec: display_name: Python 3 language: python name: python3 --- # Basic protonation **You will learn:** how to add hydrogens to a protein at a chosen pH, inspect the resulting protonation-state table, and write the prepared system to disk. **Prerequisites:** - The [First molecule](../01-first-molecule.md) tutorial. - PDB2PQR and PROPKA installed (they ship as moleculekit dependencies). ## Setup ```{code-cell} python import pandas as pd from moleculekit.molecule import Molecule from moleculekit.tools.preparation import systemPrepare mol = Molecule("3PTB") ``` 3PTB is bovine trypsin with a benzamidine ligand in the active site. ## Step 1 — Run systemPrepare at pH 7.4 ```{code-cell} python pmol, specs, details = systemPrepare(mol, pH=7.4, return_details=True) ``` The call returns a 3-tuple: `pmol`, `specs`, `details`. `pmol` is a **new** {py:class}`~moleculekit.molecule.Molecule` — the input `mol` is not mutated. `specs` is the list of detected non-standard-residue specs that the call applied (same type as returned by {py:func}`~moleculekit.tools.nonstandard_residues.detectNonStandardResidues`); pass it back to a later `systemPrepare` call if you need to repeat the run, or inspect it to audit which residues were renamed. `details` is a `pandas.DataFrame` with one row per titratable residue; columns include `resname`, `resid`, `chain`, `segid`, `pKa`, `protonation`, and `buried`. The function adds hydrogens, runs PROPKA to predict pKa values, and titrates each titratable residue accordingly. ## Step 2 — Inspect protonation states ```{code-cell} python details[["resname", "resid", "protonation", "pKa"]].head(10) ``` Each row shows the assigned protonation form for one residue. Histidines appear as `HID`, `HIE`, or `HIP` depending on tautomer or charge; aspartates as `ASP` (deprotonated) or `ASH` (protonated); glutamates as `GLU` or `GLH`; cysteines as `CYS` or `CYM`; lysines as `LYS` or `LYN`. Residues whose predicted pKa falls within 2 units of the target pH are the most sensitive to the pH choice — flipping them would change their protonation state if pH moved a unit or two: ```{code-cell} python import numpy as np pkas = pd.to_numeric(details["pKa"], errors="coerce") near_pka = details[np.abs(pkas - 7.4) < 2.0] near_pka[["resname", "resid", "chain", "protonation", "pKa"]] ``` These residues would flip protonation state if pH moved a unit or two from 7.4. ## Step 3 — Skip titration entirely ```{code-cell} python pmol_no_titr, _ = systemPrepare(mol, titration=False) ``` `titration=False` skips PROPKA. Hydrogens are still added by PDB2PQR, but every titratable residue gets the standard protonation form at default pH with no per-residue prediction. Use this when you already know the protonation states you want, or when you will set them yourself via `force_protonation`. ## Step 4 — Write the prepared structure ```{code-cell} python pmol.write("trypsin_prepared.cif") ``` mmCIF is the recommended output format here: it preserves the bonds, bond orders, and formal charges that `systemPrepare` just established. Reload with {py:class}`~moleculekit.molecule.Molecule`("trypsin_prepared.cif") to round-trip. ## Recap - {py:func}`~moleculekit.tools.preparation.systemPrepare` adds hydrogens and assigns protonation states to `mol` at the chosen `pH` in one call. - `return_details=True` gives you a per-titratable-residue table (a `pandas.DataFrame`) for inspection. - `titration=False` skips PROPKA when you do not need pKa prediction. ## Next - [Non-standard residues](02-non-standard-residues.md) - [System-preparation pipeline](../../explanation/system-preparation-pipeline.md)