Build a stapled peptide

You will learn: how to build a peptide in which two non-canonical amino acids (NCAAs) are joined by a chemical crosslink (a hydrocarbon staple), using an NF-Y-derived stapled peptide from PDB 8QU4 as the example.

Prerequisites:

Note

The workflow below follows Build a protein with a ligand step for step; the only difference is that you pass two SMILES strings to templateResidueFromSmiles() (one per NCAA). detectNonStandardResidues() reads the staple bond from the input structure’s connectivity on its own, and parameterizeFromSpecs() carries it through to the build without any extra wiring.

What the staple is

PDB 8QU4 chain A is a 13-mer designed peptide containing two NCAAs:

  • NLE (norleucine) at one end of the staple.

  • MK8 (an α-methyl-norleucine variant) at the other end.

The staple is a single CE-CE bond between the two NCAAs - the closure product of a ring-closing metathesis between two olefinic side chains. There is no canonical anchor between them: the crosslink joins two NCAAs directly.

For the build flow, this means detectNonStandardResidues() returns two ChainResidueSpec entries (one per NCAA) with anchor_atom set to the staple atom (CE), and parameterizeFromSpecs() emits exactly one entry in custombonds for the staple closure.

Note

This tutorial skips solvation and ionisation so the build runs in seconds and the focus stays on the staple. For a production run, either solvate first with solvate() (and keep ionize=True on the build) or build with implicit solvent by passing gbsa=True to build().

Setup

from moleculekit.molecule import Molecule
from moleculekit.tools.nonstandard_residues import detectNonStandardResidues
from moleculekit.tools.preparation import systemPrepare
from htmd.builder import amber
from htmd.builder.nonstandard import parameterizeFromSpecs
Copyright by Acellera Ltd. By executing you are accepting the License. In order to register, run htmd_register on your terminal.
The registration information must be valid so that it might be verified.
rdkit - INFO - Enabling RDKit 2026.03.3 jupyter extensions

Step 1 - Load and segment

mol = Molecule("8QU4")
mol.filter("chain A")
mol.segid[:] = "P"
moleculekit.molecule - WARNING - Alternative atom locations detected. Only altloc A was kept. If you prefer to keep all use the keepaltloc="all" option when reading the file.
moleculekit.molecule - INFO - Removed 80 atoms. 1568 atoms remaining in the molecule.
moleculekit.molecule - INFO - Removed 1440 atoms. 128 atoms remaining in the molecule.

Figure: the orange tube marks the NLE.CE - MK8.CE staple bond, the lone covalent crosslink that turns the linear 13-mer into a stapled macrocycle.

Step 2 - Detect

specs = detectNonStandardResidues(mol)
for spec in specs:
    print(spec)
ChainResidueSpec(resname='NLE', residue=<moleculekit.molecule.UniqueResidueID object at 0x7fd5439223c0>
UniqueResidueID<resname: 'NLE', chain: 'A', resid: 272, insertion: '', segid: 'P'>, new_resname=None, anchor_atom='CE', is_n_term=False, is_c_term=False)
ChainResidueSpec(resname='MK8', residue=<moleculekit.molecule.UniqueResidueID object at 0x7fd543922750>
UniqueResidueID<resname: 'MK8', chain: 'A', resid: 276, insertion: '', segid: 'P'>, new_resname=None, anchor_atom='CE', is_n_term=False, is_c_term=False)

Each NLE / MK8 spec should carry anchor_atom="CE", marking the side-chain carbon that participates in the staple.
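The check described above can be automated before moving on. The sketch below is a minimal, hypothetical helper (check_anchor_atoms is not part of moleculekit) that relies only on the resname and anchor_atom attributes visible in the printed specs. SimpleNamespace objects stand in for the real ChainResidueSpec entries so the snippet runs on its own; it assumes the real objects expose attributes under the same names as their printed fields.

```python
from types import SimpleNamespace


def check_anchor_atoms(specs, expected):
    """Return (resname, found, wanted) for every spec whose anchor differs.

    `specs` is any iterable of objects exposing `resname` and `anchor_atom`;
    `expected` maps residue name -> expected anchor atom name.
    """
    problems = []
    for spec in specs:
        wanted = expected.get(spec.resname)
        if wanted is not None and spec.anchor_atom != wanted:
            problems.append((spec.resname, spec.anchor_atom, wanted))
    return problems


# Stand-ins mirroring the fields printed above (assumption: the real specs
# expose the same attribute names as their repr suggests).
demo_specs = [
    SimpleNamespace(resname="NLE", anchor_atom="CE"),
    SimpleNamespace(resname="MK8", anchor_atom="CE"),
]

mismatches = check_anchor_atoms(demo_specs, {"NLE": "CE", "MK8": "CE"})
print("anchor mismatches:", mismatches)
```

In a real session you would pass the specs list returned by detectNonStandardResidues() instead of demo_specs; an empty result means both staple ends were detected as expected.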

Step 3 - Template both NCAAs

NLE_SMILES = "CCCC[C@@H](C(=O)O)N"
MK8_SMILES = "CCCC[C@](C)(C(=O)O)N"

mol.templateResidueFromSmiles("resname NLE", NLE_SMILES, addHs=True)
mol.templateResidueFromSmiles("resname MK8", MK8_SMILES, addHs=True)
moleculekit.rdkittools - INFO - Stripped unmatched terminal heavy atoms from SMILES template (e.g. leaving group displaced by a covalent link, or carboxyl -OH on a non-terminal amino acid). Modified SMILES: 'CCCC[C@H](N)C=O'
moleculekit.rdkittools - INFO - Stripped unmatched terminal heavy atoms from SMILES template (e.g. leaving group displaced by a covalent link, or carboxyl -OH on a non-terminal amino acid). Modified SMILES: 'CCCC[C@@](C)(N)C=O'

These are the free amino-acid SMILES for the two residues, written in the neutral-amine, free-acid form (NLE and MK8 have no ionisable groups beyond the backbone). For ionisable residues, encode the protonation state at your target pH explicitly. templateResidueFromSmiles() strips the terminal -OH automatically when the residue sits inside a peptide chain, as the log messages above show.

Step 4 - Prepare

prepared, specs = systemPrepare(mol, detect_specs=specs)
---- Molecule chain report ----
Chain A:
    First residue:  ACE   269  
    Final residue:  HOH   304  
---- End of chain report ----
moleculekit.tools.preparation - INFO - Found 1 covalent bonds from protein to non-protein molecules.
moleculekit.tools.preparation - INFO - Freezing protein residue ARG:A:282 bonded to non-protein molecule NH2:A:283
moleculekit.tools.preparation - INFO - Skipping titration of residue ARG:A:282
moleculekit.tools.preparation - WARNING - The following residues have not been optimized: NH2, ACE
moleculekit.tools.preparation - INFO - Modified residue HIS   273 A to HID

PDB2PQR protonates the canonical residues of the 13-mer peptide as normal; the bond-capture mechanism is what preserves the inter-NCAA staple bond across the prep. Passing detect_specs=specs lets you reuse the list we already computed (avoiding a duplicate detect call), edit it before prep if needed, and thread the same list into parameterizeFromSpecs. The spec list is returned unchanged; rebinding it keeps the data flow visually obvious.

Step 5 - Parameterise

out = parameterizeFromSpecs(
    specs,
    prepared,
    outdir="./params",
    charge_method="gasteiger",
)
print(out)
ClusterOutputs(topo_paths=['./params/cluster_000/NLE.prepi', './params/cluster_000/MK8.prepi'], frcmod_paths=['./params/cluster_000/NLE.frcmod', './params/cluster_000/MK8.frcmod'], custombonds=[('segid "P" and chain "A" and resid 276 and name "CE"', 'segid "P" and chain "A" and resid 272 and name "CE"')], xml_paths=['./params/gaff_combined.xml'])

You’ll see one NLE.prepi and one MK8.prepi in topo_paths (plus matching .frcmod files in frcmod_paths) and exactly one entry in custombonds - the NLE.CE - MK8.CE staple.
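To confirm programmatically which atoms the single custombonds entry joins, the selection strings can be parsed. This is a rough sketch: parse_endpoint is a hypothetical helper, and the regular expression assumes the selection format shown in the output above, which may differ between versions.

```python
import re

# Copied from the `custombonds` field printed above.
custombonds = [
    (
        'segid "P" and chain "A" and resid 276 and name "CE"',
        'segid "P" and chain "A" and resid 272 and name "CE"',
    )
]

# Assumption: each selection contains `resid <int> and name "<atom>"`.
_SEL = re.compile(r'resid (?P<resid>-?\d+) and name "(?P<name>[^"]+)"')


def parse_endpoint(selection):
    """Extract (resid, atom name) from one custom-bond selection string."""
    match = _SEL.search(selection)
    if match is None:
        raise ValueError(f"unrecognised selection: {selection!r}")
    return int(match.group("resid")), match.group("name")


# Sort each pair so the result does not depend on the stored order.
endpoints = [tuple(sorted(parse_endpoint(s) for s in pair)) for pair in custombonds]
print(endpoints)
```

For this peptide the list holds one pair, residue 272 CE and residue 276 CE, matching the NLE and MK8 specs from Step 2.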

Step 6 - Build

amber.build(
    prepared,
    outdir="./build",
    custombonds=out.custombonds,
    topo=out.topo_paths,
    param=out.frcmod_paths,
    caps={"P": ("none", "none")},
    ionize=False,
)
htmd.builder.builder - WARNING - Segments ['P'] contain both protein and non-protein atoms. Please assign separate segments to them or the build procedure might fail.
htmd.builder.amber - INFO - Detecting disulfide bonds.
htmd.builder.amber - INFO - Starting the build.
htmd.builder.amber - INFO - Finished building.
moleculekit.tools.sequencestructuralalignment - INFO - Alignment #0 was done on 13 residues: 2-14
<moleculekit.molecule.Molecule object at 0x7fd53e5fbef0>
Molecule with 280 atoms and 1 frames
Atom field - altloc shape: (280,)
Atom field - atomtype shape: (280,)
Atom field - beta shape: (280,)
Atom field - chain shape: (280,)
Atom field - charge shape: (280,)
Atom field - coords shape: (280, 3, 1)
Atom field - element shape: (280,)
Atom field - formalcharge shape: (280,)
Atom field - insertion shape: (280,)
Atom field - masses shape: (280,)
Atom field - name shape: (280,)
Atom field - occupancy shape: (280,)
Atom field - record shape: (280,)
Atom field - resid shape: (280,)
Atom field - resname shape: (280,)
Atom field - segid shape: (280,)
Atom field - serial shape: (280,)
Atom field - virtualsite shape: (280,)
angles shape: (487, 3)
bonds shape: (281, 2)
bondtype shape: (281,)
box shape: (3, 1)
boxangles shape: (3, 1)
crystalinfo: None
dihedrals shape: (1031, 4)
fileloc shape: (1, 2)
impropers shape: (52, 4)
reps: 
step shape: (1,)
time shape: (1,)
topoloc: /tmp/tmpajflns19/build/structure.prmtop
viewname: structure.prmtop

Two non-default arguments here:

  • caps={"P": ("none", "none")} disables automatic ACE / NME capping on this segment - the stapled peptide already carries its own designed termini (ACE and NH2 in the input), and adding caps on top would create overvalent atoms.

  • ionize=False skips ion placement because we haven’t solvated.
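When a system contains several peptide segments with explicit termini, the caps mapping can be generated instead of typed by hand. A minimal sketch: no_caps is a hypothetical helper, not an HTMD function, and the ("none", "none") tuple is simply the value used in the build call above.

```python
def no_caps(segids):
    """Build a caps mapping that disables both terminal caps per segment.

    Duplicate segment IDs are collapsed while preserving first-seen order.
    """
    return {str(segid): ("none", "none") for segid in dict.fromkeys(segids)}


caps = no_caps(["P"])
print(caps)
```

The resulting dictionary can be passed as the caps argument in place of the literal shown in the build call.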

After the build, verify the staple is in the topology:

import numpy as np
built = Molecule("./build/structure.prmtop")
built.read("./build/structure.pdb")

nle_ce = np.where((built.resname == "NLE") & (built.name == "CE"))[0][0]
mk8_ce = np.where((built.resname == "MK8") & (built.name == "CE"))[0][0]
print("staple bond present:", built.hasBond(nle_ce, mk8_ce)[0])
staple bond present: True
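As a cross-check that does not depend on hasBond(), the bond list itself can be searched. The sketch below uses a toy list of index pairs in place of built.bonds (which the build output reports as a two-column array); bond_in_list is a hypothetical helper, and the indices are illustrative only.

```python
def bond_in_list(bonds, i, j):
    """Return True if atoms i and j are bonded, regardless of stored order."""
    target = {int(i), int(j)}
    return any({int(a), int(b)} == target for a, b in bonds)


# Toy bond list standing in for `built.bonds`; indices are illustrative only.
toy_bonds = [(0, 1), (1, 2), (5, 2)]
print(bond_in_list(toy_bonds, 2, 5))  # pair stored as (5, 2), still found
print(bond_in_list(toy_bonds, 0, 2))  # no such bond
```

With the built system you would call bond_in_list(built.bonds, nle_ce, mk8_ce), which should agree with the hasBond() result.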

Gotchas

  • Disable capping (caps={"<segid>": ("none", "none")}) for designed peptides whose termini are explicit in the input. Default capping assumes a free protein N/C terminus and adds ACE / NME caps - which clash with hand-crafted termini.

  • Other staple chemistries (lactam, thioether, click triazole) use the same flow - detect sees the inter-NCAA bond from the input file’s connectivity, regardless of which atoms anchor the staple.

See also