Build a cyclic peptide

You will learn: how to build a head-to-tail cyclic peptide made up mostly of non-canonical residues, exemplified by cyclosporin (PDB 4TOT chain E).

Prerequisites:

HTMD installed.
You’ve worked through Build a protein with a ligand - this tutorial builds on the same five-step flow.

Note

The workflow below is identical to Build a protein with a ligand - the only change is the SMILES dictionary you pass to templateResidueFromSmiles(). detectNonStandardResidues() finds the non-canonical residues; the ring-closing peptide bond is added separately by amber.build’s cyclic-segment detector, which spots head-to-tail N-C distances under 1.35 Å in the input geometry and emits the closing bond directive itself.

What makes cyclic peptides interesting

Cyclosporin A is a head-to-tail cyclic 11-residue peptide. Almost every residue is N-methylated or otherwise modified: in the chain used here, nine of the eleven residues are non-canonical amino acids (NCAAs). The first and last residues are covalently joined to close the ring, so there is no free N- or C-terminus.

For the build flow, the practical implication is that detectNonStandardResidues() returns one ChainResidueSpec per NCAA and nothing more. The ring-closing peptide bond is not in out.custombonds. Instead, build() runs its own cyclic-segment detector at build time, sees the short head-to-tail N-C distance in the input coordinates, lifts the cyclic segment into its own tLeap unit, and writes an explicit bond cyc_X.<first>.N cyc_X.<last>.C directive to close the ring. You don’t have to wire the cyclisation by hand, but it is the builder, not the detect step, that closes the loop.
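The distance test described above can be sketched in a few lines. This is an illustration only, not the code inside amber.build: the function names, the toy coordinates, and the cyc_P0 unit name are hypothetical, and the 1.35 Å cutoff is simply the value quoted in the note earlier on this page.

```python
import math


def is_head_to_tail_cyclic(first_n, last_c, cutoff=1.35):
    """True if the first residue's N and the last residue's C are within bonding distance.

    cutoff is in Å and is the value quoted in this tutorial (an assumption here;
    the builder's real threshold may differ).
    """
    return math.dist(first_n, last_c) < cutoff


def closing_bond_directive(unit, first_index, last_index):
    """Format a tLeap-style bond line in the 'bond cyc_X.<first>.N cyc_X.<last>.C' form."""
    return f"bond {unit}.{first_index}.N {unit}.{last_index}.C"


# Toy coordinates (hypothetical), 1.33 Å apart: a typical peptide-bond length.
first_n = (0.0, 0.0, 0.0)
last_c = (1.33, 0.0, 0.0)

cyclic = is_head_to_tail_cyclic(first_n, last_c)
directive = closing_bond_directive("cyc_P0", 1, 11) if cyclic else None
print(cyclic, directive)
```

A practical consequence of a distance-based test is that the input geometry has to carry the closed ring already: if the terminal N and C sit far apart in the coordinates, a check like this one would treat the segment as linear.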

Note

This tutorial skips solvation and ionisation so the build runs in seconds and the focus stays on the cyclisation. For a production run, solvate first with solvate() (and keep ionize=True on the build). For implicit-solvent dynamics downstream, pass gbsa=True to build(). That sets GB-compatible radii on the prmtop; the GB model itself is enabled by the MD engine at run time.

Setup

from moleculekit.molecule import Molecule
from moleculekit.tools.autosegment import autoSegment
from moleculekit.tools.nonstandard_residues import detectNonStandardResidues
from moleculekit.tools.preparation import systemPrepare
from htmd.builder import amber
from htmd.builder.nonstandard import parameterizeFromSpecs

Step 1 - Load and segment

mol = Molecule("4TOT")
mol.filter("chain E")          # one of the cyclosporin copies in the crystal
mol = autoSegment(mol, fields=("segid", "chain"))

moleculekit.molecule - WARNING - Alternative atom locations detected. Only altloc A was kept. If you prefer to keep all use the keepaltloc="all" option when reading the file.
moleculekit.molecule - INFO - Removed 32 atoms. 6312 atoms remaining in the molecule.
moleculekit.molecule - INFO - Removed 6085 atoms. 227 atoms remaining in the molecule.

Step 2 - Detect

specs = detectNonStandardResidues(mol)
for spec in specs:
    print(spec)

ChainResidueSpec(resname='DAL', residue=<moleculekit.molecule.UniqueResidueID object at 0x7f7f8b6d28a0>
UniqueResidueID<resname: 'DAL', chain: 'A', resid: 1, insertion: '', segid: 'P0'>, new_resname=None, anchor_atom=None, is_n_term=False, is_c_term=False)
ChainResidueSpec(resname='MLE', residue=<moleculekit.molecule.UniqueResidueID object at 0x7f7f892393a0>
UniqueResidueID<resname: 'MLE', chain: 'A', resid: 2, insertion: '', segid: 'P0'>, new_resname=None, anchor_atom=None, is_n_term=False, is_c_term=False)
ChainResidueSpec(resname='MLE', residue=<moleculekit.molecule.UniqueResidueID object at 0x7f7f895c5940>
UniqueResidueID<resname: 'MLE', chain: 'A', resid: 3, insertion: '', segid: 'P0'>, new_resname=None, anchor_atom=None, is_n_term=False, is_c_term=False)
ChainResidueSpec(resname='MVA', residue=<moleculekit.molecule.UniqueResidueID object at 0x7f7f8c3863f0>
UniqueResidueID<resname: 'MVA', chain: 'A', resid: 4, insertion: '', segid: 'P0'>, new_resname=None, anchor_atom=None, is_n_term=False, is_c_term=False)
ChainResidueSpec(resname='BMT', residue=<moleculekit.molecule.UniqueResidueID object at 0x7f7f8c3872c0>
UniqueResidueID<resname: 'BMT', chain: 'A', resid: 5, insertion: '', segid: 'P0'>, new_resname=None, anchor_atom=None, is_n_term=False, is_c_term=False)
ChainResidueSpec(resname='ABA', residue=<moleculekit.molecule.UniqueResidueID object at 0x7f7fc4b7ae10>
UniqueResidueID<resname: 'ABA', chain: 'A', resid: 6, insertion: '', segid: 'P0'>, new_resname=None, anchor_atom=None, is_n_term=False, is_c_term=False)
ChainResidueSpec(resname='33X', residue=<moleculekit.molecule.UniqueResidueID object at 0x7f7fc4b79fd0>
UniqueResidueID<resname: '33X', chain: 'A', resid: 7, insertion: '', segid: 'P0'>, new_resname=None, anchor_atom=None, is_n_term=False, is_c_term=False)
ChainResidueSpec(resname='34E', residue=<moleculekit.molecule.UniqueResidueID object at 0x7f7fc4b7a390>
UniqueResidueID<resname: '34E', chain: 'A', resid: 8, insertion: '', segid: 'P0'>, new_resname=None, anchor_atom=None, is_n_term=False, is_c_term=False)
ChainResidueSpec(resname='MLE', residue=<moleculekit.molecule.UniqueResidueID object at 0x7f7f88826510>
UniqueResidueID<resname: 'MLE', chain: 'A', resid: 10, insertion: '', segid: 'P0'>, new_resname=None, anchor_atom=None, is_n_term=False, is_c_term=False)

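You’ll see one ChainResidueSpec per NCAA. Because every residue in the ring is peptide-bonded on both sides (no free terminus), they all sit in the same chain-position bucket, mid-chain, which is why every spec reports is_n_term=False and is_c_term=False. The specs show chain A rather than E because autoSegment was asked to rewrite the chain field along with segid.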

Step 3 - Template every NCAA from SMILES

smiles = {
    "33X": "CC(C=O)NC",
    "34E": "CN[C@@H]([C@H](C)CN1CCN(CCOC)CC1)C=O",
    "ABA": "CC[C@H](C=O)N",
    "BMT": "C/C=C/C[C@@H](C)[C@H]([C@@H](C=O)NC)O",
    "DAL": "C[C@H](C=O)N",
    "MLE": "CC(C)C[C@@H](C=O)NC",
    "MVA": "CC(C)[C@@H](C=O)NC",
}
for resname, smi in smiles.items():
    mol.templateResidueFromSmiles(f'resname "{resname}"', smi, addHs=True)

Templating is per unique resname, not per occurrence - cyclosporin’s three MLE residues all share one SMILES. RCSB chemical-component SMILES are a starting point: encode the protonation state at your target pH (none of these cyclosporin residues are ionisable, so the neutral form is correct). templateResidueFromSmiles strips the terminal -OH / -OXT automatically when the residue is peptide-bonded on one or both sides, so the same SMILES works in both contexts.
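Because templating is keyed on the residue name, a quick sanity check before the loop is to confirm that every detected resname has a SMILES entry. The sketch below uses plain Python with the residue names copied from the detect output of this run. With real specs the list would presumably come from [spec.resname for spec in specs], which is an assumption based on the repr shown in Step 2.

```python
# Residue names reported by the detect step in this tutorial's run (one per spec).
detected = ["DAL", "MLE", "MLE", "MVA", "BMT", "ABA", "33X", "34E", "MLE"]

# Same dictionary as in the templating cell above.
smiles = {
    "33X": "CC(C=O)NC",
    "34E": "CN[C@@H]([C@H](C)CN1CCN(CCOC)CC1)C=O",
    "ABA": "CC[C@H](C=O)N",
    "BMT": "C/C=C/C[C@@H](C)[C@H]([C@@H](C=O)NC)O",
    "DAL": "C[C@H](C=O)N",
    "MLE": "CC(C)C[C@@H](C=O)NC",
    "MVA": "CC(C)[C@@H](C=O)NC",
}

# Resnames that were detected but have no template, and templates nothing will use.
missing = sorted(set(detected) - set(smiles))
unused = sorted(set(smiles) - set(detected))

if missing:
    raise KeyError(f"No SMILES provided for: {', '.join(missing)}")
print(f"{len(set(detected))} unique resnames, all covered; unused entries: {unused}")
```

Comparing sets rather than lists reflects the per-resname rule: the three MLE occurrences collapse to a single required entry.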

Step 4 - Prepare

prepared, specs = systemPrepare(mol, pH=7.4, detect_specs=specs)

---- Molecule chain report ----
Chain A:
    First residue:  DAL     1  
    Final residue:  ALA    11  
Chain B:
    First residue:  HOH   101  
    Final residue:  HOH   107  
---- End of chain report ----

This peptide has no titratable canonical residues, so systemPrepare’s PDB2PQR pass has nothing to do on the protonation side. What matters here is the bond-capture / restore mechanism, which preserves the inter-NCAA peptide bonds (including the ring-closing one) through the prep. Passing detect_specs=specs rather than relying on the default auto-detect lets you reuse the list we already computed, edit it before prep (drop entries you don’t want spec-handled, tweak a new_resname, …), and thread the same list into parameterizeFromSpecs. The spec list is returned unchanged; rebinding it back into specs keeps the data flow visually obvious.

Step 5 - Parameterize

out = parameterizeFromSpecs(
    specs,
    prepared,
    outdir="./params",
    charge_method="gasteiger",
)
print(out)

ClusterOutputs(topo_paths=['./params/cluster_000/DAL.prepi', './params/cluster_001/MLE.prepi', './params/cluster_003/MVA.prepi', './params/cluster_004/BMT.prepi', './params/cluster_005/ABA.prepi', './params/cluster_006/33X.prepi', './params/cluster_007/34E.prepi'], frcmod_paths=['./params/cluster_000/DAL.frcmod', './params/cluster_001/MLE.frcmod', './params/cluster_003/MVA.frcmod', './params/cluster_004/BMT.frcmod', './params/cluster_005/ABA.frcmod', './params/cluster_006/33X.frcmod', './params/cluster_007/34E.frcmod'], custombonds=[], xml_paths=['./params/gaff_combined.xml'])

parameterizeFromSpecs dedupes singleton chain-NCAA entries by (resname, is_n_term, is_c_term). Three MLE residues at mid-chain produce one MLE.prepi. out.custombonds is empty for this purely-peptide-bonded cyclic peptide - the closing N-C bond gets added by amber.build directly as a cyclic-segment directive (see the note above), not by parameterizeFromSpecs.
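The dedupe rule can be illustrated without HTMD. In the sketch below, Spec is a hypothetical stand-in for ChainResidueSpec that carries only the fields used in the key; it is not the library class, and the loop is not the library’s implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    """Hypothetical stand-in for ChainResidueSpec (only the fields used in the key)."""
    resname: str
    resid: int
    is_n_term: bool = False
    is_c_term: bool = False


# The nine mid-chain specs from the detect output in Step 2.
specs = [
    Spec("DAL", 1), Spec("MLE", 2), Spec("MLE", 3), Spec("MVA", 4), Spec("BMT", 5),
    Spec("ABA", 6), Spec("33X", 7), Spec("34E", 8), Spec("MLE", 10),
]

# Keep the first spec seen for each (resname, is_n_term, is_c_term) key.
unique = {}
for spec in specs:
    key = (spec.resname, spec.is_n_term, spec.is_c_term)
    unique.setdefault(key, spec)

templates = [resname for resname, _, _ in unique]
print(len(specs), "specs ->", len(unique), "templates:", templates)
```

Nine specs collapse to seven keys, which matches the seven .prepi files listed in the output above. A terminal MLE would get its own key, and hence its own template, because the is_n_term / is_c_term flags are part of the key.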

Step 6 - Build

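amber.build(
    prepared,
    outdir="./build",
    custombonds=out.custombonds,  # empty here; the ring closure is not passed this way
    topo=out.topo_paths,          # .prepi templates from Step 5
    param=out.frcmod_paths,       # matching .frcmod parameter files
    ionize=False,                 # no solvent box, so skip ion placement
)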

htmd.builder.amber - INFO - Found cyclic segment P0. Disabling capping on it.
htmd.builder.amber - INFO - Detecting disulfide bonds.
htmd.builder.amber - INFO - Starting the build.
htmd.builder.amber - INFO - Finished building.
moleculekit.tools.sequencestructuralalignment - INFO - Alignment #0 was done on 11 residues: 1-11

<moleculekit.molecule.Molecule object at 0x7f7f8827a330>
Molecule with 241 atoms and 1 frames
Atom field - altloc shape: (241,)
Atom field - atomtype shape: (241,)
Atom field - beta shape: (241,)
Atom field - chain shape: (241,)
Atom field - charge shape: (241,)
Atom field - coords shape: (241, 3, 1)
Atom field - element shape: (241,)
Atom field - formalcharge shape: (241,)
Atom field - insertion shape: (241,)
Atom field - masses shape: (241,)
Atom field - name shape: (241,)
Atom field - occupancy shape: (241,)
Atom field - record shape: (241,)
Atom field - resid shape: (241,)
Atom field - resname shape: (241,)
Atom field - segid shape: (241,)
Atom field - serial shape: (241,)
Atom field - virtualsite shape: (241,)
angles shape: (416, 3)
bonds shape: (242, 2)
bondtype shape: (242,)
box shape: (3, 1)
boxangles shape: (3, 1)
crystalinfo: None
dihedrals shape: (719, 4)
fileloc shape: (1, 2)
impropers shape: (17, 4)
reps: 
step shape: (1,)
time shape: (1,)
topoloc: /tmp/tmprwreihut/build/structure.prmtop
viewname: structure.prmtop

ionize=False skips ion placement because we haven’t solvated. The head-to-tail closure is written by amber.build’s cyclic-segment block (a dedicated bond cyc_X.<first>.N cyc_X.<last>.C directive emitted alongside loadpdb for the cyclic unit) rather than through custombonds, so the resulting prmtop carries a closed ring regardless of what’s in out.custombonds.
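One way to confirm the ring really is closed is to look for a bond between the first residue’s N and the last residue’s C in the built topology. The helper below is a hypothetical, library-free sketch that works on plain lists and is demonstrated on a toy three-residue ring; it is not part of HTMD or moleculekit.

```python
def has_closing_bond(names, resids, bonds, first_resid, last_resid):
    """Return True if any bond joins atom N of first_resid to atom C of last_resid."""
    wanted = {("N", first_resid), ("C", last_resid)}
    for i, j in bonds:
        if {(names[i], resids[i]), (names[j], resids[j])} == wanted:
            return True
    return False


# Toy three-residue ring, backbone atoms only (hypothetical data).
names = ["N", "CA", "C", "N", "CA", "C", "N", "CA", "C"]
resids = [1, 1, 1, 2, 2, 2, 3, 3, 3]
bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8), (8, 0)]

closed = has_closing_bond(names, resids, bonds, first_resid=1, last_resid=3)
linear = has_closing_bond(names, resids, bonds[:-1], first_resid=1, last_resid=3)
print(closed, linear)
```

For the Molecule returned by amber.build, the corresponding inputs would presumably be its name, resid, and bonds fields (all three appear in the output above), with first_resid=1 and last_resid=11. That mapping is an assumption and should be checked against your own build.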

Gotchas

For peptides whose ring closes through a side chain (lactam bridges, isopeptide cycles, thioether cycles), the mechanism is different: detectNonStandardResidues() emits a ChainResidueSpec with the appropriate anchor atom and the cycle closure is wired automatically, following the same pattern that the stapled-peptide tutorial shows.