How to assign segments and chains
Goal
Derive the `segid` and/or `chain` fields for a structure that lacks them, using gap detection to split the structure into continuous segments automatically.
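Minimal example
```python
from moleculekit.molecule import Molecule
from moleculekit.tools.autosegment import autoSegment

mol = Molecule("3PTB")   # load the structure by its PDB ID
mol = autoSegment(mol)   # returns a copy with segments assigned
print(set(mol.segid))    # the distinct segment names that were generated
```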
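Printing the set of names shows which segments exist but not what each one covers. As a minimal sketch, the helper below (`summarize_segments`, a hypothetical name that is not part of moleculekit) reports the residue range and atom count of each segment. It runs on plain lists. The sample values, including the segment names `P0` and `P1`, are made up for illustration; with a real structure the per-atom `mol.segid` and `mol.resid` arrays would take their place.

```python
def summarize_segments(segids, resids):
    """Return {segid: (first_resid, last_resid, n_atoms)} in order of first appearance."""
    summary = {}
    for seg, res in zip(segids, resids):
        if seg not in summary:
            summary[seg] = [res, res, 0]
        entry = summary[seg]
        entry[0] = min(entry[0], res)  # lowest residue number seen so far
        entry[1] = max(entry[1], res)  # highest residue number seen so far
        entry[2] += 1                  # one more atom in this segment
    return {seg: tuple(values) for seg, values in summary.items()}


# Illustrative per-atom data (assumed values, not taken from a real structure)
segids = ["P0", "P0", "P0", "P1", "P1"]
resids = [16, 16, 17, 20, 21]

report = summarize_segments(segids, resids)
for seg, (first, last, n_atoms) in report.items():
    print(f"{seg}: residues {first}-{last}, {n_atoms} atoms")
```

A segment whose residue range looks too short or too long is usually the first sign that a gap was detected, or missed, in an unexpected place.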
Parameters that matter
| Parameter | Type | Default | What it does |
|---|---|---|---|
| `mol` | `Molecule` | required | Input molecule (a copy is returned; the original is unchanged) |
| `sel` | `str` | `"all"` | Restrict gap detection to this atom selection |
| `basename` | `str` | `"P"` | Prefix for generated segment names |
| `spatial` | `bool` | `True` | Treat a residue-numbering gap as a real gap only if the Cα distance exceeds `spatialgap` |
| `spatialgap` | `float` | `4.0` | Distance threshold in Å for spatial gap detection |
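The interaction between the numbering check and the distance check is easier to see in code. The following is a simplified sketch of the idea, not moleculekit's implementation: `split_segments` and its arguments are illustrative stand-ins, the coordinates are invented, and the `P0`, `P1`, ... naming scheme is an assumption. It works on one residue number and one Cα position per residue.

```python
import math


def split_segments(resids, ca_coords, spatial=True, spatialgap=4.0, basename="P"):
    """Name the segment of each residue, starting a new one at every accepted gap.

    resids    : residue numbers, one per residue, in file order
    ca_coords : (x, y, z) of each residue's C-alpha atom, in Å
    """
    names = []
    index = 0
    for i, resid in enumerate(resids):
        # A numbering gap is any step other than +1 (or 0, e.g. insertion codes)
        if i > 0 and resid - resids[i - 1] not in (0, 1):
            distance = math.dist(ca_coords[i - 1], ca_coords[i])
            # With spatial=True the gap only counts if the residues are far apart
            if not spatial or distance > spatialgap:
                index += 1
        names.append(f"{basename}{index}")
    return names


# Assumed toy chain along the x axis with 3.8 Å between bonded C-alpha atoms
resids = [1, 2, 3, 10, 11, 15]
coords = [(0.0, 0, 0), (3.8, 0, 0), (7.6, 0, 0), (30.0, 0, 0), (33.8, 0, 0), (37.6, 0, 0)]

with_distance_check = split_segments(resids, coords, spatial=True)
numbering_only = split_segments(resids, coords, spatial=False)
print(with_distance_check)
print(numbering_only)
```

In this toy data the jump from residue 3 to 10 is also a jump in space, so both modes split there. The jump from 11 to 15 is a renumbering only (the Cα atoms are still 3.8 Å apart), so it produces a new segment only when the distance check is turned off.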
Common variations
```python
# Assign segments to protein chains only
mol = autoSegment(mol, sel="protein")
```
Gotchas
- `autoSegment()` returns a new `Molecule`; it does not mutate the input.
- `segid` can be up to 4 characters (MD force-field convention); `chain` is a single character (PDB convention).
- Auto-assignment is topology-driven and can fail on structures with non-contiguous or missing residue numbers. Inspect the result before use.
- When writing to PDB, only the `chain` field is stored in the standard CHAIN column; `segid` goes into the SEGID column, which many programs ignore.
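To make the column distinction concrete, the sketch below builds one fixed-width ATOM record by hand and reads the two fields back. It assumes the usual PDB layout, with the chain identifier in column 22 and the segment identifier in columns 73-76 (a field from older versions of the format). `format_atom` is a hypothetical helper written for this example, not a moleculekit function, and the atom values are invented.

```python
def format_atom(serial, name, resname, chain, resid, x, y, z, segid, element):
    """Build a 78-character ATOM record with chain in col 22 and segid in cols 73-76."""
    return (
        f"ATOM  {serial:>5d} {name:<4s} {resname:>3s} {chain:1s}{resid:>4d}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{1.00:6.2f}{0.00:6.2f}      "
        f"{segid:<4s}{element:>2s}"
    )


# Invented atom; " CA " is the atom name already padded to four columns
line = format_atom(1, " CA ", "ILE", "A", 16, 11.432, 8.713, 4.250, "P0", "C")

chain = line[21]             # column 22 (0-based index 21)
segid = line[72:76].strip()  # columns 73-76 (0-based slice 72:76)
print(repr(chain), repr(segid))
```

A reader that stops parsing at column 72, or that only looks at column 22, never sees the segment name, which is why two segments sharing one chain letter can look identical to such a program.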