moleculekit.tools.autosegment module
- moleculekit.tools.autosegment.autoSegment(mol, sel='all', basename='P', spatial=True, spatialgap=4.0, fields=('segid',), field=None, _logger=True)
Detects resid gaps in a selection and assigns incrementing segid to each fragment.
A new segment is started whenever a gap in resid numbering is found between consecutive residues in the selection (optionally confirmed by checking that the spatial distance between backbone atoms exceeds spatialgap). Water molecules are handled separately: each run of consecutive water residues forms its own segment with automatically renumbered resid values.
Use autoSegment() when the input has resid-based gaps (e.g. a PDB where residues are numbered with missing stretches). If you want to segment strictly by the covalent bond graph instead, use autoSegment2(). When you need a specific naming scheme that neither function produces, set mol.segid directly with mol.set("segid", "MY_SEG", sel="...").
- Parameters:
mol (Molecule) – The Molecule object
sel (str) – Atom selection string on which to check for gaps. See more here
basename (str) – The basename for segment ids. For example if given ‘P’ it will name the segments ‘P1’, ‘P2’, …
spatial (bool) – If True, a discontinuity in resid counts as a gap only if matching backbone atoms of the two residues are more than spatialgap Å apart
spatialgap (float) – The distance, in Å, above which a resid discontinuity is confirmed as a gap
fields (tuple of strings) – Fields in which to set the segments. Must be “chain”, “segid”, or both.
- Returns:
newmol – A new Molecule object with modified segids
- Return type:
Molecule object
Example
>>> from moleculekit.tools.autosegment import autoSegment
>>> newmol = autoSegment(mol, "chain B", "P", fields=("chain", "segid"))
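The gap rule described above can be illustrated without moleculekit. The following is a minimal sketch, not the library's implementation: the helper `segment_by_resid_gaps` is hypothetical, it assumes one representative backbone coordinate per residue, and it assumes that any step other than +1 between consecutive resids counts as a numbering gap.

```python
import math


def segment_by_resid_gaps(resids, coords, basename="P", spatial=True, spatialgap=4.0):
    """Hypothetical helper: name a segment for each residue from resid gaps.

    resids: one resid per residue, in selection order.
    coords: one (x, y, z) backbone coordinate per residue (an assumption of
            this sketch; the real function works on Molecule atoms).
    """
    segids = []
    counter = 1
    for i in range(len(resids)):
        if i > 0:
            # Assumption: a numbering gap is any step other than +1.
            numbering_gap = resids[i] - resids[i - 1] != 1
            if numbering_gap and spatial:
                # Confirm the gap only if the residues are far apart in space.
                far_apart = math.dist(coords[i - 1], coords[i]) > spatialgap
            else:
                far_apart = True
            if numbering_gap and far_apart:
                counter += 1
        segids.append(f"{basename}{counter}")
    return segids


# Six residues: a real break between resid 3 and 10, and a renumbering
# between resid 11 and 20 where the residues are still adjacent in space.
resids = [1, 2, 3, 10, 11, 20]
coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (30.0, 0.0, 0.0),
    (33.8, 0.0, 0.0),
    (37.6, 0.0, 0.0),
]

with_spatial = segment_by_resid_gaps(resids, coords)
without_spatial = segment_by_resid_gaps(resids, coords, spatial=False)
```

With the spatial check enabled, the 11 to 20 renumbering is not treated as a gap because the two residues are closer than spatialgap, so only two segments result. With spatial=False, every numbering discontinuity starts a new segment.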
- moleculekit.tools.autosegment.autoSegment2(mol, sel='(protein or resname ACE NME)', basename='P', fields=('segid',), residgaps=False, residgaptol=1, chaingaps=True, _logger=True)
Detects bonded segments in a selection and assigns incrementing segid to each segment.
Segments are derived from the covalent bond graph: two residues belong to the same segment if and only if they are in the same connected component of the backbone bond graph (computed from mol.bonds supplemented by distance-based guessing over backbone atoms). This is more robust than resid-gap detection (autoSegment()) for structures where resid numbering is irregular or non-continuous.
Use autoSegment2() when you want to follow connectivity rather than resid sequence. Use autoSegment() when the input has predictable resid-based gaps. When you need a specific naming scheme, set mol.segid directly with mol.set("segid", "MY_SEG", sel="...").
- Parameters:
mol (Molecule object) – The Molecule object
sel (str) – Atom selection string on which to check for gaps. See more here
basename (str) – The basename for segment ids. For example if given ‘P’ it will name the segments ‘P1’, ‘P2’, …
fields (tuple of strings) – Field to fix. Can be “segid” (default) or any other Molecule field or combinations thereof.
residgaps (bool) – Set to True to consider gaps in resids as structural gaps. Set to False to ignore resids
residgaptol (int) – The resid difference above which a gap is detected. For example, with residgaptol=1 (the default in the signature above), 235 - 233 = 2 > 1, hence it is a gap. A tolerance of 2 can be useful because in many PDBs single residues are missing in the proteins without any structural gap.
chaingaps (bool) – Set to True to consider changes in chains as structural gaps. Set to False to ignore chains
- Returns:
newmol – A new Molecule object with modified segids
- Return type:
Molecule object
Example
>>> from moleculekit.tools.autosegment import autoSegment2
>>> newmol = autoSegment2(mol)
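The connected-component idea behind autoSegment2() can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: the helper `segment_by_bonds` is hypothetical, residues are identified by integer index, and the backbone bonds between residues are assumed to be given as index pairs (the real function derives them from mol.bonds and distance-based guessing).

```python
def segment_by_bonds(n_residues, residue_bonds, basename="P"):
    """Hypothetical helper: name segments from connected components.

    n_residues: number of residues, indexed 0 .. n_residues - 1.
    residue_bonds: iterable of (i, j) pairs, one per backbone bond that
                   links residue i to residue j (an input assumed here).
    """
    parent = list(range(n_residues))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in residue_bonds:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Number components in order of first appearance along the residue list.
    names = {}
    segids = []
    for residue in range(n_residues):
        root = find(residue)
        if root not in names:
            names[root] = f"{basename}{len(names) + 1}"
        segids.append(names[root])
    return segids


# Five residues: 0-1-2 are bonded in a chain, 3-4 are bonded to each other,
# and nothing links residue 2 to residue 3.
bonded_segments = segment_by_bonds(5, [(0, 1), (1, 2), (3, 4)])
```

Because the grouping depends only on which residues are linked, the result is unaffected by irregular or repeated resid numbering, which is the robustness property the description above refers to.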