moleculekit.projections.metricsecondarystructure module

class moleculekit.projections.metricsecondarystructure.MetricSecondaryStructure(sel='protein', simplified=True, integer=True)

Bases: Projection

Calculates the secondary structure of the protein using DSSP. The DSSP implementation and its documentation are taken from MDTraj.

Parameters:
  • sel (str) – Atom selection string that picks out the protein atoms.

  • simplified (bool) – If True, use the simplified three-state codes (‘C’, ‘E’, ‘H’) instead of the full DSSP codes.

  • integer (bool) – If True, return integer codes instead of letter codes.
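As a rough illustration of how the three parameters fit together, the sketch below wraps construction and projection in one helper. The helper name `project_secondary_structure` and its `pdb_source` argument are hypothetical and only serve this example. The imports are placed inside the function so the snippet can be loaded without moleculekit installed; it assumes that `Molecule` accepts a file path or PDB ID.

```python
def project_secondary_structure(pdb_source, sel="protein", simplified=True, integer=True):
    """Hypothetical helper: load a structure and compute its secondary structure.

    pdb_source is assumed to be a file path or PDB ID accepted by Molecule.
    """
    # Imported lazily so that defining this function does not require moleculekit.
    from moleculekit.molecule import Molecule
    from moleculekit.projections.metricsecondarystructure import (
        MetricSecondaryStructure,
    )

    mol = Molecule(pdb_source)
    metric = MetricSecondaryStructure(sel=sel, simplified=simplified, integer=integer)

    # project() returns the per-frame assignments; getMapping() describes
    # which residue each projected dimension refers to.
    data = metric.project(mol)
    mapping = metric.getMapping(mol)
    return data, mapping
```

Passing `simplified=False` would switch to the full DSSP codes, and `integer=False` would return letter codes instead of integers.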

Notes

The simplified DSSP codes are:
  • ‘C’ : Coil. Any of the ‘T’, ‘S’ or ‘ ’ codes. Integer code: 0

  • ‘E’ : Strand. Either of the ‘E’ or ‘B’ codes. Integer code: 1

  • ‘H’ : Helix. Any of the ‘H’, ‘G’ or ‘I’ codes. Integer code: 2

The full DSSP assignment codes are:
  • ‘H’ : Alpha helix. Integer code: 3

  • ‘B’ : Residue in isolated beta-bridge. Integer code: 4

  • ‘E’ : Extended strand, participates in beta ladder. Integer code: 5

  • ‘G’ : 3-helix (3/10 helix). Integer code: 6

  • ‘I’ : 5-helix (pi helix). Integer code: 7

  • ‘T’ : Hydrogen-bonded turn. Integer code: 8

  • ‘S’ : Bend. Integer code: 9

  • ‘ ’ : Loops and irregular elements. Integer code: 10
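The two code tables above can be written down as plain lookup dictionaries, which is handy for decoding integer output back into letters or for collapsing full codes into the simplified scheme. This is a minimal sketch built only from the lists above; the names `FULL_TO_SIMPLIFIED`, `INT_TO_CODE`, `simplify` and `decode` are illustrative and are not part of moleculekit.

```python
# Full DSSP code -> simplified code, following the lists above.
FULL_TO_SIMPLIFIED = {
    "H": "H", "G": "H", "I": "H",   # helices
    "E": "E", "B": "E",             # strands
    "T": "C", "S": "C", " ": "C",   # coil
}

# Integer codes as documented for each scheme.
SIMPLIFIED_TO_INT = {"C": 0, "E": 1, "H": 2}
FULL_TO_INT = {"H": 3, "B": 4, "E": 5, "G": 6, "I": 7, "T": 8, "S": 9, " ": 10}

# The integer ranges do not overlap, so one reverse table covers both schemes.
INT_TO_CODE = {v: k for k, v in SIMPLIFIED_TO_INT.items()}
INT_TO_CODE.update({v: k for k, v in FULL_TO_INT.items()})


def simplify(full_codes):
    """Collapse a string of full DSSP codes into the simplified scheme."""
    return "".join(FULL_TO_SIMPLIFIED[c] for c in full_codes)


def decode(int_codes):
    """Turn a sequence of integer codes back into letter codes."""
    return "".join(INT_TO_CODE[i] for i in int_codes)


print(simplify("HGIEBTS "))   # HHHEECCC
print(decode([0, 1, 2]))      # CEH
```

The ‘NA’ code is deliberately left out of these tables, since the lists above do not give an integer value for it.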

A special ‘NA’ code is assigned to each ‘residue’ in the topology that is not actually a protein residue (one that does not contain atoms named ‘CA’, ‘N’, ‘C’ and ‘O’), such as water molecules that are listed as residues in the topology.
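The backbone-atom rule can be pictured as a simple set check. The sketch below assumes that a residue counts as protein only when all four backbone names are present; that reading of the rule, and the helper name `is_protein_residue`, are assumptions made for illustration rather than the library's actual code.

```python
# Backbone atom names listed in the note above.
BACKBONE_NAMES = {"CA", "N", "C", "O"}


def is_protein_residue(atom_names):
    """Return True if every backbone atom name appears in atom_names.

    Assumption: all four names are required; residues failing this check
    are the ones that would receive the 'NA' code.
    """
    return BACKBONE_NAMES.issubset(atom_names)


alanine = ["N", "CA", "C", "O", "CB"]
water = ["OH2", "H1", "H2"]

print(is_protein_residue(alanine))  # True
print(is_protein_residue(water))    # False
```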

getMapping(mol)

Returns the description of each projected dimension.

Parameters:

mol (Molecule object) – A Molecule object which will be used to calculate the descriptions of the projected dimensions.

Returns:

map – A DataFrame containing the description of each projected dimension.

Return type:

DataFrame object

project(mol)

Projects the molecule, computing its secondary structure assignment.

Parameters:

mol (Molecule) – A Molecule object to project.

Returns:

data – An array containing the projected data.

Return type:

np.ndarray
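Once the projected array is available, a common next step is to summarise it per frame. The sketch below computes the helical fraction of each frame from simplified integer codes. It assumes the array is laid out as frames by residues and that the simplified scheme is in use (helix is code 2); the function name `helix_fraction_per_frame` is hypothetical. Plain nested lists stand in for the array here, and iterating over the rows of a NumPy array should behave the same way.

```python
HELIX = 2  # simplified integer code for helix


def helix_fraction_per_frame(data):
    """Fraction of residues assigned as helix in each frame.

    data is assumed to be a 2D sequence (frames x residues) of
    simplified integer codes.
    """
    fractions = []
    for frame in data:
        codes = list(frame)
        helical = sum(1 for c in codes if c == HELIX)
        # Guard against empty frames to avoid division by zero.
        fractions.append(helical / len(codes) if codes else 0.0)
    return fractions


# Two made-up frames of five residues each.
example = [
    [0, 2, 2, 2, 1],
    [0, 0, 2, 2, 1],
]
print(helix_fraction_per_frame(example))  # [0.6, 0.4]
```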