moleculekit.readers module

moleculekit.readers.ALPHAFOLDread(filename, frame=None, topoloc=None, validateElements=True, uri='https://alphafold.ebi.ac.uk/files/AF-{uniprot}-F1-model_v6.cif')
moleculekit.readers.BCIFread(filename, frame=None, topoloc=None, uri='https://models.rcsb.org/{pdbid}.bcif.gz', covalentonly=True, validateElements=True)
moleculekit.readers.BINCOORread(filename, frame=None, topoloc=None)
moleculekit.readers.BINPOSread(filename, frame=None, topoloc=None, stride=None, atom_indices=None)
moleculekit.readers.CIFread(filename, frame=None, topoloc=None, data=None, covalentonly=True, validateElements=True)
moleculekit.readers.CRDCARDread(filename, frame=None, topoloc=None)

CHARMM coordinate card format, as described at https://www.charmmtutorial.org/index.php/CHARMM:The_Basics. The fields are:

    title = * WATER
    title = * DATE: 4/10/07 4:25:51 CREATED BY USER: USER
    title = *
    Number of atoms (NATOM) = 6
    Atom number (ATOMNO) = 1 (just an example)
    Residue number (RESNO) = 1
    Residue name (RESName) = TIP3
    Atom type (TYPE) = OH2
    Coordinate (X) = -1.30910
    Coordinate (Y) = -0.25601
    Coordinate (Z) = -0.24045
    Segment ID (SEGID) = W
    Residue ID (RESID) = 1
    Atom weight (Weighting) = 0.00000

In a file, this looks like:

    * WATER
    * DATE: 4/10/07 4:25:51 CREATED BY USER: USER
    *
        6
        1    1 TIP3 OH2   -1.30910  -0.25601  -0.24045 W    1      0.00000
        2    1 TIP3 H1    -1.85344   0.07163   0.52275 W    1      0.00000
        3    1 TIP3 H2    -1.70410   0.16529  -1.04499 W    1      0.00000
        4    2 TIP3 OH2    1.37293   0.05498   0.10603 W    2      0.00000
        5    2 TIP3 H1     1.65858  -0.85643   0.10318 W    2      0.00000
        6    2 TIP3 H2     0.40780  -0.02508  -0.02820 W    2      0.00000
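The field layout above can be illustrated with a minimal parsing sketch. It is not the implementation behind CRDCARDread: the names CrdAtom and parse_crd_card are hypothetical, and the sketch assumes whitespace-separated fields rather than the fixed column widths a real CHARMM file may rely on.

```python
from typing import List, NamedTuple, Tuple


class CrdAtom(NamedTuple):
    """One atom record, in the field order listed above (hypothetical type)."""
    atomno: int
    resno: int
    resname: str
    atomtype: str
    x: float
    y: float
    z: float
    segid: str
    resid: str
    weight: float


def parse_crd_card(text: str) -> Tuple[List[str], List[CrdAtom]]:
    """Split a CRD card into title lines and atom records.

    Assumption: fields are separated by whitespace.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # Title lines start with "*"; keep whatever follows the asterisk.
    titles = [ln.lstrip()[1:].strip() for ln in lines if ln.lstrip().startswith("*")]
    body = [ln for ln in lines if not ln.lstrip().startswith("*")]
    natom = int(body[0].split()[0])  # first non-title line holds NATOM
    atoms = []
    for ln in body[1:1 + natom]:
        f = ln.split()
        atoms.append(CrdAtom(int(f[0]), int(f[1]), f[2], f[3],
                             float(f[4]), float(f[5]), float(f[6]),
                             f[7], f[8], float(f[9])))
    if len(atoms) != natom:
        raise ValueError(f"expected {natom} atoms, found {len(atoms)}")
    return titles, atoms


SAMPLE = """\
* WATER
* DATE: 4/10/07 4:25:51 CREATED BY USER: USER
*
    6
    1    1 TIP3 OH2   -1.30910  -0.25601  -0.24045 W    1      0.00000
    2    1 TIP3 H1    -1.85344   0.07163   0.52275 W    1      0.00000
    3    1 TIP3 H2    -1.70410   0.16529  -1.04499 W    1      0.00000
    4    2 TIP3 OH2    1.37293   0.05498   0.10603 W    2      0.00000
    5    2 TIP3 H1     1.65858  -0.85643   0.10318 W    2      0.00000
    6    2 TIP3 H2     0.40780  -0.02508  -0.02820 W    2      0.00000
"""

titles, atoms = parse_crd_card(SAMPLE)
```

Here titles holds the three title strings (the last one empty) and atoms holds six CrdAtom records, one per water atom.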

moleculekit.readers.CRDread(filename, frame=None, topoloc=None)
moleculekit.readers.DCDread(filename, frame=None, topoloc=None, stride=None, atom_indices=None)
exception moleculekit.readers.FormatError(value)

Bases: Exception

moleculekit.readers.GJFread(filename, frame=None, topoloc=None)
moleculekit.readers.GROTOPread(filename, frame=None, topoloc=None)
moleculekit.readers.INPCRDread(filename, frame=None, topoloc=None, stride=None, atom_indices=None)
moleculekit.readers.JSONread(filename, frame=None, topoloc=None, stride=None, atom_indices=None)
moleculekit.readers.MAEread(fname, frame=None, topoloc=None)

Reads maestro files.

Parameters:
  • fname (str) – .mae file

  • frame (int) – The frame to read from the file.

  • topoloc (str) – The location of the topology file.

Returns:

moleculekit.readers.MDTRAJTOPOread(filename, frame=None, topoloc=None, validateElements=True)
moleculekit.readers.MDTRAJread(filename, frame=None, topoloc=None, validateElements=True)
moleculekit.readers.MMTFread(filename, frame=None, topoloc=None, validateElements=True)
moleculekit.readers.MOL2read(filename, frame=None, topoloc=None, singlemol=True, validateElements=True)
class moleculekit.readers.MolFactory

Bases: object

Constructs Molecule objects from parsed topology and trajectory data.

The various file readers in this module parse their inputs into intermediate Topology and Trajectory objects. MolFactory takes those parsed objects and assembles them into one or more fully populated Molecule objects, validating elements and deduplicating bonds as requested.

static construct(topos, trajs, filename, frame, validateElements=True, uniqueBonds=False)
moleculekit.readers.NETCDFread(filename, frame=None, topoloc=None, stride=None, atom_indices=None)
moleculekit.readers.PDBQTread(filename, frame=None, topoloc=None)
moleculekit.readers.PDBread(filename, mode='pdb', frame=None, topoloc=None, validateElements=True, uniqueBonds=True)
moleculekit.readers.PREPIread(filename, frame=None, topoloc=None)
moleculekit.readers.PRMTOPread(filename, frame=None, topoloc=None, validateElements=False)
moleculekit.readers.PSFread(filename, frame=None, topoloc=None, validateElements=False)
moleculekit.readers.RTFread(filename, frame=None, topoloc=None)
moleculekit.readers.SDFread(filename, frame=None, topoloc=None, mol_idx=None)
moleculekit.readers.TRRread(filename, frame=None, topoloc=None, stride=None, atom_indices=None)
class moleculekit.readers.Topology(pandasdata=None)

Bases: object

property atominfo
fromMolecule(mol)
exception moleculekit.readers.TopologyInconsistencyError(value)

Bases: Exception

class moleculekit.readers.Trajectory(coords=None, box=None, boxangles=None, fileloc=None, step=None, time=None)

Bases: object

property numFrames
moleculekit.readers.XSCread(filename, frame=None, topoloc=None)
moleculekit.readers.XTCread(filename, frame=None, topoloc=None)
moleculekit.readers.XYZread(filename, frame=None, topoloc=None)
moleculekit.readers.get_raw_data_from_url(pdb_id, reduced=False)

Get the msgpack unpacked data given a PDB id.

Parameters:
  • pdb_id (str) – The input PDB id.

  • reduced (bool) – If True, fetch the reduced MMTF representation (backbone-only). Defaults to False (full structure).

Returns:

data – The unpacked MMTF data.

Return type:

dict

moleculekit.readers.openFileOrStringIO(strData, mode=None)

Context manager yielding a readable handle for a file path or StringIO.

If strData is an io.StringIO, it is yielded directly. If it is a path to an existing file, the file is opened (transparently decompressing .gz files) and yielded. In both cases the handle is closed on exit.

Parameters:
  • strData (str or io.StringIO) – Either a path to a file on disk or an in-memory StringIO object.

  • mode (str | None) – The mode used when opening a file path (e.g. "r" or "rb"). For .gz files a "t" suffix is appended to read text. Ignored when a StringIO is passed.

Yields:

handle (file-like object) – An open, readable handle to the data.
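The behaviour described above can be sketched with contextlib. This is an illustration under stated assumptions, not the library's code: the name open_file_or_stringio is hypothetical, raising FileNotFoundError for a missing path is an assumption, and so is the rule for when "t" is appended to the gzip mode.

```python
import gzip
import io
import os
from contextlib import contextmanager


@contextmanager
def open_file_or_stringio(str_data, mode=None):
    """Yield a readable handle for a StringIO or a file path (hypothetical helper)."""
    if isinstance(str_data, io.StringIO):
        # In-memory data: yield the object itself, mode is ignored.
        handle = str_data
    elif isinstance(str_data, str) and os.path.exists(str_data):
        if str_data.endswith(".gz"):
            gz_mode = mode or "r"
            # Assumption: append "t" only when no text/binary flag is given,
            # so that "r" becomes "rt" and gzip returns decoded text.
            if "b" not in gz_mode and "t" not in gz_mode:
                gz_mode += "t"
            handle = gzip.open(str_data, gz_mode)
        else:
            handle = open(str_data, mode or "r")
    else:
        # Assumption: the source does not say what happens for a missing path.
        raise FileNotFoundError(f"not a StringIO or existing file: {str_data!r}")
    try:
        yield handle
    finally:
        # The handle is closed on exit in both cases.
        handle.close()


# Usage with in-memory data.
buffer = io.StringIO("first line\nsecond line\n")
with open_file_or_stringio(buffer) as fh:
    first = fh.readline()
```

Because the StringIO is closed on exit as well, a caller who needs the buffer afterwards has to read everything inside the with block.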

moleculekit.readers.parseV3000SDF(lines, chargemap, bondmap)
moleculekit.readers.pdbGuessElementByName(elements, names, onlymissing=True)

Guess atomic elements from PDB atom names by column alignment.

Follows the convention described at https://www.cgl.ucsf.edu/chimera/docs/UsersGuide/tutorials/pdbintro.html#misalignment, which states that the element symbol should be right-aligned in columns 13-14 unless the atom name has four characters, in which case the name is left-aligned. When a name is ambiguous between a one- and two-letter element, a warning is emitted suggesting how to correct it.

Parameters:
  • elements (list of str) – The existing element strings for each atom. Empty strings or values not present in the periodic table are treated as missing.

  • names (list of str) – The PDB atom names, used to infer the element from column alignment.

  • onlymissing (bool) – If True only atoms whose element is missing or invalid are guessed. If False (or if all elements are missing) every atom is guessed.

Returns:

  • noelem (numpy.ndarray) – Integer indices of the atoms for which an element was guessed.

  • guessed (numpy.ndarray) – The guessed element strings, aligned with noelem.
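The column-alignment convention itself can be shown with a small sketch. It is deliberately simplified and is not how pdbGuessElementByName is implemented: the function guess_element_from_name_field is hypothetical, it takes the raw four-character name field (PDB columns 13-16) rather than stripped names, it does not check the result against the periodic table, and it emits no warnings.

```python
def guess_element_from_name_field(name_field: str) -> str:
    """Guess an element from a raw 4-character PDB atom name field.

    Assumption: name_field is exactly columns 13-16 of an ATOM/HETATM record,
    with its original spacing preserved.
    """
    if len(name_field) != 4:
        raise ValueError("expected the raw 4-character name field")
    if " " not in name_field:
        # Four-character names fill the whole field and are left-aligned,
        # so columns 13-14 are not reliable. Take the first letter instead.
        for ch in name_field:
            if ch.isalpha():
                return ch.upper()
        raise ValueError(f"no letters in atom name {name_field!r}")
    # Otherwise the element symbol is right-aligned in columns 13-14.
    symbol = name_field[:2].strip()
    if not symbol.isalpha():
        raise ValueError(f"cannot infer element from {name_field!r}")
    return symbol.capitalize()


examples = {field: guess_element_from_name_field(field)
            for field in (" CA ", "CA  ", " N  ", "FE  ", "HD11")}
```

The pair " CA " and "CA  " shows why alignment matters: the first is an alpha carbon, the second a calcium atom. A name such as "HG11" is the kind of ambiguous case the real function warns about, since "HG" is also a valid element symbol.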

moleculekit.readers.sdf_generator(sdffile)

Generator yielding Molecule objects from a multi-entry SDF file.

The file is read incrementally and a Molecule is produced for each $$$$-delimited record, allowing large SDF files to be iterated without loading every molecule into memory at once.

Parameters:

sdffile (str) – Path to the SDF file to read.

Yields:

mol (moleculekit.molecule.Molecule) – A Molecule object for each entry in the SDF file.
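The incremental, record-by-record reading described above can be sketched with a plain generator. This illustrates only the $$$$ splitting step and yields raw record text, not Molecule objects. The name iter_sdf_records is hypothetical and the sketch is not the library's implementation.

```python
import io
from typing import Iterator, TextIO


def iter_sdf_records(handle: TextIO) -> Iterator[str]:
    """Yield the text of each $$$$-delimited record, one at a time."""
    record = []
    for line in handle:
        if line.strip() == "$$$$":
            # End of the current entry: hand it over and start a new one.
            yield "".join(record)
            record = []
        else:
            record.append(line)
    # Assumption: a trailing record without a closing $$$$ is still yielded.
    if any(ln.strip() for ln in record):
        yield "".join(record)


# Two placeholder entries; real records would hold full molfile blocks.
sample = io.StringIO("mol_A\nline 2\n$$$$\nmol_B\nline 2\n$$$$\n")
names = [rec.splitlines()[0] for rec in iter_sdf_records(sample)]
```

Only one record is held in memory at a time, which is the property that makes this pattern suitable for large SDF files.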