moleculekit.molecule module#

class moleculekit.molecule.Molecule(filename=None, name=None, **kwargs)#

Bases: object

Class to read, write and manipulate molecular structures.

Molecule is the main class of MoleculeKit. It stores all the relevant molecular information for a system and allows many different operations to be performed on it, including adding or removing atoms, bonds, calculating various properties of the system, and visualizing the molecule. Molecule can read a large variety of molecular file formats and convert between them, therefore it can also be used as a molecular file format converter.

Parameters:
  • filename (str or list of str) – Optionally load a PDB file from the specified file. If there’s no file and the value is four characters long assume it is a PDB accession code and try to download from the RCSB web server.

  • name (str) – Give a name to the Molecule that will be used for visualization

  • kwargs – Accepts any further arguments that should be passed to the Molecule.read method.

Examples

>>> mol = Molecule('./test/data/dhfr/dhfr.pdb')  
>>> mol = Molecule('3PTB')
>>> print(mol)                                     
Molecule with 1701 atoms and 1 frames
Atom field - altloc shape: (1701,)
Atom field - atomtype shape: (1701,)
...
addBond(idx1, idx2, btype)#

Add a new bond to a pair of atoms

If the bond already exists it will only update it’s type

Parameters:
  • idx1 (int) – The index of the one atom

  • idx2 (int) – The index of the other atom

  • btype (str) – The type of the bond as a string

Examples

>>> mol.addBond(13, 18, "2") # Adds a double bond
align(sel, refmol=None, refsel=None, frames=None, matchingframes=False, mode='index', _logger=True)#

Align conformations.

Align a given set of frames of this molecule to either the current active frame of this molecule (mol.frame) or the current frame of a different reference molecule. To align to any frame other than the current active one modify the refmol.frame property before calling this method.

Parameters:
  • sel (str) – Atom selection string for aligning. See more here

  • refmol (Molecule, optional) – Optionally pass a reference Molecule on which to align. If None is given, it will align on the first frame of the same Molecule

  • refsel (str, optional) – Atom selection for the refmol if one is given. Default: same as sel. See more here

  • frames (list or range) – A list of frames which to align. By default it will align all frames of the Molecule

  • matchingframes (bool) – If set to True it will align the selected frames of this molecule to the corresponding frames of the refmol. This requires both molecules to have the same number of frames.

  • mode (str) – Options are (‘index’, ‘structure’). Setting to ‘index’ will align two structures on the atoms selected in sel and refsel in increasing order of their indices. Meaning that if sel is name CA and resid 5 3 and refsel is name CA and resid 7 8, assuming that resid 3 comes before 5, it will align the CA or resid 3 to resid 7 in refmol and 5 to 8 instead of 5-7, 3-8 as one might expect from the atomselection strings. Setting mode to ‘structure’ will perform pure structural alignment regardless of atom order using the TM-Align method.

Examples

>>> mol=tryp.copy()
>>> mol.align('protein')
>>> mol.align('name CA', refmol=Molecule('3PTB'))
alignBySequence(ref, molseg=None, refseg=None, molsel='all', refsel='all', nalignfragment=1, returnAlignments=False, maxalignments=1)#

Aligns the Molecule to a reference Molecule by their longest sequence alignment

Parameters:
  • ref (Molecule object) – The reference Molecule to which we want to align

  • molsel (str) – The atom selection of this Molecule we want to align

  • refsel (str) – The atom selection of ref we want to align to

  • nalignfragments (int) – The number of fragments used for the alignment.

  • returnAlignments (bool) – Return all alignments as a list of Molecules

  • maxalignments (int) – The maximum number of alignments we want to produce

Returns:

mols – If returnAlignments is True it returns a list of Molecules each containing a different alignment. Otherwise it modifies the current Molecule with the best single alignment.

Return type:

list

altloc: ndarray[Any, dtype[object_]]#

The alternative location flag of the atoms if read from a PDB.

angles: ndarray[Any, dtype[uint32]]#

Atom triplets corresponding to angle terms.

append(mol, collisions=False, coldist=1.3, removesel='all', invertcollisions=False)#

Append a molecule at the end of the current molecule

Parameters:
  • mol (Molecule) – Target Molecule which to append to the end of the current Molecule

  • collisions (bool) – If set to True it will remove residues of mol which collide with atoms of this Molecule object.

  • coldist (float) – Collision distance in Angstrom between atoms of the two molecules. Anything closer will be considered a collision.

  • removesel (str) – Atomselection for atoms to be removed from the passed molecule in case of collisions.

  • invertcollisions (bool) – If invertcollisions is set to True it will remove residues of this Molecule which collide with atoms of the passed mol molecule.

Example

>>> mol=tryp.copy()
>>> mol.filter("not resname BEN")
array([1630, 1631, 1632, 1633, 1634, 1635, 1636, 1637, 1638], dtype=int32)
>>> lig=tryp.copy()
>>> lig.filter("resname BEN")
array([   0,    1,    2, ..., 1698, 1699, 1700], dtype=int32)
>>> mol.append(lig)
appendFrames(mol)#

Appends the frames of another Molecule object to this object.

Parameters:

mol (Molecule) – A Molecule object.

atomselect(sel, indexes=False, strict=False, fileBonds=True, guessBonds=True, _debug=False)#

Get a boolean mask or the indexes of a set of selected atoms

Parameters:
  • sel (str) – Atom selection string. See more here

  • indexes (bool) – If True returns the indexes instead of a bitmap

  • strict (bool) – If True it will raise an error if no atoms were selected.

  • fileBonds (bool) – If True will use bonds read from files.

  • guessBonds (bool) – If True will use guessed bonds.

Returns:

asel – Either a boolean mask of selected atoms or their indexes

Return type:

np.ndarray

Examples

>>> mol=tryp.copy()
>>> mol.atomselect('resname MOL')
array([False, False, False, ..., False, False, False], dtype=bool)
atomtype: ndarray[Any, dtype[object_]]#

The atom type of each atom.

beta: ndarray[Any, dtype[float32]]#

The beta factor value of each atom.

bonds: ndarray[Any, dtype[uint32]]#

Atom pairs corresponding to bond terms.

bondtype: ndarray[Any, dtype[object_]]#

The type of each bond in Molecule.bonds if available.

box: ndarray[Any, dtype[float32]]#

The box dimensions of the molecule.

boxangles: ndarray[Any, dtype[float32]]#

The box angles of the molecule.

property boxvectors: ndarray#

The box vectors of the Molecule

center(loc=(0, 0, 0), sel='all')#

Moves the geometric center of the Molecule to a given location

Parameters:
  • loc (list, optional) – The location to which to move the geometric center

  • sel (str) – Atom selection string of the atoms whose geometric center we want to center on the loc position. See more here

Returns:

translation – 3D coordinates of the translation applied to the Molecule

Return type:

np.ndarray

Examples

>>> mol=tryp.copy()
>>> mol.center()
>>> mol.center([10, 10, 10], 'name CA')
chain: ndarray[Any, dtype[object_]]#

The chain name of each atom.

charge: ndarray[Any, dtype[float32]]#

The charge of each atom.

coords: ndarray[Any, dtype[float32]]#

The coordinates of the atoms.

copy(frames=None, sel=None)#

Create a copy of the Molecule object

Returns:

  • newmol (Molecule) – A copy of the object

  • frames (list of int) – If specified, only the selected frames will be copied.

  • sel (str) – Atom selection for atoms to keep in the copy.

crystalinfo: dict#

A dictionary containing crystallographic information. It has fields [‘sGroup’, ‘numcopies’, ‘rotations’, ‘translations’]

deleteBonds(sel, inter=True)#

Deletes all bonds that contain atoms in sel or between atoms in sel.

Parameters:
  • sel (str) – Atom selection string of atoms whose bonds will be deleted. See more here

  • inter (bool) – When True it will delete also bonds between atoms in sel with bonds to atoms outside of sel. When False it will only delete bonds between atoms in sel.

dihedrals: ndarray[Any, dtype[uint32]]#

Atom quadruplets corresponding to dihedral terms.

dropFrames(drop=None, keep=None)#

Removes trajectory frames from the Molecule

Parameters:
  • drop (int or list of ints) – Index of frame, or list of frame indexes which we want to drop (and keep all others). By default it will remove all frames from the Molecule.

  • keep (int or list of ints) – Index of frame, or list of frame indexes which we want to keep (and drop all others).

Examples

>>> mol = Molecule('1sb0')
>>> mol.dropFrames(keep=[1,2])
>>> mol.numFrames == 2
True
>>> mol.dropFrames(drop=[0])
>>> mol.numFrames == 1
True
element: ndarray[Any, dtype[object_]]#

The element of each atom.

empty(numAtoms, numFrames=0)#

Creates an empty molecule of numAtoms atoms.

Parameters:
  • numAtoms (int) – Number of atoms to create in the molecule.

  • numFrames (int, optional) – Number of empty trajectory frames to create in the molecule. Default is 0.

Example

>>> newmol = Molecule().empty(100)
fileloc: list#

The location of the files used to read this Molecule

filter(sel, _logger=True)#

Removes all atoms not included in the selection

Parameters:

sel (str) – Atom selection string. See more here

Returns:

removed – An array of all atoms which did not belong to sel and were removed from the Molecule object

Return type:

np.ndarray

Examples

>>> mol=tryp.copy()
>>> mol.filter('protein')
formalcharge: ndarray[Any, dtype[int32]]#

The formal charge of each atom.

property frame#

The currently active frame of the Molecule on which methods will be applied

static fromDict(moldict)#
static fromRDKitMol(rmol)#

Converts an RDKit molecule to a Molecule object

Parameters:

rmol (rdkit.Chem.rdchem.Mol) – The RDKit molecule to convert

Examples

>>> from rdkit import Chem
>>> from rdkit.Chem import AllChem
>>> rmol = Chem.MolFromSmiles("C1CCCCC1")
>>> rmol = Chem.AddHs(rmol)
>>> AllChem.EmbedMolecule(rmol)
>>> AllChem.MMFFOptimizeMolecule(rmol)
>>> mol = Molecule.fromRDKitMol(rmol)
property fstep#

The frame-step of the trajectory

get(field, sel=None, fileBonds=True, guessBonds=True)#

Retrieve a specific PDB field based on the selection

Parameters:
  • field (str) – The field we want to get. To see a list of all available fields do print(Molecule._atom_and_coord_fields).

  • sel (str) – Atom selection string for which atoms we want to get the field from. Default all. See more here

  • fileBonds (bool) – If True will use bonds read from files.

  • guessBonds (bool) – If True will use guessed bonds. Both fileBonds and guessBonds can be combined.

Returns:

vals – Array of values of field for all atoms in the selection.

Return type:

np.ndarray

Examples

>>> mol=tryp.copy()
>>> mol.get('resname')
array(['ILE', 'ILE', 'ILE', ..., 'HOH', 'HOH', 'HOH'], dtype=object)
>>> mol.get('resname', sel='resid 158')
array(['LEU', 'LEU', 'LEU', 'LEU', 'LEU', 'LEU', 'LEU', 'LEU'], dtype=object)
getCenter(sel='all', com=False)#

Get the center of an atom selection

Parameters:
  • sel (str) – Atom selection string for which to calculate the center of mass

  • com (bool) – If True it will calculate the center of mass. If False it will calculate the geometric center.

Returns:

center – The center of mass or geometric center of the selection

Return type:

np.ndarray

getDihedral(atom_quad)#

Get the value of a dihedral angle.

Parameters:

atom_quad (list) – Four atom indexes corresponding to the atoms defining the dihedral

Returns:

angle – The angle in radians

Return type:

float

Examples

>>> mol.getDihedral([0, 5, 8, 12])
getNeighbors(idx, bonds=None)#

Returns all atoms bonded to a specific atom

Parameters:
  • idx (int) – The atom for which to find bonded atoms

  • bonds (np.ndarray) – Override the bonds stored in the Molecule object

Returns:

atoms – The atoms bonded to idx

Return type:

list of int

getResidues(fields=('resid', 'insertion', 'chain'), sel='all', return_idx=True)#

Get unique ids for the residues of the Molecule

Parameters:
  • fields (np.ndarray or tuple) – An array of Molecule attributes. Once a change in any of the fields happens, a new ID will be created in residues. The default fields are (“resid”, “insertion”, “chain”) which means that a new residue ID will be created if the resid or insertion or chain changes from the previous atom to the next one.

  • sel (str) – Atomselection for which to return the residues. Default is “all”. See more here

  • return_idx (bool) – If set to True, the method will return the indices of each unique residue

Returns:

  • residues (np.ndarray) – An array of unique ids for the residues

  • idx (list) – A list of arrays, each containing the indices corresponding to the unique values in residues. Will only be returned if return_idx is set to True.

Examples

>>> mol = Molecule('5zmz')
>>> residues, idx = mol.getResidues()
>>> residues
array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2,
       2, 2, 2, 3, 3, 3, 3, 3, 4])
>>> idx
[array([0, 1, 2, 3, 4, 5, 6, 7]),
 array([ 8,  9, 10, 11, 12, 13, 14, 15, 16]),
 array([17, 18, 19, 20, 21, 22, 23, 24]),
 array([25, 26, 27, 28, 29]),
 array([30])]
getSequence(one_letter=True, dict_key='chain', return_idx=False, sel='all', _logger=True)#

Return the aminoacid sequence of the Molecule.

Parameters:
  • one_letter (bool) – Whether to return one-letter or three-letter AA codes. There should be only one atom per residue.

  • dict_key (str | None) – If None, the function will return a dictionary with keys “protein” and “nucleic” (if they exist) and the concatenated sequence as the value. If “chain” or “segid” is passed, the function will return a dictionary with the sequence of each chain or segment.

  • return_idx (bool) – If True, the function also returns the indexes of the atoms of each residue in the sequence

  • sel (str) – Atomselection for which to return the sequence

Returns:

sequence – The primary sequence as a dictionary.

Return type:

str

Examples

>>> mol = Molecule("3PTB")
>>> mol.getSequence()
{'A': 'IVGGYTCGANTVPYQVSLNSGYHFCGGSLINSQWVVSAAHCYKSGIQVRLGEDNINVVEGNEQFISASKSIVHPSYNSNTLNNDIMLIKLKSAASLNSRVASISLPTSCASAGTQCLISGWGNTKSSGTSYPDVLKCLKAPILSDSSCKSAYPGQITSNMFCAGYLEGGKDSCQGDSGGPVVCSGKLQGIVSWGSGCAQKNKPGVYTKVCNYVSWIKQTIASN'}
>>> mol.getSequence(sel="resid 16 to 50")
{'A': 'IVGGYTCGANTVPYQVSLNSGYHFCGGSLINSQ'}
>>> mol = Molecule("1LKK")
>>> seq = mol.getSequence(one_letter=False, dict_key="chain")
>>> seq.keys()
dict_keys(['A', 'B'])
>>> seq['B']
['PTR', 'GLU', 'GLU', 'ILE']
>>> seq = mol.getSequence(one_letter=True, dict_key="chain")
>>> seq['B']
'XEEI'
>>> seq = mol.getSequence(one_letter=True, dict_key="segid")
>>> seq.keys()
dict_keys(['1', '2'])
>>> seq, idx = mol.getSequence(return_idx=True)
>>> idx['B'][-1] # The atom indexes of the last residue in chain B
array([1718, 1719, 1720, 1721, 1722, 1723, 1724, 1725, 1726, 1727, 1728,
       1729, 1730, 1731, 1732, 1733, 1734, 1735, 1736, 1737])
guessBonds(rdkit=False)#
hasBond(idx1, idx2)#

Checks if the Molecule has a bond between two atom indexes

Parameters:
  • idx1 (int) – The index of the one atom

  • idx2 (int) – The index of the other atom

Returns:

  • has (bool) – True if the Molecule has that bond, False if not

  • btype (str) – The bond type of that bond

  • bidx (int) – The index of that bond in the bond/bondtype array

impropers: ndarray[Any, dtype[uint32]]#

Atom quadruplets corresponding to improper dihedral terms.

insert(mol, index, collisions=False, coldist=1.3, removesel='all', invertcollisions=False)#

Insert the atoms of one molecule into another at a specific index.

Parameters:
  • mol (Molecule) – Molecule to be inserted

  • index (integer) – The atom index at which the passed molecule will be inserted

  • collisions (bool) – If set to True it will remove residues of mol which collide with atoms of this Molecule object.

  • coldist (float) – Collision distance in Angstrom between atoms of the two molecules. Anything closer will be considered a collision.

  • removesel (str) – Atomselection for atoms to be removed from the molecule in case of collisions.

  • invertcollisions (bool) – If invertcollisions is set to True it will remove residues of this Molecule which collide with atoms of the passed mol molecule.

Example

>>> mol=tryp.copy()
>>> mol.numAtoms
1701
>>> mol.insert(tryp, 0)
>>> mol.numAtoms
3402
insertion: ndarray[Any, dtype[object_]]#

The insertion flag of the atoms if read from a PDB.

masses: ndarray[Any, dtype[float32]]#

The mass of each atom.

moveBy(vector, sel=None)#
mutateResidue(sel, newres)#

Mutates a residue by deleting its sidechain and renaming it

Parameters:
  • sel (str) – Atom selection string for the residue we want to mutate. The selection needs to include all atoms of the residue. See more here

  • newres (str) – The name of the new residue

Examples

>>> mol=tryp.copy()
>>> mol.mutateResidue('resid 158', 'ARG')
name: ndarray[Any, dtype[object_]]#

The name of each atom.

property numAtoms#

Number of atoms in the molecule

property numBonds#

Number of bonds in the molecule

property numFrames#

Number of coordinate frames in the molecule

property numResidues: int#

The number of residues in the Molecule

occupancy: ndarray[Any, dtype[float32]]#

The occupancy value of each atom if read from a PDB.

read(filename, type=None, skip=None, frames=None, append=False, overwrite='all', keepaltloc='A', guess=None, guessNE=None, _logger=True, **kwargs)#

Read topology, coordinates and trajectory files in multiple formats.

Detects from the extension the file type and loads it into Molecule

Parameters:
  • filename (str) – Name of the file we want to read

  • type (str, optional) – File type of the file. If None, it’s automatically determined by the extension

  • skip (int, optional) – If the file is a trajectory, skip every skip frames

  • frames (list, optional) – If the file is a trajectory, read only the given frames

  • append (bool, optional) – If the file is a trajectory or coor file, append the coordinates to the previous coordinates. Note append is slow.

  • overwrite (str, list of str) – A list of the existing fields in Molecule that we wish to overwrite when reading this file. Set to None if you don’t want to overwrite any existing fields.

  • keepaltloc (str) – Set to any string to only keep that specific altloc. Set to ‘all’ if you want to keep all alternative atom positions.

  • guess (list of str) – Properties of the molecule to guess. Can be any combination of (‘bonds’, ‘angles’, ‘dihedrals’)

  • guessNE (list of str) – Properties of the molecule to guess if it’s Non-Existent. Can be any combination of (‘bonds’, ‘angles’, ‘dihedrals’)

record: ndarray[Any, dtype[object_]]#

The record field of a PDB file if the topology was read from a PDB.

remove(selection, _logger=True)#

Remove atoms from the Molecule

Parameters:

selection (str) – Atom selection string of the atoms we want to remove. See more here

Returns:

removed – The list of atoms removed

Return type:

np.array

Example

>>> mol=tryp.copy()
>>> mol.remove('name CA')               
array([   1,    9,   16,   20,   24,   36,   43,   49,   53,   58,...
removeBond(idx1, idx2)#

Remove an existing bond between a pair of atoms

Parameters:
  • idx1 (int) – The index of the one atom

  • idx2 (int) – The index of the other atom

renumberResidues(returnMapping=False, start=0, modulo=None)#

Renumbers protein residues incrementally.

It checks for changes in either of the resid, insertion, chain or segid fields and in case of a change it creates a new residue number.

Parameters:

returnMapping (bool) – If set to True, the method will also return the mapping between the old and new residues

Examples

>>> mapping = mol.renumberResidues(returnMapping=True)
reorderAtoms(order)#

Reorder atoms in Molecule

Changes the order of atoms in the Molecule to the defined order.

Parameters:

order (list) – A list containing the new order of atoms

Examples

>>> mol = Molecule()
>>> _ = mol.empty(4)
>>> mol.name[:] = ['N', 'C', 'H', 'S']
>>> neworder = [1, 3, 2, 0]
>>> mol.reorderAtoms(neworder)
>>> print(mol.name)
['C' 'S' 'H' 'N']
reps: Representations#

A list of representations that is used when visualizing the molecule

resid: ndarray[Any, dtype[int64]]#

The residue ID of each atom.

resname: ndarray[Any, dtype[object_]]#

The residue name of each atom.

rotateBy(M, center=(0, 0, 0), sel='all')#

Rotate a selection of atoms by a given rotation matrix around a center

Parameters:
  • M (np.ndarray) – The rotation matrix

  • center (list) – The rotation center

  • sel (str) – Atom selection string for atoms to rotate. See more here

Examples

>>> from moleculekit.util import rotationMatrix
>>> mol = tryp.copy()
>>> mol.rotateBy(rotationMatrix([0, 1, 0], 1.57))
segid: ndarray[Any, dtype[object_]]#

The segment ID of each atom.

sequence(oneletter=True, noseg=False, return_idx=False, sel='all', _logger=True)#

DEPRECATED: Use getSequence instead.

serial: ndarray[Any, dtype[int64]]#

The serial number of each atom.

set(field, value, sel=None)#

Set the values of a Molecule field based on the selection

Parameters:
  • field (str) – The field we want to set. To see a list of all available fields do print(Molecule._atom_and_coord_fields).

  • value (string or integer) – All atoms that match the atom selection will have the field field set to this scalar value (or 3-vector if setting the coordinates)

  • sel (str) – Atom selection string for atom which to set. See more here

Examples

>>> mol=tryp.copy()
>>> mol.set('segid', 'P', sel='protein')
setDihedral(atom_quad, radians, bonds=None, guessBonds=False)#

Sets the angle of a dihedral.

Parameters:
  • atom_quad (list) – Four atom indexes corresponding to the atoms defining the dihedral

  • radians (float) – The angle in radians to which we want to set the dihedral

  • bonds (np.ndarray) – An array containing all bonds of the molecule. This is needed if multiple modifications are done as the bond guessing can get messed up if atoms come very close after the rotation.

  • guessBonds (bool) – Set to True if you want to guess bonds based on atom distances if they are not defined

Examples

>>> mol.setDihedral([0, 5, 8, 12], 0.16)
>>> # If we perform multiple modifications, calculate bonds first and pass them as argument to be safe
>>> bonds = mol._getBonds()
>>> mol.setDihedral([0, 5, 8, 12], 0.16, bonds=bonds)
>>> mol.setDihedral([18, 20, 24, 30], -1.8, bonds=bonds)
step: ndarray[Any, dtype[uint64]]#

The step for each frame of the simulation

templateResidueFromSmiles(sel, smiles, sanitizeSmiles=True, addHs=False, onlyOnAtoms=None, guessBonds=False, _logger=True)#

Template a residue from a SMILES string

This function will assign bonds, bond orders and formal charges to a residue according to a corresponding SMILES string. In addition it can also protonate the residue.

Parameters:
  • sel (str) – The atom selection of the residue which we want to template

  • smiles (str) – The SMILES string of the template residue

  • sanitizeSmiles (bool) – If True the SMILES string will be sanitized

  • addHs (bool) – If True the residue will be protonated

  • onlyOnAtoms (str) – If not None, only the atoms in this atom selection will be protonated

  • guessBonds (bool) – Set to True to guess bonds for the residue we are templating

  • _logger (bool) – If True the logger will be used to print information

Examples

>>> mol = Molecule("3ptb")
>>> mol.templateResidueFromSmiles("resname BEN", "[NH2+]=C(N)c1ccccc1", addHs=True)
>>> mol.templateResidueFromSmiles("resname GLY and resid 18", "C(C(=O))N", addHs=True)
time: ndarray[Any, dtype[float64]]#

The time for each frame of the simulation

toDict(fields=None)#

Returns a dictionary representation of the molecule

Parameters:

fields (list of str) – The fields to include in the representation. By default it will include only topology fields.

toGraph(fields=None, distances=False)#

Converts the Molecule to a networkx graph.

Each node corresponds to an atom and edges correspond to bonds

toOpenFFMolecule(sanitize=False, kekulize=False, assignStereo=True)#
toRDKitMol(sanitize=False, kekulize=False, assignStereo=True, guessBonds=False, _logger=True)#

Converts the Molecule to an RDKit molecule

Parameters:
  • sanitize (bool) – If True the molecule will be sanitized

  • kekulize (bool) – If True the molecule will be kekulized

  • assignStereo (bool) – If True the molecule will have stereochemistry assigned from its 3D coordinates

  • guessBonds (bool) – If True the molecule will have bonds guessed

  • _logger (bool) – If True the logger will be used to print information

translateBy(vector, sel=None)#

Move a selection of atoms by a given vector

Parameters:
  • vector (list) – 3D coordinates to add to the Molecule coordinates

  • sel (str) – Atom selection string of atoms which we want to move. See more here

Examples

>>> mol=tryp.copy()
>>> mol.moveBy([3, 45 , -8])
view(sel=None, style=None, color=None, guessBonds=True, viewer=None, hold=False, name=None, viewerhandle=None, gui=False, pmviewurl='http://localhost:8090')#

Visualizes the molecule in a molecular viewer

Parameters:
  • sel (str) – Atom selection string for the representation. See more here

  • style (str) – Representation style. See more here.

  • color (str or int) – Coloring mode (str) or ColorID (int). See more here.

  • guessBonds (bool) – Allow the viewer to guess bonds for the molecule

  • viewer (str ('pmview', 'pymol', 'vmd', 'webgl')) – Choose viewer backend. Default is taken from either moleculekit.config or if it doesn’t exist from moleculekit.config

  • hold (bool) – If set to True, it will not visualize the molecule but instead collect representations until set back to False.

  • name (str, optional) – A name to give to the molecule in the viewer

  • viewerhandle (VMD object, optional) – A specific viewer in which to visualize the molecule. If None it will use the current default viewer.

  • pmviewurl (string) – URL of pmview REST server

viewname: str#

The name used for the molecule in the viewer

virtualsite: ndarray[Any, dtype[bool_]]#

Whether the atom is a virtual site.

wrap(wrapsel='all', fileBonds=True, guessBonds=False, wrapcenter=None, unitcell='rectangular')#

Wraps the coordinates of the molecule into the simulation box

It assumes that all bonded groups are sequential in Molecule. I.e. you don’t have a molecule B in between the atoms of molecule A. It also requires correct bonding (ideally read from a topology file).

Parameters:
  • wrapsel (str) – Atom selection string of atoms on which to center the wrapping box. See more here

  • fileBonds (bool) – Whether to use the bonds read from the file or to guess them

  • guessBonds (bool) – Whether to guess the bonds. If fileBonds is True these will get appended

  • wrapcenter (array_like, optional) – The center around which to wrap. If not provided, it will be calculated from the atoms in wrapsel. Normally you want to use the wrapsel option and not the wrapcenter as the coordinates of the selection can change during a simulation and wrapsel will keep that selection in the center for all frames.

  • unitcell (str) – This option can be used for choosing between different unit cell representations of triclinic boxes. It doesn’t have any effect on rectangular boxes. The wrapping of a triclinic cell can be “rectangular”, “triclinic” or “compact”. Rectangular wrapping is the default and it wraps the box into a parallelepiped. Triclinic wrapping wraps the box into a triclinic box. Compact wrapping wraps the box into a shape that has the minimum volume. This can be useful for visualizing e.g. truncated octahedra or rhombic dodecahedra.

Examples

>>> mol=tryp.copy()
>>> mol.wrap()
>>> mol.wrap('protein')
write(filename, sel=None, type=None, **kwargs)#

Writes the topology and coordinates of the Molecule in any of the supported formats.

Parameters:
  • filename (str) – The filename of the file we want to write to disk

  • sel (str, optional) – Atom selection string of the atoms we want to write. If None, it will write all atoms. See more here

  • type (str, optional) – The filetype we want to write. By default, detected from the file extension

property x#

Get the x coordinates at the current frame

property y#

Get the y coordinates at the current frame

property z#

Get the z coordinates at the current frame

exception moleculekit.molecule.TopologyInconsistencyError(value)#

Bases: Exception

class moleculekit.molecule.UniqueAtomID(**kwargs)#

Bases: object

static fromMolecule(mol, sel=None, idx=None)#
selectAtom(mol, indexes=True, ignore=None)#
class moleculekit.molecule.UniqueResidueID(**kwargs)#

Bases: object

static fromMolecule(mol, sel=None, idx=None)#
selectAtoms(mol, indexes=True, ignore=None)#
moleculekit.molecule.calculateUniqueBonds(bonds, bondtype)#

Given bonds and bondtypes calculates unique bonds dropping any duplicates

Parameters:
  • bonds (np.ndarray) – The bonds of a molecule

  • bondtype (np.ndarray) – The bond type of each bond in the bonds array

Returns:

  • uqbonds (np.ndarray) – The unique bonds of the molecule

  • uqbondtype (np.ndarray) – The corresponding bond types for uqbonds

Examples

>>> from moleculekit.molecule import Molecule
>>> mol = Molecule('3PTB')
>>> mol.bonds, mol.bondtype = calculateUniqueBonds(mol.bonds, mol.bondtype)  # Overwrite the bonds and bondtypes with the unique ones
moleculekit.molecule.getBondedGroups(mol, bonds=None)#

Calculates all bonded groups in a Molecule

Parameters:
  • mol (Molecule) – A Molecule object

  • bonds (np.ndarray) – Optionally pass a different array of bonds. If None it will take the bonds from mol.bonds.

Returns:

  • groups (np.ndarray) – Groups is an array which contains the starting index of each group.

  • group (np.ndarray) – An array with the group index of each atom

Examples

>>> mol = Molecule("structure.prmtop")
>>> mol.read("output.xtc")
>>> groups, _ = getBondedGroups(mol)
>>> for i in range(len(groups)-1):
...     print(f"Group {i} starts at index {groups[i]} and ends at index {groups[i+1]-1}")
moleculekit.molecule.mol_equal(mol1, mol2, checkFields=('record', 'serial', 'name', 'altloc', 'resname', 'chain', 'resid', 'insertion', 'occupancy', 'beta', 'segid', 'element', 'charge', 'masses', 'atomtype', 'formalcharge', 'virtualsite', 'coords'), exceptFields=None, fieldPrecision=None, dtypes=False, uqBonds=False, _logger=True)#

Compare two Molecules for equality.

Parameters:
  • mol1 (Molecule) – The first molecule to compare

  • mol2 (Molecule) – The second molecule to compare to the first

  • checkFields (list) – A list of fields to compare. By default compares all atom information and coordinates in the molecule

  • exceptFields (list) – A list of fields to not compare.

  • fieldPrecision (dict) – A dictionary of field, precision key-value pairs which defines the numerical precision of the value comparisons of two arrays

  • dtypes (bool) – Set to True to also compare datatypes of the fields

  • uqBonds (bool) – Set to True to compare unique bonds instead of all bonds

  • _logger (bool) – Set to False to disable the printing of the differences in the two Molecules

Returns:

equal – Returns True if the molecules are equal or False if they are not.

Return type:

bool

Examples

>>> mol_equal(mol1, mol2, checkFields=['resname', 'resid', 'segid'])
>>> mol_equal(mol1, mol2, exceptFields=['record', 'name'])
>>> mol_equal(mol1, mol2, fieldPrecision={'coords': 1e-5})