moleculekit.util module#

moleculekit.util.assertSameAsReferenceDir(compareDir, outdir='.')#

Check that every file in the reference directory (compareDir) is present in the directory given as the second argument (outdir) and that its content matches.

Raise an exception if not.
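The check described above can be approximated with the standard library alone. The following is a minimal sketch, not moleculekit's implementation; the helper name assert_same_as_reference and its error message are hypothetical.

```python
import filecmp
import os


def assert_same_as_reference(reference_dir, out_dir="."):
    """Raise RuntimeError unless every file in reference_dir also exists in
    out_dir with identical content (hypothetical helper for illustration)."""
    names = sorted(
        name
        for name in os.listdir(reference_dir)
        if os.path.isfile(os.path.join(reference_dir, name))
    )
    # shallow=False compares file contents instead of only os.stat() signatures.
    match, mismatch, errors = filecmp.cmpfiles(
        out_dir, reference_dir, names, shallow=False
    )
    if mismatch or errors:
        raise RuntimeError(
            f"mismatching files: {mismatch}; missing or unreadable files: {errors}"
        )
    return match
```

Files that are missing from out_dir end up in the errors list returned by filecmp.cmpfiles, so they trigger the same exception as files whose content differs.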

moleculekit.util.boundingBox(mol, sel='all')#

Calculates the bounding box of a selection of atoms.

Parameters:

  • mol (Molecule object) – The molecule containing the atoms

  • sel (str) – Atom selection string

Returns:

bbox – The bounding box around the atoms selected in sel.

Return type:

np.ndarray

>>> boundingBox(mol, sel='chain A')
array([[-17.3390007 , -10.43700027,  -1.43900001],
       [ 25.40600014,  27.03800011,  46.46300125]], dtype=float32)
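
The layout of the returned array (minimum corner in the first row, maximum corner in the second) can be illustrated in plain Python. This is a sketch that works on a list of coordinate tuples rather than a Molecule; the function name bounding_box is hypothetical.

```python
def bounding_box(coords):
    """Return [[xmin, ymin, zmin], [xmax, ymax, zmax]] for (x, y, z) tuples."""
    xs, ys, zs = zip(*coords)  # regroup the points by axis
    return [[min(xs), min(ys), min(zs)], [max(xs), max(ys), max(zs)]]


atoms = [(1.0, -2.0, 0.5), (-3.0, 4.0, 2.0), (0.0, 0.0, -1.0)]
print(bounding_box(atoms))  # [[-3.0, -2.0, -1.0], [1.0, 4.0, 2.0]]
```
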
moleculekit.util.calculateAnglesAndDihedrals(bonds, cyclicdih=False)#

Calculate all angles and dihedrals from a set of bonds.
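
As an illustration of the idea, angles and dihedrals can be enumerated from a bond list by walking the bond graph. The sketch below is an assumption about the general technique, not moleculekit's code; the name angles_and_dihedrals is hypothetical and the cyclicdih option is not modelled.

```python
from collections import defaultdict


def angles_and_dihedrals(bonds):
    """Enumerate the angles and proper dihedrals implied by (i, j) bond pairs."""
    neighbours = defaultdict(set)
    for i, j in bonds:
        neighbours[i].add(j)
        neighbours[j].add(i)

    # An angle is a central atom together with two distinct bonded atoms.
    angles = set()
    for centre, bonded in neighbours.items():
        for a in bonded:
            for c in bonded:
                if a < c:  # store each angle once, smaller end first
                    angles.add((a, centre, c))

    # A dihedral extends a central bond b-c by one neighbour on each side.
    dihedrals = set()
    for b, c in bonds:
        for a in neighbours[b] - {c}:
            for d in neighbours[c] - {b}:
                if a == d:
                    continue  # three-membered ring: not four distinct atoms
                quad = (a, b, c, d)
                dihedrals.add(min(quad, quad[::-1]))  # direction-independent form

    return sorted(angles), sorted(dihedrals)


# A linear four-atom chain has two angles and one dihedral.
print(angles_and_dihedrals([(0, 1), (1, 2), (2, 3)]))
# ([(0, 1, 2), (1, 2, 3)], [(0, 1, 2, 3)])
```
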

moleculekit.util.check_port(port, host='', timeout=120)#
moleculekit.util.ensurelist(tocheck, tomod=None)#

Convert np.ndarray and scalars to lists.

Lists and tuples are left as is. If a second argument is given, the type check is performed on the first argument, and the second argument is converted.

moleculekit.util.file_diff(file, reference)#
moleculekit.util.folder_diff(folder, reference, ignore_ftypes=('.log', '.txt'))#
moleculekit.util.guessAnglesAndDihedrals(bonds, cyclicdih=False)#

Calculate all angles and dihedrals from a set of bonds.

moleculekit.util.maxDistance(mol, sel='all', origin=None)#

Calculates the max distance of a set of atoms from an origin

Parameters:

  • mol (Molecule object) – The molecule containing the atoms

  • sel (str) – Atom selection string for the atoms whose distances are calculated

  • origin (list) – The x, y, z coordinates of the origin

Returns:

maxd – The maximum distance in Angstrom

Return type:



>>> y = maxDistance(mol, sel='protein', origin=[0, 0, 0])
>>> print(round(y,2))
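
The quantity being computed is the largest Euclidean distance between any selected atom and the origin. A plain-Python sketch on coordinate tuples follows; the name max_distance and the explicit default origin are assumptions of this example.

```python
import math


def max_distance(coords, origin=(0.0, 0.0, 0.0)):
    """Return the largest Euclidean distance between origin and any point."""
    return max(math.dist(point, origin) for point in coords)


print(max_distance([(3.0, 4.0, 0.0), (1.0, 0.0, 0.0)]))  # 5.0
```
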
moleculekit.util.molRMSD(mol, refmol, rmsdsel1, rmsdsel2)#

Calculates the RMSD between two Molecules

Parameters:

  • mol (Molecule object)

  • refmol

  • rmsdsel1

  • rmsdsel2

Returns:

rmsd – The RMSD between the two structures

Return type:


moleculekit.util.opm(pdbid, keep=False, keepaltloc='A', validateElements=True)#
moleculekit.util.orientOnAxes(mol, sel='all')#

Rotate a molecule so that its main axes are oriented along XYZ.

The calculation is based on the axes of inertia of the given selection, but masses will be ignored. After the operation, the main axis will be parallel to the Z axis, followed by Y and X (the shortest axis). Only the first frame is oriented. The reoriented molecule is returned.

Parameters:

  • mol – The Molecule to be rotated

  • sel (str) – Atom selection string on which the rotation is computed


>>> mol = Molecule("1kdx")
>>> mol = orientOnAxes(mol,"chain B")

moleculekit.util.readCube(fname)#

Read 3D numpy array from CUBE file

Parameters:

fname (str) – CUBE file path

Returns:

  • data (np.ndarray) – 3D numpy array with the volumetric data

  • meta (dict) – Dictionary with the metadata of the CUBE file

moleculekit.util.rotationMatrix(axis, theta)#

Produces a rotation matrix given an axis and radians

Return the rotation matrix associated with counterclockwise rotation about the given axis by theta radians.

Parameters:

  • axis (list) – The axis around which to rotate

  • theta (float) – The rotation angle in radians

Returns:

M – The rotation matrix.

Return type:

np.ndarray

>>> M = rotationMatrix([0, 0, 1], 1.5708)
>>> M.round(4)
array([[-0., -1.,  0.],
       [ 1., -0.,  0.],
       [ 0.,  0.,  1.]])
>>> axis = [4.0, 4., 1.]
>>> theta = 1.2
>>> v = [3.0, 5., 0.]
>>> np.dot(rotationMatrix(axis, theta), v).round(2)
array([ 2.75,  4.77,  1.92])
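
A counterclockwise axis-angle rotation of this kind can be written with Rodrigues' formula. The sketch below uses only the standard library and nested lists. It is one possible formulation and is not claimed to be how moleculekit builds the matrix; the names rotation_matrix and apply are hypothetical.

```python
import math


def rotation_matrix(axis, theta):
    """Rodrigues' formula: counterclockwise rotation by theta radians about axis."""
    norm = math.sqrt(sum(c * c for c in axis))
    x, y, z = (c / norm for c in axis)  # the axis must be a unit vector
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    k = 1.0 - cos_t
    return [
        [cos_t + x * x * k, x * y * k - z * sin_t, x * z * k + y * sin_t],
        [y * x * k + z * sin_t, cos_t + y * y * k, y * z * k - x * sin_t],
        [z * x * k - y * sin_t, z * y * k + x * sin_t, cos_t + z * z * k],
    ]


def apply(matrix, vector):
    """Multiply a 3 x 3 matrix (nested lists) by a 3-vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


rotated = apply(rotation_matrix([4.0, 4.0, 1.0], 1.2), [3.0, 5.0, 0.0])
print([round(c, 2) for c in rotated])  # [2.75, 4.77, 1.92]
```
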
moleculekit.util.sequenceID(field, prepend=None, step=1)#

Array of integers that increments whenever the value of another array changes

Parameters:

  • field (np.ndarray or tuple) – An array of values. Once a change in value happens, a new ID will be created in seq. If a tuple of ndarrays is passed, a change in any of them will cause an increase in seq.

  • prepend (str) – A string to prepend to the incremental sequence

  • step (int) – The step size for incrementing the ID

Returns:

seq – An array of equal size to field containing integers which increment every time there is a change in field

Return type:

np.ndarray

>>> # A change in resid, insertion, chain or segid will cause an increase in the sequence
>>> sequenceID((mol.resid, mol.insertion, mol.chain, mol.segid))
array([  1,   1,   1, ..., 285, 286, 287])
>>> # it is typically used to renumber resids as follows
>>> mol.set('resid', sequenceID((mol.resid, mol.insertion, mol.chain, mol.segid)))
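
The incrementing behaviour can be sketched in plain Python on ordinary lists. The function name sequence_id and its start parameter are hypothetical; the starting value in particular is an assumption of this sketch, and the prepend option is not modelled.

```python
def sequence_id(fields, start=0, step=1):
    """Return one integer per position, increased by step whenever the
    combination of values across all fields differs from the previous position."""
    ids = []
    current = start
    previous = None
    for index, key in enumerate(zip(*fields)):
        if index > 0 and key != previous:
            current += step  # any field changed, so a new ID begins
        ids.append(current)
        previous = key
    return ids


resid = [10, 10, 11, 11, 11]
chain = ["A", "A", "A", "B", "B"]
print(sequence_id((resid, chain)))  # [0, 0, 1, 2, 2]
```
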
moleculekit.util.string_to_tempfile(content, ext)#
moleculekit.util.tempname(suffix='', create=False)#

moleculekit.util.uniformRandomRotation()#

Return a uniformly distributed rotation 3 x 3 matrix

The initial description of the calculation can be found in section 5 of “How to generate random matrices from the classical compact groups” by Mezzadri (arXiv:math-ph/0609050; Notices of the AMS, Vol. 54 (2007), 592-604). Sample code is provided in that section as the haar_measure function.

Apparently this code can randomly produce flipped molecules (chirality-wise), so a fix found in tmadl/sklearn-random-rotation-ensembles was applied.

Returns:

M – A uniformly distributed rotation 3 x 3 matrix

Return type:


moleculekit.util.wait_for_port(port, host='', timeout=240.0, _logger=False)#

Wait until a port starts accepting TCP connections.

Parameters:

  • port (int) – Port number.

  • host (str) – Host address on which the port should exist.

  • timeout (float) – In seconds. How long to wait before raising errors.

Raises:

TimeoutError – The port isn’t accepting connections after the time specified in timeout.
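
A polling loop of this kind can be written with the standard socket module. The following is a minimal sketch under stated assumptions, not moleculekit's implementation: the name wait_for_tcp_port, the poll_interval parameter and the default host are all hypothetical.

```python
import socket
import time


def wait_for_tcp_port(port, host="localhost", timeout=5.0, poll_interval=0.05):
    """Block until host:port accepts a TCP connection, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise TimeoutError(
                f"port {port} on {host!r} is not accepting connections "
                f"after {timeout} seconds"
            )
        try:
            # A successful connect means a server is listening; close it right away.
            with socket.create_connection((host, port), timeout=remaining):
                return
        except OSError:
            # Connection refused or timed out: pause briefly, then try again.
            time.sleep(min(poll_interval, remaining))
```

Using time.monotonic() for the deadline keeps the wait unaffected by system clock adjustments.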

moleculekit.util.writeCube(arr, filename, vecMin, vecRes)#

Writes 3D array to cube file

Parameters:

  • arr (np.ndarray) – 3D array with volumetric data.

  • filename (str) – Filename of the cube file

  • vecMin (np.ndarray) – 3D vector denoting the minimal corner of the grid

  • vecRes (np.ndarray) – 3D vector denoting the resolution of the grid in each dimension in Angstrom

moleculekit.util.writeVoxels(arr, filename, vecMin, vecRes)#

DEPRECATED: Use writeCube instead