moleculekit.util module#
- moleculekit.util.assertSameAsReferenceDir(compareDir, outdir='.')#
Check that the files in the reference directory (compareDir) are present in the directory given as the second argument (outdir) and that their contents match.
Raise an exception if not.
- moleculekit.util.boundingBox(mol, sel='all')#
Calculates the bounding box of a selection of atoms.
- Parameters
- Returns
bbox – The bounding box around the atoms selected in sel.
- Return type
np.ndarray
Example
>>> boundingBox(mol, sel='chain A')
array([[-17.3390007 , -10.43700027,  -1.43900001],
       [ 25.40600014,  27.03800011,  46.46300125]], dtype=float32)
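The shape of the result above (minimum corner in the first row, maximum corner in the second) can be illustrated with plain NumPy. This is a minimal sketch that assumes the selected coordinates are already available as an (N, 3) array; `bounding_box_sketch` is a hypothetical helper, not the moleculekit implementation.

```python
import numpy as np


def bounding_box_sketch(coords):
    """Return a (2, 3) array: row 0 is the per-axis minimum, row 1 the maximum."""
    coords = np.asarray(coords, dtype=np.float32)
    return np.vstack((coords.min(axis=0), coords.max(axis=0)))


# Three made-up atom positions, in Angstrom
coords = np.array([[0.0, 1.0, 2.0], [-1.0, 4.0, 0.5], [3.0, -2.0, 1.0]])
bbox = bounding_box_sketch(coords)
```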
- moleculekit.util.check_port(port, host='127.0.0.1', timeout=120)#
- moleculekit.util.ensurelist(tocheck, tomod=None)#
Convert np.ndarray and scalars to lists.
Lists and tuples are left as is. If a second argument is given, the type check is performed on the first argument, and the second argument is converted.
- moleculekit.util.file_diff(file, reference)#
- moleculekit.util.find_executable(execname)#
- moleculekit.util.folder_diff(folder, reference, ignore_ftypes=('.log', '.txt'))#
- moleculekit.util.guessAnglesAndDihedrals(bonds, cyclicdih=False)#
Generate a guess of angle and dihedral N-body terms based on a list of bond index pairs.
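One common way to derive such terms from a bond list is to build a neighbor table and enumerate connected triplets and quadruplets. The sketch below shows that idea under stated assumptions: each bond is listed once, `guess_angles_and_dihedrals_sketch` is a hypothetical name, and the cyclicdih option is not modeled. It may differ from what moleculekit actually does.

```python
from collections import defaultdict
from itertools import combinations


def guess_angles_and_dihedrals_sketch(bonds):
    """Enumerate angle (i, j, k) and dihedral (i, j, k, l) index tuples from bond pairs."""
    neighbors = defaultdict(set)
    for a, b in bonds:
        neighbors[a].add(b)
        neighbors[b].add(a)

    # An angle is any pair of distinct neighbors around a central atom.
    angles = []
    for center in sorted(neighbors):
        for i, k in combinations(sorted(neighbors[center]), 2):
            angles.append((i, center, k))

    # A dihedral extends a central bond (j, k) by one neighbor on each side.
    dihedrals = []
    for j, k in bonds:
        for i in sorted(neighbors[j] - {k}):
            for l in sorted(neighbors[k] - {j, i}):
                dihedrals.append((i, j, k, l))
    return angles, dihedrals


# A linear four-atom chain: 0-1-2-3
angles, dihedrals = guess_angles_and_dihedrals_sketch([(0, 1), (1, 2), (2, 3)])
```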
- moleculekit.util.maxDistance(mol, sel='all', origin=None)#
Calculates the max distance of a set of atoms from an origin
- Parameters
- Returns
maxd – The maximum distance in Angstrom
- Return type
Example
>>> y = maxDistance(mol, sel='protein', origin=[0, 0, 0])
>>> print(round(y,2))
48.39
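The quantity computed here is the largest Euclidean distance between any selected atom and the origin. A minimal NumPy sketch, assuming the coordinates are given as an (N, 3) array and that a missing origin means (0, 0, 0); `max_distance_sketch` is a hypothetical helper, not the library function.

```python
import numpy as np


def max_distance_sketch(coords, origin=None):
    """Largest distance of any row of coords from origin."""
    coords = np.asarray(coords, dtype=float)
    # Assumption: origin=None is treated as the coordinate origin.
    origin = np.zeros(3) if origin is None else np.asarray(origin, dtype=float)
    return float(np.linalg.norm(coords - origin, axis=1).max())


coords = np.array([[3.0, 4.0, 0.0], [1.0, 0.0, 0.0]])
maxd = max_distance_sketch(coords)
```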
- moleculekit.util.molRMSD(mol, refmol, rmsdsel1, rmsdsel2)#
Calculates the RMSD between two Molecules
- moleculekit.util.molTMscore(mol, ref, molsel='protein', refsel='protein')#
Calculates the TMscore between two protein Molecules
- Parameters
mol (Molecule object) – A Molecule containing a single or multiple frames
ref (Molecule object) – A reference Molecule containing a single frame. Will automatically keep only ref.frame.
molsel (str) – Atomselect string for which atoms of mol to calculate TMScore
refsel (str) – Atomselect string for which atoms of ref to calculate TMScore
- Returns
tmscore (numpy.ndarray) – TM score (if normalized by length of ref)
rmsd (numpy.ndarray) – RMSD over the common residues only, for all frames. This is not the same as a full-protein RMSD.
nali (numpy.ndarray) – Number of aligned residues
Examples
tmscore, rmsd, nali = molTMscore(mol, ref)
- moleculekit.util.natsorted(items)#
- moleculekit.util.opm(pdb, keep=False, keepaltloc='A')#
Download a molecule from the OPM.
- Parameters
- Returns
mol (Molecule) – The oriented molecule
thickness (float or None) – The bilayer thickness (both layers)
Examples
>>> mol, thickness = opm("1z98")
>>> mol.numAtoms
7902
>>> thickness
28.2
>>> _, thickness = opm('4u15')
>>> thickness is None
True
- moleculekit.util.orientOnAxes(mol, sel='all')#
Rotate a molecule so that its main axes are oriented along XYZ.
The calculation is based on the axes of inertia of the given selection, but masses will be ignored. After the operation, the main axis will be parallel to the Z axis, followed by Y and X (the shortest axis). Only the first frame is oriented. The reoriented molecule is returned.
- Parameters
Examples
>>> mol = Molecule("1kdx")
>>> mol = orientOnAxes(mol,"chain B")
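The orientation described above can be approximated with an unweighted principal-axis analysis of the coordinates. The sketch below is an illustration only: it assumes an (N, 3) coordinate array, centers on the unweighted centroid (an assumption), and uses the coordinate covariance as a stand-in for the mass-free inertia axes. `orient_on_axes_sketch` is a hypothetical name.

```python
import numpy as np


def orient_on_axes_sketch(coords):
    """Rotate coords so the longest principal axis lies along Z, the shortest along X."""
    coords = np.asarray(coords, dtype=float)
    centered = coords - coords.mean(axis=0)  # assumption: center on the unweighted centroid
    cov = np.cov(centered.T)  # unweighted, so masses are ignored
    # eigh returns eigenvalues in ascending order, so the eigenvector
    # columns map to X (smallest spread), Y and Z (largest spread).
    _, eigvecs = np.linalg.eigh(cov)
    if np.linalg.det(eigvecs) < 0:
        eigvecs[:, 0] *= -1  # keep a proper rotation, not a mirror image
    return centered @ eigvecs


rng = np.random.default_rng(0)
coords = rng.normal(size=(200, 3)) * np.array([5.0, 1.0, 2.0])
oriented = orient_on_axes_sketch(coords)
spread = oriented.std(axis=0)
```

After the rotation, the spread of the points is smallest along X and largest along Z, and interatomic distances are unchanged.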
- moleculekit.util.rotationMatrix(axis, theta)#
Produces a rotation matrix given an axis and an angle in radians
Return the rotation matrix associated with counterclockwise rotation about the given axis by theta radians.
- Parameters
- Returns
M – The rotation matrix.
- Return type
numpy.ndarray
Examples
>>> M = rotationMatrix([0, 0, 1], 1.5708)
>>> M.round(4)
array([[-0., -1.,  0.],
       [ 1., -0.,  0.],
       [ 0.,  0.,  1.]])

>>> axis = [4.0, 4., 1.]
>>> theta = 1.2
>>> v = [3.0, 5., 0.]
>>> np.dot(rotationMatrix(axis, theta), v).round(2)
array([ 2.75,  4.77,  1.92])
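A counterclockwise rotation about an arbitrary axis can be written with Rodrigues' formula, R = I + sin(θ) K + (1 − cos(θ)) K², where K is the cross-product matrix of the unit axis. The sketch below uses that formula; `rotation_matrix_sketch` is a hypothetical helper and is not claimed to be how moleculekit computes the matrix, although it gives the same values as the examples above.

```python
import numpy as np


def rotation_matrix_sketch(axis, theta):
    """Counterclockwise rotation by theta radians about axis (Rodrigues' formula)."""
    axis = np.asarray(axis, dtype=float)
    k = axis / np.linalg.norm(axis)  # unit rotation axis
    # Cross-product matrix: K @ v equals np.cross(k, v)
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)


M = rotation_matrix_sketch([0, 0, 1], np.pi / 2)
v_rot = rotation_matrix_sketch([4.0, 4.0, 1.0], 1.2) @ np.array([3.0, 5.0, 0.0])
```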
- moleculekit.util.sequenceID(field, prepend=None, step=1)#
Array of integers which increments at value change of another array
- Parameters
field (np.ndarray or tuple) – An array of values. Once a change in value happens, a new ID will be created in seq. If a tuple of ndarrays is passed, a change in any of them will cause an increase in seq.
prepend (str) – A string to prepend to the incremental sequence
step (int) – The step size for incrementing the ID
- Returns
seq – An array of equal size to field containing integers which increment every time there is a change in field
- Return type
np.ndarray
Examples
>>> # A change in resid, insertion, chain or segid will cause an increase in the sequence
>>> sequenceID((mol.resid, mol.insertion, mol.chain, mol.segid))
array([  1,   1,   1, ..., 285, 286, 287])
>>> # it is typically used to renumber resids as follows
>>> mol.set('resid', sequenceID((mol.resid, mol.insertion, mol.chain, mol.segid)))
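The "increment on change" behaviour can be reproduced with a cumulative sum over a boolean change mask. This is a minimal sketch, not the library code: `sequence_id_sketch` is a hypothetical name, the prepend option is omitted, and starting the sequence at 0 is an assumption.

```python
import numpy as np


def sequence_id_sketch(field, step=1):
    """Integer IDs that grow by step whenever any of the input arrays changes value."""
    fields = field if isinstance(field, tuple) else (field,)
    fields = [np.asarray(f) for f in fields]
    changed = np.zeros(len(fields[0]), dtype=bool)
    for f in fields:
        # Element i is flagged if it differs from element i - 1 in any array
        changed[1:] |= f[1:] != f[:-1]
    return np.cumsum(changed) * step


resid = np.array([10, 10, 11, 11, 11, 11])
chain = np.array(["A", "A", "A", "A", "B", "B"])
seq = sequence_id_sketch((resid, chain))
```

Here the ID increases once when resid changes from 10 to 11 and once more when chain changes from A to B, even though resid stays at 11.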
- moleculekit.util.string_to_tempfile(content, ext)#
- moleculekit.util.tempname(suffix='', create=False)#
- moleculekit.util.uniformRandomRotation()#
Return a uniformly distributed rotation 3 x 3 matrix
The initial description of the calculation can be found in section 5 of “How to generate random matrices from the classical compact groups” by Mezzadri (PDF: https://arxiv.org/pdf/math-ph/0609050.pdf; arXiv:math-ph/0609050; and Notices of the AMS, Vol. 54 (2007), 592-604). Sample code is provided in that section as the haar_measure function.
Apparently this code can randomly provide flipped molecules (chirality-wise), so a fix found in https://github.com/tmadl/sklearn-random-rotation-ensembles/blob/5346f29855eb87241e616f6599f360eba12437dc/randomrotation.py was applied.
- Returns
M – A uniformly distributed rotation 3 x 3 matrix
- Return type
np.ndarray
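The QR-based construction referenced above can be sketched for the real 3 x 3 case as follows. This is an illustration under stated assumptions: `uniform_random_rotation_sketch` is a hypothetical name, and flipping one column when the determinant is negative is one possible way to avoid mirror images; it is not necessarily the fix used in the linked repository or in moleculekit.

```python
import numpy as np


def uniform_random_rotation_sketch(rng=None):
    """Draw a random 3 x 3 rotation matrix via QR decomposition of a Gaussian matrix."""
    rng = np.random.default_rng() if rng is None else rng
    z = rng.normal(size=(3, 3))
    q, r = np.linalg.qr(z)
    # Scale each column of q by the sign of the matching diagonal entry of r,
    # which removes the sign ambiguity of the QR decomposition.
    d = np.diag(r)
    q = q * (d / np.abs(d))
    # Assumed chirality fix: a determinant of -1 is a reflection, so flip one column.
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1
    return q


M = uniform_random_rotation_sketch(np.random.default_rng(0))
```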
- moleculekit.util.wait_for_port(port, host='127.0.0.1', timeout=240.0, _logger=False)#
Wait until a port starts accepting TCP connections.
- Parameters
port (int) – Port number.
host (str) – Host address on which the port should exist.
timeout (float) – In seconds. How long to wait before raising errors.
- Raises
TimeoutError – The port isn’t accepting connections after the time specified in timeout.
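The behaviour described (poll until a TCP connection succeeds, raise TimeoutError otherwise) can be sketched with the standard library alone. `wait_for_port_sketch`, the 1-second connection timeout and the 0.1-second polling interval are assumptions made for illustration, not details of the moleculekit function.

```python
import socket
import time


def wait_for_port_sketch(port, host="127.0.0.1", timeout=5.0):
    """Block until host:port accepts a TCP connection, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening; close it right away.
            with socket.create_connection((host, port), timeout=1.0):
                return
        except OSError:
            if time.monotonic() >= deadline:
                raise TimeoutError(
                    f"Port {port} on {host} is not accepting connections after {timeout} s"
                )
            time.sleep(0.1)  # assumed polling interval


# Usage: listen on an ephemeral local port, then wait for it
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]
wait_for_port_sketch(port, timeout=5.0)
server.close()
```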
- moleculekit.util.writeVoxels(arr, filename, vecMin, vecRes)#
Writes grid free energy to cube file
- Parameters
arr (np.ndarray) – 3D array with volumetric data.
filename (str) – string with the filename of the cubefile
vecMin (np.ndarray) – 3D vector denoting the minimal corner of the grid
vecRes (np.ndarray) – 3D vector denoting the resolution of the grid in each dimension