moleculekit.smallmol.smallmol module
- class moleculekit.smallmol.smallmol.SmallMol(mol, ignore_errors=False, force_reading=False, fixHs=True, removeHs=False, verbose=True, sanitize=True, _logger=True, **kwargs)
Bases: object
Class to manipulate small molecule structures
- Parameters:
mol (rdkit.Chem.rdchem.Mol or filename or smile or moleculekit.smallmol.smallmol.SmallMol) – One of: (i) an RDKit molecule, (ii) the location of a molecule file (“.pdb”/“.mol2”), (iii) a SMILES string, (iv) another SmallMol object, or (v) a moleculekit.molecule.Molecule object
ignore_errors (bool) – If True, errors will not be raised.
force_reading (bool) – If True, and the mol provided is not accepted, the molecule will first be converted into SDF
fixHs (bool) – If True, the missing hydrogens are assigned, and the others are correctly assigned into the graph of the molecule
removeHs (bool) – If True, remove the hydrogens
verbose (bool) – If True, additional information is logged during initialization.
sanitize (bool) – If True, the molecule is sanitized after reading.
Examples
>>> import os
>>> from moleculekit.smallmol.smallmol import SmallMol
>>> SmallMol('CCO')
>>> SmallMol('ligand.pdb', fixHs=False, removeHs=True)
>>> sm = SmallMol('benzamidine.mol2')
>>> print(sm)
SmallMol with 18 atoms and 1 conformers
Atom field - bondtype
Atom field - charge
...
- addHs(addCoords=True)
Adds explicit hydrogen atoms to the molecule in place.
- Parameters:
addCoords (bool) – If True, 3D coordinates are also generated for the added hydrogens. Default: True
- align(refmol)
Aligns the molecule in place onto a reference molecule using an Open3DAlign overlay.
The molecule’s coordinates are modified so that it is superimposed onto the reference and the resulting RMSD is logged.
- Parameters:
refmol (SmallMol or rdkit.Chem.rdchem.Mol or moleculekit.molecule.Molecule) – The reference molecule to align this molecule onto
- assignStereoChemistry(from3D=True)
Assigns stereochemistry to the molecule in place.
- Parameters:
from3D (bool) – If True, the stereochemistry is derived from the 3D conformer coordinates. If False, it is assigned from the molecular graph, recomputing and overwriting any existing stereo information. Default: True
- containsMetals(metalSMARTS='[Mg,Ca,Zn,As,Mn,Al,Pd,Pt,Co,Ba,Cr,Cu,Ni,Ag,Fe,Hg,Cd,Gd,Na]')
Returns True if the molecule contains metals
- copy()
Create a copy of the molecule object
- Returns:
newsmallmol – A copy of the object
- Return type:
SmallMol
- depict(sketch=True, filename=None, ipython=False, optimize=False, optimizemode='std', removeHs=True, atomlabels=None, highlightAtoms=None, resolution=(400, 200))
Depicts the molecule. The depiction can be saved to an SVG file and can also be rendered in a Jupyter notebook.
- Parameters:
sketch (bool) – Set to True for 2D depiction
ipython (bool) – Set to True to return the Jupyter notebook rendering
optimize (bool) – Set to True to optimize the conformation. Works only with 3D.
optimizemode (str) – Set the optimization mode for 3D conformation
removeHs (bool) – Set to True to hide hydrogens in the depiction
atomlabels (str|None) – Accepts any combination of the following parameters as a single string ‘%a%i%c%*’. a: atom name, i: atom index, c: atom formal charge (+/-), *: chiral (* if atom is chiral)
highlightAtoms (list|None) – List of atoms to highlight. It can also be a list of atom lists, in which case different colors will be used
resolution (tuple) – Resolution in pixels: (X, Y)
- Returns:
ipython_svg – An SVG rendering object if ipython is True, otherwise None
- Return type:
IPython.display.SVG or None
Example
>>> import numpy as np
>>> sm.depict(ipython=True, optimize=True, optimizemode='std')
>>> sm.depict(ipython=True, sketch=True)
>>> sm.depict(ipython=True, sketch=True, atomlabels="%a%i%c")
>>> ids = np.intersect1d(sm.get('idx', 'hybridization SP2'), sm.get('idx', 'element C'))
>>> sm.depict(ipython=True, sketch=True, highlightAtoms=ids.tolist(), removeHs=False)
- dropFrames(frames='all')
Removes conformers (frames) from the molecule in place.
- Parameters:
frames (str|int|list|ndarray) – The frame indices to remove. Use "all" to remove every conformer, an integer to remove a single frame, or a list/array of indices to remove several. Default: “all”
- Raises:
RuntimeError – If any requested frame index is greater than or equal to the number of conformers
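The accepted forms of frames can be modeled with a small standalone helper. This is only a sketch of the rules documented above, not moleculekit's implementation: normalize_frames and its num_frames argument are hypothetical names, and the ValueError for an unknown string is an assumption.

```python
def normalize_frames(frames, num_frames):
    """Return the frame indices a dropFrames-style call would remove.

    Hypothetical helper that mirrors the documented rules only.
    """
    if isinstance(frames, str):
        if frames != "all":
            # Assumption: any string other than "all" is rejected.
            raise ValueError(f"unsupported frames string: {frames!r}")
        return list(range(num_frames))
    if isinstance(frames, int):
        indices = [frames]
    else:
        # A list or array of indices.
        indices = [int(f) for f in frames]
    # Documented error case: an index >= the number of conformers.
    if any(i >= num_frames for i in indices):
        raise RuntimeError(
            f"frame index out of range for {num_frames} conformers"
        )
    return indices
```

With three conformers, "all" expands to every index, while an index of 3 triggers the documented RuntimeError.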
- filter(sel)
Not implemented.
- Parameters:
sel (str|ndarray) – Atom selection (string, boolean mask, or integer index array).
- Raises:
NotImplementedError – Always, since filtering atoms is not supported.
- foundBondBetween(sel1, sel2, bondtype=None)
Checks whether at least one bond exists between the two atom selections.
It is possible to restrict the check to a specific bond type. If one or more matching bonds are found, a tuple (True, details) is returned where details describes each bond. If no matching bond is found, the bare value False is returned.
- Parameters:
- Returns:
result – If a bond was found, a tuple (True, details) where details is a list of lists, each holding the (idx1, idx2) atom indices of the bond and its bond type as a string. If no bond was found, the bare boolean False.
- Return type:
tuple or bool
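Because the result is either a tuple or the bare value False, callers cannot unpack it unconditionally. A minimal sketch of a wrapper that normalizes both shapes follows; unpack_bond_result is a hypothetical helper, not part of moleculekit.

```python
def unpack_bond_result(result):
    """Normalize a foundBondBetween-style result to (found, details).

    Hypothetical helper: `result` is either the bare value False or a
    tuple (True, details), as described above.
    """
    if result is False:
        # No bond found: return an empty details list for uniform handling.
        return False, []
    found, details = result
    return bool(found), list(details)


# Hypothetical usage with a SmallMol instance `sm`, assuming the two
# selections use the same string syntax as get():
# found, details = unpack_bond_result(
#     sm.foundBondBetween("element C", "element N")
# )
```

Returning an empty list in the no-bond case lets the caller iterate over details without a separate branch.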
- property frame: int
The currently active conformer (frame) index.
- Returns:
frame – The index of the active conformer
- Return type:
int
- Raises:
RuntimeError – If the stored frame index is out of the range of available conformers
- generateConformers(num_confs=400, optimizemode='mmff', align=True, append=True, pruneRmsThresh=0.5, maxAttempts=10000, seed=None, numThreads=1, useRandomCoords=True)
Generates ligand conformers
- Parameters:
num_confs (int) – Number of conformers to generate.
optimizemode (str) – The optimization mode to use. Can be ‘uff’ or ‘mmff’
align (bool) – If True, the conformers are aligned to the first one
append (bool) – If False, the current conformers are deleted
pruneRmsThresh (float) – The RMSD threshold for pruning conformers
maxAttempts (int) – The maximum number of attempts to generate conformers
seed (int|None) – The seed for the random number generator
numThreads (int) – The number of threads to use when embedding multiple conformations
useRandomCoords (bool) – Start the embedding from random coordinates instead of using eigenvalues of the distance matrix
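A minimal usage sketch that replaces the existing conformers with a reproducible set. The function regenerate_conformers is a hypothetical wrapper, and sm is assumed to be a SmallMol instance; only keyword arguments from the signature above are used.

```python
def regenerate_conformers(sm, num_confs=20, seed=42):
    """Replace the conformers of `sm` with a freshly generated set.

    Hypothetical wrapper; `sm` is assumed to be a SmallMol instance.
    Returns the number of conformers held by the molecule afterwards.
    """
    sm.generateConformers(
        num_confs=num_confs,
        optimizemode="mmff",
        align=True,
        append=False,        # discard the current conformers first
        pruneRmsThresh=0.5,  # RMSD threshold used for pruning
        seed=seed,           # fixed seed for repeatable runs
    )
    return sm.numFrames


# Hypothetical usage:
# from moleculekit.smallmol.smallmol import SmallMol
# sm = SmallMol("CCO")
# print(regenerate_conformers(sm, num_confs=10))
```

The returned count may be lower than num_confs if pruning by pruneRmsThresh discards near-duplicate conformers.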
- get(returnField, sel='all', convertType=True, invert=False)
Returns the requested property for the atoms matched by the selection. The selection is itself expressed in terms of an atom property.
- Parameters:
returnField (str) – The field of the atom to return
sel (str) – The selection string. It is an atom field name followed by one or more space-separated values to match for that field, for example "idx 0 1 7" or "element N". Atoms whose value for that field equals any of the given values are selected. Use "all" to select every atom.
convertType (bool) – If True, the returnField values are converted to RDKit objects where possible. Default: True
invert (bool) – If True, the selection is inverted. Default: False
- Returns:
values – The array of values for the property
- Return type:
np.array
Example
>>> sm.get('element', 'idx 0 1 7')
array(['C', 'C', 'H'], dtype='<U1')
>>> sm.get('hybridization', 'element N')
array([rdkit.Chem.rdchem.HybridizationType.SP2, rdkit.Chem.rdchem.HybridizationType.SP2], dtype=object)
>>> sm.get('hybridization', 'element N', convertType=False)
array([3, 3])
>>> sm.get('element', 'hybridization sp2')
array(['C', 'C', 'C', 'C', 'C', 'C', 'C', 'N', 'N'], dtype='<U1')
>>> sm.get('element', 'hybridization S')
array(['H', 'H', 'H', 'H', 'H', 'H', 'H', 'H', 'H'], dtype='<U1')
>>> sm.get('element', 'hybridization 1')
array(['H', 'H', 'H', 'H', 'H', 'H', 'H', 'H', 'H'], dtype='<U1')
>>> sm.get('atomobject', 'element N')
array([<rdkit.Chem.rdchem.Atom object at 0x7faf616dd120>, <rdkit.Chem.rdchem.Atom object at 0x7faf616dd170>], dtype=object)
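The selection syntax (a field name followed by values, or "all") can be illustrated with a toy model over plain dictionaries. This is a sketch of the documented matching rule only: parse_selection and select are hypothetical names, and the sketch compares values as exact strings, so it does not reproduce conveniences seen in the examples above, such as matching 'sp2' or the numeric '1' for hybridization.

```python
def parse_selection(sel):
    """Split a selection string into (field, values); "all" gives (None, [])."""
    tokens = sel.split()
    if tokens == ["all"]:
        return None, []
    if len(tokens) < 2:
        raise ValueError(f"expected '<field> <value> ...', got {sel!r}")
    return tokens[0], tokens[1:]


def select(atoms, sel, invert=False):
    """Return the indices of the atoms matched by the selection.

    `atoms` is a list of dicts mapping field names to values
    (a stand-in for the per-atom fields of a molecule).
    """
    field, values = parse_selection(sel)
    matches = []
    for i, atom in enumerate(atoms):
        # An atom is hit if its field equals any of the listed values.
        hit = field is None or str(atom[field]) in values
        if hit != invert:
            matches.append(i)
    return matches


atoms = [
    {"idx": 0, "element": "C"},
    {"idx": 1, "element": "N"},
    {"idx": 2, "element": "H"},
]
print(select(atoms, "element N"))
print(select(atoms, "idx 0 2"))
print(select(atoms, "element C N", invert=True))
```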
- getAtoms()
Returns an array with the rdkit.Chem.rdchem.Atom objects present in the molecule.
- Returns:
atoms – An object array of the rdkit.Chem.rdchem.Atom objects of the molecule
- Return type:
np.array
- getCenter()
Returns the geometrical center of the molecule for the currently active conformation.
- Returns:
center – The (x, y, z) coordinates of the geometrical center
- Return type:
- getDescriptors(prefix='', ignore=('Ipc',))
Calculate descriptors for the molecule
Returns rdkit descriptors for the molecule, like DESC_NumRotatableBonds or DESC_MolLogP. See rdkit.Chem.Descriptors for more.
- getFingerprint(mode, radius=2, num_bits=1024)
Computes a single molecular fingerprint of the requested type.
- Parameters:
mode (str) – The fingerprint type to compute. One of ‘Morgan’, ‘MACCS’, ‘AvalonCount’.
radius (int) – Radius to define a local environment. Only used for the ‘Morgan’ fingerprint.
num_bits (int) – The number of bits to use in the fingerprint. Larger avoids collisions. Used for the ‘Morgan’ and ‘AvalonCount’ fingerprints.
- Returns:
fingerprint – The computed fingerprint for the chosen mode: a hashed Morgan count fingerprint for ‘Morgan’, a MACCS keys bit vector for ‘MACCS’, or an Avalon count fingerprint for ‘AvalonCount’.
- Return type:
rdkit fingerprint object
- Raises:
RuntimeError – If mode is not one of the supported fingerprint types
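A short usage sketch that collects one fingerprint per supported mode. fingerprints_by_mode is a hypothetical wrapper and sm is assumed to be a SmallMol instance; the mode names and keyword arguments are those listed above.

```python
def fingerprints_by_mode(
    sm, modes=("Morgan", "MACCS", "AvalonCount"), radius=2, num_bits=1024
):
    """Compute one fingerprint per mode for `sm`.

    Hypothetical wrapper; `sm` is assumed to be a SmallMol instance.
    An unsupported mode raises RuntimeError, as documented above.
    """
    return {
        mode: sm.getFingerprint(mode, radius=radius, num_bits=num_bits)
        for mode in modes
    }


# Hypothetical usage:
# from moleculekit.smallmol.smallmol import SmallMol
# fps = fingerprints_by_mode(SmallMol("CCO"), num_bits=2048)
```

Since radius applies only to ‘Morgan’ and num_bits only to ‘Morgan’ and ‘AvalonCount’, passing both for every mode relies on the unused values being ignored, which is an assumption here.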
- getProp(prop_name)
Returns a given property of the molecule.
- getTautomers(canonical=True, genConformers=False, returnScores=True, maxTautomers=200, filterTauts=None)
Enumerates the tautomers of the molecule.
- Parameters:
canonical (bool) – If True, only the single canonical tautomer is returned. If False, all enumerated tautomers are returned. Default: True
genConformers (bool) – If True, a conformer is generated for each returned tautomer. Default: False
returnScores (bool) – If True, the tautomer scores are also returned alongside the tautomers. Default: True
maxTautomers (int) – The maximum number of tautomers to enumerate. Default: 200
filterTauts (float|None) – If not None, only tautomers whose score is within this value of the maximum score are kept. Default: None
- Returns:
tautomers (SmallMolLib) – A library containing the enumerated tautomers
scores (list) – The scores of the returned tautomers. Only returned if returnScores is True.
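The filterTauts rule (keep tautomers whose score is within a given value of the maximum score) can be sketched on a plain list of scores. filter_by_score is a hypothetical helper, and treating the bound as inclusive is an assumption; the source does not say whether the comparison is strict.

```python
def filter_by_score(scores, threshold):
    """Return the indices of scores within `threshold` of the maximum.

    Hypothetical helper illustrating the documented filterTauts rule.
    Assumption: the bound is inclusive (best - score <= threshold).
    """
    if not scores:
        return []
    best = max(scores)
    return [i for i, s in enumerate(scores) if best - s <= threshold]


print(filter_by_score([10, 8, 5], 2))  # keeps the two highest scores
```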
- isChiral(returnDetails=False)
Returns True if the molecule has at least one chiral atom. If returnDetails is set to True, a list of tuples with the atom idx and chiral type is also returned.
- Parameters:
returnDetails (bool) – If True, returns the chiral atoms and their chiral types. Default: False
- Returns:
Example
>>> chiralmol.isChiral()
True
>>> chiralmol.isChiral(returnDetails=True)
(True, [('C2', 'R')])
- property ligname: str
The ligand name of the molecule.
Returns the value of the molecule’s _Name property, or "UNK" if it is not set.
- Returns:
ligname – The ligand name
- Return type:
str
- property numAtoms: int
The number of atoms in the molecule.
- Returns:
numatoms – The number of atoms
- Return type:
int
- property numFrames: int
The number of conformers (frames) of the molecule.
- Returns:
numframes – The number of conformers
- Return type:
int
- removeHs()
Removes explicit hydrogen atoms from the molecule in place.
- sanitize()
Sanitizes the molecule in place using rdkit.
This cleans up the molecule, computing properties such as valences, ring information and aromaticity.
- setProp(key, value)
Sets a property on the molecule.
The value is stored as a string on the underlying molecule.
- Parameters:
key (str) – The name of the property to set
value – The value to store. It is converted to a string before being stored.
- stripSalts()
Removes any salts from the molecule
- toMolecule(ids=None)
Returns the molecule as a moleculekit.molecule.Molecule object
- toSMARTS(explicitHs=False)
Returns the SMARTS string of the molecule
- toSMILES(explicitHs=False, kekulizeSmile=True)
Returns the SMILES string of the molecule
- view(*args, **kwargs)
Visualizes the molecule.
The molecule is converted to a moleculekit.molecule.Molecule and all arguments are forwarded to its view method.
- write(fname, frames=None, merge=True)
Writes the molecule to a file.
The output format is determined by the file extension. For .sdf files the molecule is written with rdkit; other formats are written by first converting to a moleculekit.molecule.Molecule.
- Parameters:
fname (str) – The output file name. The extension determines the file format.
frames (list|None) – The conformer indices to write. If None, all conformers are written. Default: None
merge (bool) – Only used for .sdf output. If True, all conformers are written to a single file. If False, one file is written per conformer with the frame index appended to the file name. Default: True
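A minimal usage sketch that writes all conformers to one SDF file and the first conformer to another. export_conformers and the output file names are hypothetical, and sm is assumed to be a SmallMol instance; only the parameters documented above are used.

```python
def export_conformers(sm, basename):
    """Write all conformers and the first conformer of `sm` as SDF files.

    Hypothetical wrapper; `sm` is assumed to be a SmallMol instance.
    Returns the two file names that were requested.
    """
    all_path = f"{basename}_all.sdf"
    first_path = f"{basename}_first.sdf"
    # frames=None (the default) writes every conformer; merge=True keeps
    # them together in a single SDF file.
    sm.write(all_path, merge=True)
    # Restrict the output to conformer index 0.
    sm.write(first_path, frames=[0], merge=True)
    return all_path, first_path


# Hypothetical usage:
# from moleculekit.smallmol.smallmol import SmallMol
# sm = SmallMol("benzamidine.mol2")
# export_conformers(sm, "benzamidine")
```

The sketch avoids merge=False because the exact per-conformer file naming is not spelled out above.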