moleculekit.tools.graphalignment module
- moleculekit.tools.graphalignment.compareGraphs(G, H, fields=('element',), tolerance=0.5, returnmatching=False)
Computes a similarity score between two molecular graphs.
The score is based on the size of the maximum common substructure found via the maximal clique of the product graph of G and H, normalized by the size of the larger graph. The algorithm is based on “Chemoisosterism in the Proteome”, X. Jalencas, J. Mestres, JCIM 2013.
- Parameters:
G (networkx.Graph) – The first molecular graph.
H (networkx.Graph) – The second molecular graph.
fields (tuple) – A tuple of the node fields that are used to match atoms between the two graphs.
tolerance (float) – How different distances between two atom pairs can be for them to match in the product graph.
returnmatching (bool) – If True, also returns the size of the maximum common substructure and the matching node pairs.
- Returns:
score (float) – The similarity score between the two graphs, between 0 and 1.
cliquesize (int) – The number of matched nodes in the maximum common substructure. Only returned if returnmatching is True.
matching (list) – A list of (node_G, node_H) pairs describing the matched nodes. Only returned if returnmatching is True.
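The scoring described above can be sketched in plain Python. This is a minimal illustration, not the moleculekit implementation: the dictionary-based graph representation, the helper names (product_graph, max_clique, similarity), and the assumption that each graph stores one distance per atom pair are all hypothetical choices made for this example.

```python
from itertools import combinations


def product_graph(nodes_g, dist_g, nodes_h, dist_h, tolerance=0.5):
    """Build the product graph as an adjacency dict (hypothetical helper).

    nodes_*: dict mapping node id -> element symbol.
    dist_*:  dict mapping frozenset({a, b}) -> distance between a and b.
    """
    # One product node per (g, h) pair whose elements match.
    pairs = [(g, h) for g in nodes_g for h in nodes_h if nodes_g[g] == nodes_h[h]]
    adj = {p: set() for p in pairs}
    for (g1, h1), (g2, h2) in combinations(pairs, 2):
        if g1 == g2 or h1 == h2:
            continue  # a node cannot be matched twice
        dg = dist_g.get(frozenset((g1, g2)))
        dh = dist_h.get(frozenset((h1, h2)))
        if dg is None or dh is None:
            continue
        # Connect the two pairs only if the distances agree within tolerance.
        if abs(dg - dh) < tolerance:
            adj[(g1, h1)].add((g2, h2))
            adj[(g2, h2)].add((g1, h1))
    return adj


def max_clique(adj):
    """Return one maximum clique using basic Bron-Kerbosch enumeration."""
    best = []

    def expand(clique, candidates, excluded):
        nonlocal best
        if not candidates and not excluded:
            if len(clique) > len(best):
                best = list(clique)
            return
        for v in list(candidates):
            expand(clique + [v], candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand([], set(adj), set())
    return best


def similarity(nodes_g, dist_g, nodes_h, dist_h, tolerance=0.5):
    """Clique size normalized by the size of the larger graph."""
    matching = max_clique(product_graph(nodes_g, dist_g, nodes_h, dist_h, tolerance))
    larger = max(len(nodes_g), len(nodes_h))
    score = len(matching) / larger if larger else 0.0
    return score, len(matching), matching


# Toy input: a C-C-O chain compared against a C-O fragment.
nodes_g = {0: "C", 1: "C", 2: "O"}
dist_g = {frozenset((0, 1)): 1.0, frozenset((1, 2)): 1.0, frozenset((0, 2)): 2.0}
nodes_h = {0: "C", 1: "O"}
dist_h = {frozenset((0, 1)): 1.0}

score, cliquesize, matching = similarity(nodes_g, dist_g, nodes_h, dist_h)
```

In this toy case only the carbon adjacent to the oxygen can be paired consistently, so two of the three atoms of the larger graph are matched.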
- moleculekit.tools.graphalignment.createProductGraph(G, H, tolerance, fields)
- moleculekit.tools.graphalignment.makeMolGraph(mol, sel, fields)
- moleculekit.tools.graphalignment.maximalSubstructureAlignment(mol1, mol2, sel1='all', sel2='all', fields=('element',), tolerance=0.5, visualize=False)
Aligns two molecules on their largest common substructure.
- Parameters:
mol1 (Molecule) – The reference molecule to align on.
mol2 (Molecule) – The second molecule, which will be rotated and translated to align on mol1.
sel1 (str|ndarray) – Atom selection of the atoms of mol1 to align. Can be a selection string, a boolean mask, or an integer index array.
sel2 (str|ndarray) – Atom selection of the atoms of mol2 to align. Can be a selection string, a boolean mask, or an integer index array.
fields (tuple) – A tuple of the fields that are used to match atoms.
tolerance (float) – How much the distances between two atom pairs can differ for the pairs to still match in the product graph.
visualize (bool) – If True, visualizes the alignment.
- Returns:
newmol – A copy of mol2 aligned on mol1
- Return type:
Molecule
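Once a set of matched atom pairs is known, rotating and translating one molecule onto the other is a rigid superposition problem. The sketch below uses the standard Kabsch algorithm with NumPy. It is a generic illustration under stated assumptions, not necessarily the procedure this function uses internally; the function names kabsch_transform and apply_transform are hypothetical.

```python
import numpy as np


def kabsch_transform(ref_xyz, mov_xyz):
    """Return (R, t) that best maps mov_xyz onto ref_xyz in the least-squares sense.

    Both inputs are (N, 3) arrays whose rows are matched atom pairs.
    """
    ref = np.asarray(ref_xyz, dtype=float)
    mov = np.asarray(mov_xyz, dtype=float)
    ref_center = ref.mean(axis=0)
    mov_center = mov.mean(axis=0)
    # Covariance between the centered moving and reference coordinates.
    cov = (mov - mov_center).T @ (ref - ref_center)
    u, _, vt = np.linalg.svd(cov)
    # Flip the last axis if needed so the result is a proper rotation.
    sign = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, sign]) @ u.T
    trans = ref_center - rot @ mov_center
    return rot, trans


def apply_transform(rot, trans, xyz):
    """Rotate and translate an (N, 3) coordinate array."""
    return (rot @ np.asarray(xyz, dtype=float).T).T + trans


# Toy check: rotate and shift four points, then recover the original placement.
ref = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.0, 1.2, 0.0], [0.0, 0.0, 0.9]])
angle = np.pi / 2
rot_z = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle), np.cos(angle), 0.0],
    [0.0, 0.0, 1.0],
])
mov = (rot_z @ ref.T).T + np.array([3.0, -2.0, 1.0])

rot, trans = kabsch_transform(ref, mov)
aligned = apply_transform(rot, trans, mov)
```

In practice the transform would be fitted on the matched atoms only and then applied to every atom of the second molecule.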
- moleculekit.tools.graphalignment.mcsAtomMatching(mol1, mol2, atomCompare='elements', bondCompare='any', _logger=True)
Maximum common substructure atom matching.
Given two molecules, finds their maximum common substructure using RDKit and returns the atoms in each molecule that matched.
- Parameters:
- Returns:
Examples
>>> mol1 = Molecule("OIC.cif")
>>> mol1.atomtype = mol1.element
>>> mol2 = Molecule("5vbl")
>>> mol2.filter("resname OIC")
>>> atm1, atm2 = mcsAtomMatching(mol1, mol2, bondCompare="any")
>>> print(mol1.name[atm1], mol2.name[atm2])
['N' 'CA' 'C' 'O' 'CB' 'CG' 'CD' 'C7' 'C6' 'C5' 'C4'] ['N' 'CA' 'C' 'O' 'CB' 'CG' 'CD' 'C7' 'C6' 'C5' 'C4']