moleculekit.smallmol.tools.clustering module
- moleculekit.smallmol.tools.clustering.DiceDistances(fp1, fps)
Returns the row of Dice distances between the fingerprint fp1 and the fingerprints in fps.
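To make the idea of a "Dice row" concrete, here is a minimal pure-Python sketch. It assumes each fingerprint is a set of on-bit positions and that the distance is 1 minus the Dice similarity; the function name `dice_distance_row` is hypothetical, and this is not the library's implementation.

```python
def dice_distance_row(fp1, fps):
    """Return 1 - Dice similarity between fp1 and every fingerprint in fps.

    Assumption: fingerprints are Python sets of on-bit positions.
    """
    row = []
    for fp in fps:
        total = len(fp1) + len(fp)
        # Dice similarity: 2 * |A & B| / (|A| + |B|); defined as 0 for two empty sets
        similarity = 2 * len(fp1 & fp) / total if total else 0.0
        row.append(1.0 - similarity)
    return row


query = {1, 2, 3, 4}
library = [{1, 2, 3, 4}, {3, 4, 5, 6}, {7, 8}]
print(dice_distance_row(query, library))  # [0.0, 0.5, 1.0]
```

An identical fingerprint gives distance 0.0 and a fingerprint with no shared bits gives 1.0.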
- moleculekit.smallmol.tools.clustering.ParallelExecutor(**joblib_args)
A wrapper for joblib.Parallel to allow custom progress bars.
- moleculekit.smallmol.tools.clustering.TanimotoDistances(fp1, fps)
Returns the row of Tanimoto distances between the fingerprint fp1 and the fingerprints in fps.
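The Tanimoto row can be sketched the same way. This example assumes fingerprints are sets of on-bit positions and that the distance is 1 minus the Tanimoto (Jaccard) similarity; `tanimoto_distance_row` is a hypothetical name, not the library's code.

```python
def tanimoto_distance_row(fp1, fps):
    """Return 1 - Tanimoto similarity between fp1 and every fingerprint in fps.

    Assumption: fingerprints are Python sets of on-bit positions.
    """
    row = []
    for fp in fps:
        union = len(fp1 | fp)
        # Tanimoto similarity: |A & B| / |A | B|; defined as 0 for two empty sets
        similarity = len(fp1 & fp) / union if union else 0.0
        row.append(1.0 - similarity)
    return row


query = {1, 2, 3}
library = [{1, 2, 3}, {2, 3, 4}, {4, 5}]
print(tanimoto_distance_row(query, library))  # [0.0, 0.5, 1.0]
```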
- moleculekit.smallmol.tools.clustering.cluster(smallmol_list, method, distThresholds=0.2, returnDetails=True, removeHs=True)
Returns the SmallMol objects grouped into clusters. It can also return the details of the computed clusters.
- Parameters:
smallmol_list (list) – The list of moleculekit.smallmol.smallmol.SmallMol objects
method (str) – The clustering method. One of [‘maccs’, ‘pathFingerprints’, ‘atomsFingerprints’, ‘torsionsFingerprints’, ‘circularFingerprints’, ‘shape’, ‘mcs’]
distThresholds (float) – The distance cutoff for the clusters. Default: 0.2
returnDetails (bool) – If True, the cluster details are also returned. Default: True
removeHs (bool) – If True, the hydrogens are not considered. Default: True
- Returns:
clusters (list) – A list of lists containing the SmallMol objects grouped by cluster membership
details (list) – A list with all the cluster details (returned only if returnDetails is True)
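This page does not say which algorithm groups the molecules. As an illustration of how a distance cutoff such as distThresholds can turn pairwise distances into groups, the following is a simplified Butina-style sketch in pure Python. The function name `threshold_clusters` and the toy distance matrix are hypothetical, and the library's behavior may differ.

```python
def threshold_clusters(dist, cutoff):
    """Group item indexes so that members lie within `cutoff` of a cluster center.

    Simplified Butina-style procedure (an illustration, not the library's code):
    repeatedly pick the unassigned item with the most unassigned neighbors,
    then make it and those neighbors one cluster.
    """
    n = len(dist)
    neighbors = [
        {j for j in range(n) if j != i and dist[i][j] <= cutoff} for i in range(n)
    ]
    unassigned = set(range(n))
    clusters = []
    while unassigned:
        # Ties are broken in favor of the lowest index
        center = max(sorted(unassigned), key=lambda i: len(neighbors[i] & unassigned))
        members = [center] + sorted(neighbors[center] & unassigned)
        clusters.append(members)
        unassigned -= set(members)
    return clusters


# Toy symmetric distance matrix: items 0 and 1 are close, items 2 and 3 are close
distances = [
    [0.0, 0.1, 0.9, 0.9],
    [0.1, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.15],
    [0.9, 0.9, 0.15, 0.0],
]
print(threshold_clusters(distances, cutoff=0.2))  # [[0, 1], [2, 3]]
```

A smaller cutoff yields more, tighter groups. In this toy matrix a cutoff of 0.05 leaves every item in its own cluster.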
- moleculekit.smallmol.tools.clustering.getMaximumCommonSubstructure(smallmol_list, removeHs=True, returnAtomIdxs=False)
Returns the maximum common substructure (MCS) and two lists of lists. The first contains, for each molecule, the atom indexes that are part of the MCS; the second contains the atom indexes that are not part of the MCS.
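- Parameters:
smallmol_list (list) – The list of moleculekit.smallmol.smallmol.SmallMol objects
removeHs (bool) – If True, the hydrogens are not considered. Default: True
returnAtomIdxs (bool) – If True, the lists of atom indexes are also returned. Default: False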
- Returns:
mcs_mol (rdkit.Chem.rdchem.Mol) – The MCS molecule
atom_mcs_list (list) – A list of lists containing the atom indexes that are part of the MCS
atom_no_mcs_list (list) – A list of lists containing the atom indexes that are not part of the MCS
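The two index lists can be read as complementary partitions of each molecule's atoms. The sketch below assumes that every atom index of a molecule falls in exactly one of the two lists; the helper `split_atom_indexes` and the example indexes are hypothetical and only illustrate the shape of the returned data.

```python
def split_atom_indexes(n_atoms, mcs_atoms):
    """Split the atom indexes 0..n_atoms-1 into (in MCS, not in MCS)."""
    in_mcs = set(mcs_atoms)
    atom_mcs = sorted(in_mcs)
    atom_no_mcs = [i for i in range(n_atoms) if i not in in_mcs]
    return atom_mcs, atom_no_mcs


# Two hypothetical molecules with 6 and 5 atoms, and the atoms matched by the MCS
matched = [(6, [0, 1, 2, 3]), (5, [1, 2, 3, 4])]
atom_mcs_list, atom_no_mcs_list = [], []
for n_atoms, mcs_atoms in matched:
    inside, outside = split_atom_indexes(n_atoms, mcs_atoms)
    atom_mcs_list.append(inside)
    atom_no_mcs_list.append(outside)

print(atom_mcs_list)     # [[0, 1, 2, 3], [1, 2, 3, 4]]
print(atom_no_mcs_list)  # [[4, 5], [0]]
```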