moleculekit.distance module
- moleculekit.distance.calculate_contacts(mol, sel1, sel2, periodic, threshold=4)
Calculate atom contacts within a distance threshold for each frame.
For every frame of mol, finds all pairs of atoms (one from sel1 and one from sel2) whose interatomic distance is below threshold.
- Parameters:
mol (Molecule) – The Molecule (single or multi-frame) whose coordinates are used.
sel1 (ndarray) – A 1D boolean atom-selection mask (length mol.numAtoms) selecting the first group of atoms.
sel2 (ndarray) – A 1D boolean atom-selection mask (length mol.numAtoms) selecting the second group of atoms. If it is equal to sel1, self-contacts within the selection are computed.
periodic (str | None) – How to treat periodic boundary conditions when computing distances. If None, no periodic wrapping is applied (the molecule box is ignored). If "chains", the minimum image convention is applied across atoms of different chains. If "selections", it is applied between the two selections. When not None, mol must contain non-zero box dimensions for every frame.
threshold (float) – Distance cutoff in Angstrom below which a pair of atoms is considered in contact. Default is 4.
- Returns:
contacts – One entry per frame of mol. Each entry is a 2D array of shape (N, 2) and dtype uint32 containing the atom-index pairs in contact for that frame.
- Return type:
- Raises:
RuntimeError – If periodic is not None but the molecule has no valid box dimensions, if the number of box frames does not match the number of coordinate frames, or if periodic is not one of None, "chains" or "selections".
- moleculekit.distance.cdist(coords1, coords2)
Compute the pairwise Euclidean distances between two sets of points.
- Parameters:
- Returns:
distances – A 2D array of shape (N, M) and dtype float32 where element [i, j] is the Euclidean distance between coords1[i] and coords2[j].
- Return type:
Examples
>>> distances = cdist(coords1, coords2)
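To make the shape and indexing of the result concrete, here is a pure-Python sketch of the same computation. `cdist_sketch` is a hypothetical stand-in, not the library function, and it assumes the inputs are sequences of N and M points of equal dimension.

```python
import math

def cdist_sketch(coords1, coords2):
    """Return a nested list d with d[i][j] = distance(coords1[i], coords2[j])."""
    return [[math.dist(p, q) for q in coords2] for p in coords1]

a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]                   # N = 2 points
b = [(0.0, 3.0, 4.0), (1.0, 0.0, 0.0), (0.0, 0.0, 2.0)]  # M = 3 points
d = cdist_sketch(a, b)  # 2 rows, 3 columns
```

Row i corresponds to a point of the first set and column j to a point of the second, so `d[0][0]` is the distance between `a[0]` and `b[0]`.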
- moleculekit.distance.find_clashes(mol, sel1=None, sel2=None, overlap=0.6, exclude_bonded=True, exclude_14=True, guess_bonds=True)
Find pairs of atoms that sterically clash with each other.
A clash is defined as a pair of atoms whose interatomic distance is less than r_vdw_1 + r_vdw_2 - overlap, where VdW radii come from moleculekit.periodictable. Uses the bundled cKDTree (ported from SciPy) for fast neighbor lookup.
- Parameters:
mol (Molecule) – The molecule to analyze.
sel1 (str | ndarray | None) – First selection (atom-selection string, boolean mask, or integer index array). If None, all atoms are used.
sel2 (str | ndarray | None) – Second selection (atom-selection string, boolean mask, or integer index array). If None, uses sel1 (self-clashes).
overlap (float) – How much VdW overlap is tolerated before flagging as a clash, in Angstroms. Default 0.6, i.e. atoms clash when they overlap by more than 0.6 Å of their combined VdW radii. Set to 0 for strict contact (any overlap counts), or negative for looser definitions.
exclude_bonded (bool) – If True, 1-2 (directly bonded) and 1-3 (angle) neighbors are excluded from the clash search. Default True.
exclude_14 (bool) – If True, 1-4 (dihedral) neighbors are also excluded. Default True.
guess_bonds (bool) – If True, supplements mol.bonds with moleculekit’s distance/covalent-radius based bond guesser. This catches inter-residue peptide bonds, disulfides, etc. that are often absent from mol.bonds on PDB-loaded structures. Set to False if mol.bonds is already complete (e.g. for systems built from a topology file) to skip the guessing overhead and avoid false positives from overlapping atoms. Default True.
- Returns:
clashes (numpy.ndarray of shape (N, 2), dtype int) – Pairs of atom indices that clash. Pairs are ordered so the first index is always < the second. Empty array if no clashes.
distances (numpy.ndarray of shape (N,), dtype float32) – Distance (Å) for each clash pair.
overlaps (numpy.ndarray of shape (N,), dtype float32) – Overlap amount (r_vdw_1 + r_vdw_2) - distance for each pair. Pairs are returned sorted by overlap (most severe first).
Examples
>>> from moleculekit.molecule import Molecule
>>> from moleculekit.distance import find_clashes
>>> mol = Molecule("3ptb")
>>> clashes, distances, overlaps = find_clashes(mol)
>>> for (a, b), d, o in zip(clashes, distances, overlaps):
...     print(f"{mol.name[a]}({a}) <-> {mol.name[b]}({b}): "
...           f"d={d:.2f} overlap={o:.2f}")
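The clash criterion itself can be sketched without the library. The brute-force helper below, `clash_pairs`, is hypothetical: it skips the KD-tree lookup and the bonded-neighbor exclusions, and its radii are illustrative Bondi-style values rather than the ones stored in moleculekit.periodictable.

```python
import math

# Illustrative VdW radii in Å (an assumption, not read from moleculekit.periodictable)
VDW = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52}

def clash_pairs(elements, coords, overlap=0.6):
    """Return [((i, j), distance, overlap_amount), ...], most severe first."""
    found = []
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):  # keeps the first index below the second
            dist = math.dist(coords[i], coords[j])
            radii_sum = VDW[elements[i]] + VDW[elements[j]]
            if dist < radii_sum - overlap:  # clash criterion from the text above
                found.append(((i, j), dist, radii_sum - dist))
    # Sort by overlap amount, largest first
    found.sort(key=lambda item: item[2], reverse=True)
    return found

elements = ["C", "C", "O"]
coords = [(0.0, 0.0, 0.0), (2.5, 0.0, 0.0), (0.0, 2.0, 0.0)]
result = clash_pairs(elements, coords)
```

With these inputs the C–O pair (0, 2) overlaps by about 1.22 Å and the C–C pair (0, 1) by about 0.9 Å, so (0, 2) is listed first. The pair (1, 2) is about 3.2 Å apart and is not flagged.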
- moleculekit.distance.pdist(coords)
Compute the pairwise Euclidean distances within a single set of points.
- Parameters:
coords (ndarray) – A 2D array of shape (N, D) with the coordinates of the points.
- Returns:
distances – A 1D array of length N * (N - 1) / 2 and dtype float32 containing the condensed upper-triangular pairwise distances. Use squareform() to convert it to a full (N, N) distance matrix.
- Return type:
Examples
>>> distances = pdist(coords)
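The condensed layout is easiest to see in a small example. The sketch below assumes the row-major upper-triangular ordering used by SciPy, i.e. pairs (0, 1), (0, 2), ..., (N-2, N-1). The source does not state the ordering explicitly, and `pdist_sketch` and `condensed_index` are hypothetical helpers.

```python
import math

def pdist_sketch(coords):
    """Distances for all pairs i < j, in row-major upper-triangular order."""
    n = len(coords)
    return [math.dist(coords[i], coords[j])
            for i in range(n) for j in range(i + 1, n)]

def condensed_index(i, j, n):
    """Position of pair (i, j), with i < j, in the condensed vector for n points."""
    return n * i - i * (i + 1) // 2 + (j - i - 1)

# Corners of a 3 x 4 rectangle: sides 3 and 4, diagonals 5
points = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0), (0.0, 4.0)]
condensed = pdist_sketch(points)  # length 4 * 3 / 2 = 6
diagonal = condensed[condensed_index(1, 3, len(points))]  # distance between points 1 and 3
```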
- moleculekit.distance.squareform(distances)
Convert a condensed pairwise distance vector into a square distance matrix.
- Parameters:
distances (ndarray) – A 1D condensed distance vector of length N * (N - 1) / 2, such as the one produced by pdist().
- Returns:
matrix – A 2D symmetric distance matrix of shape (N, N) with a zero diagonal.
- Return type:
Examples
>>> matrix = squareform(pdist(coords))
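As a rough illustration of the conversion, the sketch below recovers N from the vector length and fills a symmetric matrix. `squareform_sketch` is a hypothetical pure-Python stand-in, and it assumes the condensed vector is in row-major upper-triangular order.

```python
import math

def squareform_sketch(condensed):
    """Expand a condensed distance vector into a symmetric N x N nested list."""
    length = len(condensed)
    # Solve length = N * (N - 1) / 2 for N
    n = (1 + math.isqrt(1 + 8 * length)) // 2
    if n * (n - 1) // 2 != length:
        raise ValueError("length is not N * (N - 1) / 2 for any integer N")
    matrix = [[0.0] * n for _ in range(n)]  # the diagonal stays zero
    k = 0
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = condensed[k]  # mirror across the diagonal
            k += 1
    return matrix

m = squareform_sketch([1.0, 2.0, 3.0])  # three pairwise distances, so N = 3
```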