How to fetch from RCSB and OPM
Goal
Programmatically download structures from the RCSB PDB and membrane-oriented coordinates from the Orientations of Proteins in Membranes (OPM) database.
Minimal example
from moleculekit.molecule import Molecule
# Download directly from RCSB by 4-character PDB ID
mol = Molecule("3PTB")
print(mol.numAtoms)
Parameters that matter
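| Function | Key parameters | What it does |
|---|---|---|
| `Molecule("3PTB")` | 4-character string | Fetches and parses the PDB entry |
| `rcsbFindLigands("3PTB")` | 4-character PDB ID | Returns a list of ligand component IDs for that entry |
| `get_opm_pdb("1BL8")` | PDB ID; `keep` (default `False`) | Downloads the OPM-oriented structure; returns the molecule and the membrane thickness |
| `align_to_opm(mol)` | `mol`; `maxalignments` | Aligns your own structure to its OPM equivalent |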
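Since a 4-character string is treated as a PDB ID, it can be useful to screen inputs before making any network request. The following is a minimal sketch, not part of moleculekit: `is_pdb_id` and `normalize_pdb_ids` are hypothetical helpers, and the pattern assumes the classic PDB ID format of one digit followed by three alphanumeric characters.

```python
import re

# Assumption: a classic PDB ID is one digit followed by three alphanumerics.
_PDB_ID = re.compile(r"[0-9][A-Za-z0-9]{3}")


def is_pdb_id(text: str) -> bool:
    """Return True if `text` looks like a classic 4-character PDB ID."""
    return _PDB_ID.fullmatch(text.strip()) is not None


def normalize_pdb_ids(candidates):
    """Upper-case valid IDs, drop duplicates, and keep first-seen order."""
    seen = []
    for raw in candidates:
        if not is_pdb_id(raw):
            continue  # e.g. a file path such as "my_structure.pdb"
        pdb_id = raw.strip().upper()
        if pdb_id not in seen:
            seen.append(pdb_id)
    return seen


print(normalize_pdb_ids(["3ptb", "1BL8", "my_structure.pdb", "3PTB"]))
# prints ['3PTB', '1BL8']
```

Anything that fails the check (such as a local file name) can then be handled as a path rather than sent to RCSB.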
Common variations
# List the ligands bound in a structure, then fetch the structure itself
from moleculekit.rcsb import rcsbFindLigands
ligands = rcsbFindLigands("3PTB")
print(ligands)
mol = Molecule("3PTB")
# Fetch a membrane protein in its OPM orientation
from moleculekit.opm import get_opm_pdb
mol, thickness = get_opm_pdb("1BL8")
# Align your own structure to its OPM equivalent
from moleculekit.opm import align_to_opm
mol = Molecule("my_structure.pdb")
results = align_to_opm(mol, maxalignments=3)
# results is a list, one entry per OPM hit. Each entry has the OPM PDB ID,
# the membrane thickness, and a list of high-scoring sequence pairs (HSPs)
# whose aligned_mol is `mol` re-imaged into the OPM frame.
for hit in results:
    print(hit["pdbid"], "thickness:", hit["thickness"])
    for hsp in hit["hsps"]:
        aligned_mol = hsp["aligned_mol"]  # Molecule, oriented in the OPM membrane frame
        print(f"  TM={hsp['TM-Score']:.2f} RMSD={hsp['Common RMSD']:.2f} Å")
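When several OPM hits come back, a common next step is to keep only the best-scoring alignment. The sketch below assumes the result layout described in the comments above (a list of dicts with `pdbid`, `thickness`, and `hsps`, where each HSP has a `TM-Score`). `best_alignment` is a hypothetical helper, and the sample data is invented so the example runs without moleculekit or network access.

```python
def best_alignment(results):
    """Return (pdbid, thickness, hsp) for the highest TM-score HSP, or None."""
    best = None
    for hit in results:
        for hsp in hit["hsps"]:
            if best is None or hsp["TM-Score"] > best[2]["TM-Score"]:
                best = (hit["pdbid"], hit["thickness"], hsp)
    return best


# Stand-in data shaped like the structure described above; the IDs and
# numbers are invented for illustration and are not real OPM results.
fake_results = [
    {"pdbid": "AAAA", "thickness": 30.0,
     "hsps": [{"TM-Score": 0.62, "Common RMSD": 3.1, "aligned_mol": None}]},
    {"pdbid": "BBBB", "thickness": 28.5,
     "hsps": [{"TM-Score": 0.91, "Common RMSD": 1.2, "aligned_mol": None},
              {"TM-Score": 0.40, "Common RMSD": 5.0, "aligned_mol": None}]},
]

choice = best_alignment(fake_results)
if choice is None:
    print("no OPM match")  # covers an empty result list
else:
    pdbid, thickness, hsp = choice
    print(pdbid, thickness, hsp["TM-Score"])
# prints BBBB 28.5 0.91
```

Returning `None` for an empty list keeps the "no match" case explicit instead of raising an `IndexError` on `results[0]`.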
Gotchas
RCSB downloads are subject to the server's rate limits; avoid hammering the API in tight loops.
Set the `LOCAL_PDB_REPO` environment variable to a local PDB mirror directory to avoid repeated network downloads.
OPM membership requires a known PDB ID or a successful BLAST sequence alignment; when nothing matches, `align_to_opm()` returns an empty list (not `None`).
`align_to_opm()` returns a `list[dict]`; the aligned `Molecule` objects live under `hit["hsps"][j]["aligned_mol"]`, not at the top level.
`get_opm_pdb()` with `keep=False` (default) strips the dummy membrane atoms that OPM adds; pass `keep=True` if you need them for visualization.
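One way to act on the rate-limit advice is to fetch each distinct ID only once and pause between requests. This is a minimal sketch under stated assumptions: `fetch_many` is a hypothetical helper, the default one-second delay is an arbitrary placeholder rather than a documented RCSB limit, and the fetcher is passed in as a callable so the example runs offline with a stand-in.

```python
import time


def fetch_many(pdb_ids, fetch, delay=1.0, sleep=time.sleep):
    """Fetch each distinct ID once, pausing `delay` seconds between requests."""
    cache = {}
    for pdb_id in pdb_ids:
        key = pdb_id.upper()
        if key in cache:
            continue  # already fetched; no new request
        if cache:
            sleep(delay)  # pause before every request after the first
        cache[key] = fetch(key)
    return cache


# Stand-in fetcher so the sketch runs without network access. In practice
# `fetch` could be moleculekit's Molecule class: fetch_many(ids, Molecule).
calls = []


def fake_fetch(pdb_id):
    calls.append(pdb_id)
    return f"<structure {pdb_id}>"


structures = fetch_many(["3PTB", "1bl8", "3ptb"], fake_fetch, delay=0.0)
print(sorted(structures), calls)
# prints ['1BL8', '3PTB'] ['3PTB', '1BL8']
```

The in-memory cache only lasts for one run; pointing `LOCAL_PDB_REPO` at a local mirror covers repeated runs.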