{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# System Building: μ-opioid Receptor in Membrane with morphinan antagonist"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this tutorial, we will showcase how to build a protein system embeded in a membrane with a ligand in one of the compartments for simulating binding. The sample system is a mu-opioid receptor (the protein) in a membrane and a morphinan antagonist (the ligand).\n",
    "\n",
    "Let's start by doing some imports and definitions:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2024-06-11 15:51:03,138 - numexpr.utils - INFO - Note: NumExpr detected 20 cores but \"NUMEXPR_MAX_THREADS\" not set, so enforcing safe limit of 8.\n",
      "2024-06-11 15:51:03,138 - numexpr.utils - INFO - NumExpr defaulting to 8 threads.\n",
      "2024-06-11 15:51:03,219 - rdkit - INFO - Enabling RDKit 2022.09.1 jupyter extensions\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "Please cite HTMD: Doerr et al.(2016)JCTC,12,1845. https://dx.doi.org/10.1021/acs.jctc.6b00049\n",
      "HTMD Documentation at: https://software.acellera.com/htmd/\n",
      "\n",
      "You are on the latest HTMD version (2.3.28+0.g1f64e666d.dirty).\n",
      "\n"
     ]
    }
   ],
   "source": [
    "from htmd.ui import *\n",
    "from htmd.home import home\n",
    "from os.path import join\n",
    "from moleculekit.config import config\n",
    "\n",
    "config(viewer='webgl')\n",
    "datadir = home(dataDir='mor')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Prepare the protein"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "View the file as it comes from the OPM database:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "9918e258196b42598e4ef22c2df58460",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": []
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "57f39d15450244bc8e7e2e64d18a0afa",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "NGLWidget()"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "Molecule(join(datadir, '4dkl.pdb')).view()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Retrieve the structure from OPM, do not keep the `DUM` atoms, and print the thickness as calculated by OPM:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "32.0"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from moleculekit.opm import get_opm_pdb\n",
    "prot, thickness = get_opm_pdb('4dkl', keep=False)\n",
    "thickness"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Remove non-protein atoms, and keep only a monomer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2024-06-11 15:51:05,557 - moleculekit.molecule - INFO - Removed 2574 atoms. 2262 atoms remaining in the molecule.\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "array([   0,    1,    2, ..., 4808, 4809, 4810], dtype=int32)"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "prot.filter('(protein and noh and chain B) or (water and within 5 of (chain B and protein))')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Automatically detecting segments and assigning names to them."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2024-06-11 15:51:05,587 - moleculekit.tools.autosegment - INFO - Created segment P0 between resid 65 and 263.\n",
      "2024-06-11 15:51:05,588 - moleculekit.tools.autosegment - INFO - Created segment P1 between resid 270 and 352.\n"
     ]
    }
   ],
   "source": [
    "prot = autoSegment(prot, sel=\"protein\")\n",
    "prot.set(\"segid\", \"W\", sel=\"water\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2024-06-11 15:51:05,684 - moleculekit.tools.preparation - WARNING - Both chains and segments are defined in Molecule.chain / Molecule.segid, however they are inconsistent. Protein preparation will use the chain information.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "---- Molecule chain report ----\n",
      "Chain A:\n",
      "    First residue: HOH   717  \n",
      "    Final residue: HOH   717  \n",
      "Chain B:\n",
      "    First residue: MET    65  \n",
      "    Final residue: HOH   735  \n",
      "---- End of chain report ----\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2024-06-11 15:51:07,626 - moleculekit.tools.preparation - INFO - Modified residue ASP   114 B to ASH\n",
      "2024-06-11 15:51:07,627 - moleculekit.tools.preparation - INFO - Modified residue CYS   140 B to CYX\n",
      "2024-06-11 15:51:07,627 - moleculekit.tools.preparation - INFO - Modified residue HIS   171 B to HID\n",
      "2024-06-11 15:51:07,627 - moleculekit.tools.preparation - INFO - Modified residue CYS   217 B to CYX\n",
      "2024-06-11 15:51:07,628 - moleculekit.tools.preparation - INFO - Modified residue HIS   223 B to HID\n",
      "2024-06-11 15:51:07,628 - moleculekit.tools.preparation - INFO - Modified residue HIS   297 B to HID\n",
      "2024-06-11 15:51:07,628 - moleculekit.tools.preparation - INFO - Modified residue HIS   319 B to HIE\n",
      "2024-06-11 15:51:07,629 - moleculekit.tools.preparation - WARNING - Dubious protonation state: the pKa of 1 residues is within 1.0 units of pH 7.4.\n",
      "2024-06-11 15:51:07,630 - moleculekit.tools.preparation - WARNING - Dubious protonation state:    ASP   114 B (pKa= 7.85)\n"
     ]
    }
   ],
   "source": [
    "prot = systemPrepare(prot)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "4111826bbd4541e0847eb174dfed5d11",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "NGLWidget()"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "prot.reps.add(sel='segid P0', style='NewCartoon', color=1)\n",
    "prot.reps.add(sel='segid P1', style='NewCartoon', color=2)\n",
    "prot.view()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Add a sodium atom in the receptor"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "fcea2ca4dea54e35831f2d3977647a56",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "NGLWidget()"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "sod = Molecule(join(datadir, 'sod.pdb'))\n",
    "sod.set('segid','S1')\n",
    "prot.append(sod)\n",
    "prot.reps.add(sel='ions', style='VDW', color='green')\n",
    "prot.view()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Embed the protein into a membrane"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "memb = Molecule(join(datadir, 'membrane80by80C36.pdb'))\n",
    "memb.set('segid', 'M')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Center the membrane onto the protein center"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "pcenter = np.mean(prot.get('coords','protein'),axis=0)\n",
    "mcenter = np.mean(memb.get('coords'),axis=0)\n",
    "memb.moveBy(pcenter-mcenter)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "And now embed. \n",
    "\n",
    "The two are equivalent - `append` with `collisions=True` only\n",
    "adds atoms if they do not clash  sterically."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2024-06-11 15:51:08,522 - moleculekit.molecule - INFO - Removed 319 residues from appended Molecule due to collisions.\n"
     ]
    }
   ],
   "source": [
    "mol = prot.copy()\n",
    "mol.append(memb, collisions=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Visualize the embedded system"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "de587444c95b4ad4800ae2ecfc6d81ff",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "NGLWidget()"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "mol.reps.add(sel='protein', style='NewCartoon', color='Secondary Structure')\n",
    "mol.reps.add(sel='ions', style='VDW', color='green')\n",
    "mol.reps.add(sel='lipids', style='Lines')\n",
    "mol.view()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Add a ligand"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "import random\n",
    "lig = Molecule(join(datadir, 'MOL.cif'))\n",
    "lig.set('segid','L');\n",
    "lcenter = np.mean(lig.get('coords'),axis=0)\n",
    "newlcenter=[random.uniform(-10, 10), random.uniform(-10, 10),  43 ]\n",
    "lig.rotateBy(uniformRandomRotation(), lcenter)\n",
    "lig.moveBy(newlcenter-lcenter)\n",
    "mol.append(lig)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Put it in a water box"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2024-06-11 15:51:10,118 - htmd.builder.solvate - INFO - Using water pdb file at: /home/sdoerr/Work/htmd/htmd/share/solvate/wat.pdb\n",
      "2024-06-11 15:51:10,551 - htmd.builder.solvate - INFO - Replicating 8 water segments, 2 by 2 by 2\n",
      "Solvating: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:10<00:00,  1.27s/it]\n",
      "2024-06-11 15:51:21,290 - htmd.builder.solvate - INFO - 8853 water molecules were added to the system.\n"
     ]
    }
   ],
   "source": [
    "coo = mol.get('coords','noh and (lipids or protein)')\n",
    "m = np.min(coo, axis=0) + [0, 0, -5]\n",
    "M = np.max(coo, axis=0) + [0, 0, 20]\n",
    "smol = solvate(mol, minmax=np.vstack((m,M)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Visualize"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "7cbec227ef19453087030c1378670fca",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "NGLWidget()"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "smol.reps.add(sel='segid L', style='Licorice')\n",
    "smol.reps.add(sel='water', style='Lines')\n",
    "smol.view()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Build with AMBER"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2024-06-11 15:51:26,607 - htmd.builder.amber - INFO - Detecting disulfide bonds.\n",
      "2024-06-11 15:51:26,614 - htmd.builder.builder - INFO - One disulfide bond was added\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Disulfide Bond between: UniqueResidueID<resname: 'CYX', chain: 'B', resid: 78, insertion: '', segid: 'P0'>\n",
      "                   and: UniqueResidueID<resname: 'CYX', chain: 'B', resid: 155, insertion: '', segid: 'P0'>\n",
      "\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2024-06-11 15:51:27,682 - htmd.builder.amber - INFO - Starting the build.\n",
      "2024-06-11 15:51:29,609 - htmd.builder.amber - INFO - Finished building.\n",
      "2024-06-11 15:51:30,594 - moleculekit.writers - WARNING - Field \"resid\" of PDB overflows. Your data will be truncated to 4 characters.\n",
      "2024-06-11 15:51:35,176 - htmd.builder.builder - WARNING - Found cis peptide bond in 1 frames: [0] in the omega diheral \"Angle of (HID 160 CA  ) (HID 160 C  ) (PRO 161 N  ) (PRO 161 CA  ) \" with indexes [2528, 2541, 2543, 2553]\n",
      "2024-06-11 15:51:35,516 - htmd.builder.ionize - INFO - Adding 15 anions + 0 cations for neutralizing and 64 ions for the given salt concentration 0.15 M.\n",
      "2024-06-11 15:51:44,191 - htmd.builder.amber - INFO - Starting the build.\n",
      "2024-06-11 15:51:46,213 - htmd.builder.amber - INFO - Finished building.\n",
      "2024-06-11 15:51:47,141 - moleculekit.writers - WARNING - Field \"resid\" of PDB overflows. Your data will be truncated to 4 characters.\n",
      "2024-06-11 15:51:51,546 - htmd.builder.builder - WARNING - Found cis peptide bond in 1 frames: [0] in the omega diheral \"Angle of (HID 160 CA  ) (HID 160 C  ) (PRO 161 N  ) (PRO 161 CA  ) \" with indexes [2528, 2541, 2543, 2553]\n",
      "2024-06-11 15:51:51,569 - py.warnings - WARNING - /home/sdoerr/miniforge3/envs/htmd/lib/python3.10/site-packages/Bio/pairwise2.py:278: BiopythonDeprecationWarning: Bio.pairwise2 has been deprecated, and we intend to remove it in a future release of Biopython. As an alternative, please consider using Bio.Align.PairwiseAligner as a replacement, and contact the Biopython developers if you still need the Bio.pairwise2 module.\n",
      "  warnings.warn(\n",
      "\n",
      "2024-06-11 15:51:54,048 - moleculekit.tools.sequencestructuralalignment - INFO - Alignment #0 was done on 282 residues: 2-285\n"
     ]
    }
   ],
   "source": [
    "molbuilt = amber.build(smol, param=[join(datadir, 'MOL.frcmod')], topo=[join(datadir, 'MOL.cif')], outdir='./final-build', saltconc=0.15)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Visualize built system"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2024-06-11 15:51:54,436 - moleculekit.writers - WARNING - Field \"resid\" of PDB overflows. Your data will be truncated to 4 characters.\n"
     ]
    },
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "66d17028b2d943c982d4d1afeea3b1c5",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "NGLWidget()"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "molbuilt.view()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "source": [
    "The `molbuilt` is a `Molecule` object that contains the built system, but the full contents to run a simulation are located in the `outdir` (`./final-build` in this case)."
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "celltoolbar": "Slideshow",
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.13"
  },
  "widgets": {
   "state": {
    "1741c23784064c52b59c79251cfaf8e6": {
     "views": [
      {
       "cell_index": 14
      }
     ]
    },
    "223dfc92ccae4887a46893321ac12ed8": {
     "views": [
      {
       "cell_index": 14
      }
     ]
    },
    "416b4c797ae44b82bb5e4406a6e73a49": {
     "views": [
      {
       "cell_index": 32
      }
     ]
    },
    "594aa85570984b3887612502c4597e7c": {
     "views": [
      {
       "cell_index": 22
      }
     ]
    },
    "79e1a424c2554d5fb4bee580cb01833b": {
     "views": [
      {
       "cell_index": 3
      }
     ]
    },
    "9121f11441b2485fab877a116bde2606": {
     "views": [
      {
       "cell_index": 28
      }
     ]
    },
    "927a4600d16a4577b29a137506f3cc17": {
     "views": [
      {
       "cell_index": 3
      }
     ]
    },
    "c1825c2b1a3c4196a7d995f244f98aef": {
     "views": [
      {
       "cell_index": 32
      }
     ]
    },
    "e629564e29704ce6826a175f5c451ac2": {
     "views": [
      {
       "cell_index": 12
      }
     ]
    },
    "f832672917214baa99112a489918050e": {
     "views": [
      {
       "cell_index": 12
      }
     ]
    },
    "fb0fd032ceaf4485aaa81c7d1ad8ee3b": {
     "views": [
      {
       "cell_index": 22
      }
     ]
    },
    "fc1dea2f053a4c3aa924e70e29b91df6": {
     "views": [
      {
       "cell_index": 28
      }
     ]
    }
   },
   "version": "1.2.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}