{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Adaptive sampling"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "In this tutorial, we will showcase how to use adaptive sampling simulations on a molecular system. The sample system in this case is the NTL9 protein.\n",
    "\n",
    "Let's import HTMD and do some definitions:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2024-06-11 16:32:33,117 - numexpr.utils - INFO - Note: NumExpr detected 20 cores but \"NUMEXPR_MAX_THREADS\" not set, so enforcing safe limit of 8.\n",
      "2024-06-11 16:32:33,117 - numexpr.utils - INFO - NumExpr defaulting to 8 threads.\n",
      "2024-06-11 16:32:33,286 - rdkit - INFO - Enabling RDKit 2022.09.1 jupyter extensions\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "Please cite HTMD: Doerr et al.(2016)JCTC,12,1845. https://dx.doi.org/10.1021/acs.jctc.6b00049\n",
      "HTMD Documentation at: https://software.acellera.com/htmd/\n",
      "\n",
      "You are on the latest HTMD version (2.3.28+5.g9dbc5091a.dirty).\n",
      "\n"
     ]
    }
   ],
   "source": [
    "from htmd.ui import *"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Get the generators folder structure"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Get the data for this tutorial [here](https://ndownloader.figshare.com/files/65181267). Alternatively, you can download the data using `wget`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os, glob\n",
    "assert os.system('wget -q https://ndownloader.figshare.com/files/65181267 -O adaptive_generators.zip && unzip -q -o adaptive_generators.zip && rm adaptive_generators.zip') == 0\n",
    "for file in glob.glob('./generators/*/run.sh'):\n",
    "    os.chmod(file, 0o755)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[01;34mgenerators\u001b[0m\n",
      "\u251c\u2500\u2500 \u001b[01;34mntl9_1ns_0\u001b[0m\n",
      "\u2502\u00a0\u00a0 \u251c\u2500\u2500 input\n",
      "\u2502\u00a0\u00a0 \u251c\u2500\u2500 input.coor\n",
      "\u2502\u00a0\u00a0 \u251c\u2500\u2500 input.xsc\n",
      "\u2502\u00a0\u00a0 \u251c\u2500\u2500 parameters\n",
      "\u2502\u00a0\u00a0 \u251c\u2500\u2500 \u001b[01;32mrun.sh\u001b[0m\n",
      "\u2502\u00a0\u00a0 \u251c\u2500\u2500 structure.pdb\n",
      "\u2502\u00a0\u00a0 \u2514\u2500\u2500 structure.psf\n",
      "\u251c\u2500\u2500 \u001b[01;34mntl9_1ns_1\u001b[0m\n",
      "\u2502\u00a0\u00a0 \u251c\u2500\u2500 input\n",
      "\u2502\u00a0\u00a0 \u251c\u2500\u2500 input.coor\n",
      "\u2502\u00a0\u00a0 \u251c\u2500\u2500 input.xsc\n",
      "\u2502\u00a0\u00a0 \u251c\u2500\u2500 parameters\n",
      "\u2502\u00a0\u00a0 \u251c\u2500\u2500 \u001b[01;32mrun.sh\u001b[0m\n",
      "\u2502\u00a0\u00a0 \u251c\u2500\u2500 structure.pdb\n",
      "\u2502\u00a0\u00a0 \u2514\u2500\u2500 structure.psf\n",
      "\u2514\u2500\u2500 \u001b[01;34mntl9_1ns_2\u001b[0m\n",
      "    \u251c\u2500\u2500 input\n",
      "    \u251c\u2500\u2500 input.coor\n"
     ]
    }
   ],
   "source": [
    "!tree generators | head -20"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Adaptive classes"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "HTMD has two types of adaptive sampling:\n",
    "\n",
    "* AdaptiveMD (free exploration)\n",
    "* AdaptiveGoal (exploration + exploitation)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create a directory for each type of adaptive and copy the generators into them:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'./adaptivegoal/generators'"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "os.makedirs('./adaptivemd', exist_ok=True)\n",
    "os.makedirs('./adaptivegoal', exist_ok=True)\n",
    "shutil.copytree('./generators', './adaptivemd/generators')\n",
    "shutil.copytree('./generators', './adaptivegoal/generators')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## AdaptiveMD"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's change directory to the `adaptivemd` one and work there:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "os.chdir('./adaptivemd')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* Setup the queue that will be used for simulations. \n",
    "* Tell it to store completed trajectories in the data folder as this is where `AdaptiveMD` expects them to be by default"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "queue = LocalGPUQueue()\n",
    "queue.datadir = './data'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "ad = AdaptiveMD()\n",
    "ad.app = queue"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* Set the `nmin`, `nmax` and `nepochs`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "ad.nmin = 1\n",
    "ad.nmax = 3\n",
    "ad.nepochs = 3"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "* Choose what projection to use for the construction of the Markov model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "protsel = 'protein and name CA'\n",
    "ad.projection = MetricSelfDistance(protsel)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* Set the `updateperiod` of the Adaptive to define how often it will poll for completed simulations and redo the analysis"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "ad.updateperiod = 120 # execute every 2 minutes"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Launch the `AdaptiveMD` run:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2024-06-11 16:32:58,958 - htmd.adaptive.adaptive - INFO - Processing epoch 0\n",
      "2024-06-11 16:32:58,960 - htmd.adaptive.adaptive - INFO - Epoch 0, generating first batch\n",
      "2024-06-11 16:32:58,983 - jobqueues.util - INFO - Trying to determine all GPU devices\n",
      "2024-06-11 16:32:59,035 - jobqueues.localqueue - INFO - Using GPU devices 0\n",
      "2024-06-11 16:32:59,037 - jobqueues.util - INFO - Trying to determine all GPU devices\n",
      "2024-06-11 16:32:59,093 - jobqueues.localqueue - INFO - Queueing /home/sdoerr/Work/htmd/tutorials/adaptivemd/input/e1s1_ntl9_1ns_0\n",
      "2024-06-11 16:32:59,095 - jobqueues.localqueue - INFO - Queueing /home/sdoerr/Work/htmd/tutorials/adaptivemd/input/e1s2_ntl9_1ns_1\n",
      "2024-06-11 16:32:59,097 - jobqueues.localqueue - INFO - Running /home/sdoerr/Work/htmd/tutorials/adaptivemd/input/e1s1_ntl9_1ns_0 on device 0\n",
      "2024-06-11 16:32:59,101 - jobqueues.localqueue - INFO - Queueing /home/sdoerr/Work/htmd/tutorials/adaptivemd/input/e1s3_ntl9_1ns_2\n",
      "2024-06-11 16:32:59,105 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.\n",
      "2024-06-11 16:34:59,208 - htmd.adaptive.adaptive - INFO - Processing epoch 1\n",
      "2024-06-11 16:34:59,209 - htmd.adaptive.adaptive - INFO - Retrieving simulations.\n",
      "2024-06-11 16:34:59,209 - htmd.adaptive.adaptive - INFO - 3 simulations in progress\n",
      "2024-06-11 16:34:59,210 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.\n",
      "2024-06-11 16:35:14,002 - jobqueues.localqueue - INFO - Completed /home/sdoerr/Work/htmd/tutorials/adaptivemd/input/e1s1_ntl9_1ns_0\n",
      "2024-06-11 16:35:14,003 - jobqueues.localqueue - INFO - Running /home/sdoerr/Work/htmd/tutorials/adaptivemd/input/e1s2_ntl9_1ns_1 on device 0\n",
      "2024-06-11 16:36:59,319 - htmd.adaptive.adaptive - INFO - Processing epoch 1\n",
      "2024-06-11 16:36:59,320 - htmd.adaptive.adaptive - INFO - Retrieving simulations.\n",
      "2024-06-11 16:36:59,321 - htmd.adaptive.adaptive - INFO - 2 simulations in progress\n",
      "2024-06-11 16:36:59,321 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.\n",
      "2024-06-11 16:37:30,333 - jobqueues.localqueue - INFO - Completed /home/sdoerr/Work/htmd/tutorials/adaptivemd/input/e1s2_ntl9_1ns_1\n",
      "2024-06-11 16:37:30,334 - jobqueues.localqueue - INFO - Running /home/sdoerr/Work/htmd/tutorials/adaptivemd/input/e1s3_ntl9_1ns_2 on device 0\n",
      "2024-06-11 16:38:59,420 - htmd.adaptive.adaptive - INFO - Processing epoch 1\n",
      "2024-06-11 16:38:59,421 - htmd.adaptive.adaptive - INFO - Retrieving simulations.\n",
      "2024-06-11 16:38:59,422 - htmd.adaptive.adaptive - INFO - 1 simulations in progress\n",
      "2024-06-11 16:38:59,422 - htmd.adaptive.adaptiverun - INFO - Postprocessing new data\n",
      "Creating simlist: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 3/3 [00:00<00:00, 704.37it/s]\n",
      "Filtering trajectories: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 2/2 [00:00<00:00,  2.82it/s]\n",
      "Projecting trajectories: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 2/2 [00:00<00:00, 316.56it/s]\n",
      "2024-06-11 16:39:00,784 - htmd.projections.metric - INFO - Frame step 0.1ns was read from the trajectories. If it looks wrong, redefine it by manually setting the MetricData.fstep property.\n",
      "2024-06-11 16:39:00,785 - htmd.metricdata - INFO - Dropped 0 trajectories from 2 resulting in 2\n",
      "2024-06-11 16:39:01,850 - htmd.metricdata - INFO - Dropped 0 trajectories from 2 resulting in 2\n",
      "2024-06-11 16:39:01,862 - htmd.adaptive.adaptiverun - WARNING - Using less macrostates than requested due to lack of microstates. macronum = 3\n",
      "2024-06-11 16:39:02,165 - htmd.model - INFO - 95.0% of the data was used\n",
      "2024-06-11 16:39:02,166 - htmd.model - INFO - Number of trajectories that visited each macrostate:\n",
      "2024-06-11 16:39:02,167 - htmd.model - INFO - [2 2]\n",
      "Writing inputs: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 2/2 [00:00<00:00,  4.13it/s]\n",
      "2024-06-11 16:39:02,657 - jobqueues.localqueue - INFO - Queueing /home/sdoerr/Work/htmd/tutorials/adaptivemd/input/e2s1_e1s1p0f4\n",
      "2024-06-11 16:39:02,657 - jobqueues.localqueue - INFO - Queueing /home/sdoerr/Work/htmd/tutorials/adaptivemd/input/e2s2_e1s2p0f3\n",
      "2024-06-11 16:39:02,658 - htmd.adaptive.adaptive - INFO - Finished submitting simulations.\n",
      "2024-06-11 16:39:02,658 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.\n",
      "2024-06-11 16:39:47,336 - jobqueues.localqueue - INFO - Completed /home/sdoerr/Work/htmd/tutorials/adaptivemd/input/e1s3_ntl9_1ns_2\n",
      "2024-06-11 16:39:47,337 - jobqueues.localqueue - INFO - Running /home/sdoerr/Work/htmd/tutorials/adaptivemd/input/e2s1_e1s1p0f4 on device 0\n",
      "2024-06-11 16:41:02,757 - htmd.adaptive.adaptive - INFO - Processing epoch 2\n",
      "2024-06-11 16:41:02,759 - htmd.adaptive.adaptive - INFO - Retrieving simulations.\n",
      "2024-06-11 16:41:02,760 - htmd.adaptive.adaptive - INFO - 2 simulations in progress\n",
      "2024-06-11 16:41:02,761 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.\n",
      "2024-06-11 16:42:04,402 - jobqueues.localqueue - INFO - Completed /home/sdoerr/Work/htmd/tutorials/adaptivemd/input/e2s1_e1s1p0f4\n",
      "2024-06-11 16:42:04,403 - jobqueues.localqueue - INFO - Running /home/sdoerr/Work/htmd/tutorials/adaptivemd/input/e2s2_e1s2p0f3 on device 0\n",
      "2024-06-11 16:43:02,860 - htmd.adaptive.adaptive - INFO - Processing epoch 2\n",
      "2024-06-11 16:43:02,861 - htmd.adaptive.adaptive - INFO - Retrieving simulations.\n",
      "2024-06-11 16:43:02,862 - htmd.adaptive.adaptive - INFO - 1 simulations in progress\n",
      "2024-06-11 16:43:02,862 - htmd.adaptive.adaptiverun - INFO - Postprocessing new data\n",
      "Creating simlist: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 5/5 [00:00<00:00, 1180.96it/s]\n",
      "Filtering trajectories: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 4/4 [00:00<00:00,  5.07it/s]\n",
      "Projecting trajectories: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 4/4 [00:00<00:00, 293.88it/s]\n",
      "2024-06-11 16:43:03,931 - htmd.projections.metric - INFO - Frame step 0.1ns was read from the trajectories. If it looks wrong, redefine it by manually setting the MetricData.fstep property.\n",
      "2024-06-11 16:43:03,932 - htmd.metricdata - INFO - Dropped 0 trajectories from 4 resulting in 4\n",
      "2024-06-11 16:43:05,564 - htmd.metricdata - INFO - Dropped 0 trajectories from 4 resulting in 4\n",
      "2024-06-11 16:43:05,854 - htmd.model - INFO - 100.0% of the data was used\n",
      "2024-06-11 16:43:05,855 - htmd.model - INFO - Number of trajectories that visited each macrostate:\n",
      "2024-06-11 16:43:05,856 - htmd.model - INFO - [2 4 4 3 4]\n",
      "Writing inputs: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 2/2 [00:00<00:00,  3.53it/s]\n",
      "2024-06-11 16:43:06,427 - jobqueues.localqueue - INFO - Queueing /home/sdoerr/Work/htmd/tutorials/adaptivemd/input/e3s1_e2s1p0f6\n",
      "2024-06-11 16:43:06,428 - jobqueues.localqueue - INFO - Queueing /home/sdoerr/Work/htmd/tutorials/adaptivemd/input/e3s2_e1s1p0f1\n",
      "2024-06-11 16:43:06,429 - htmd.adaptive.adaptive - INFO - Finished submitting simulations.\n",
      "2024-06-11 16:43:06,429 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.\n",
      "2024-06-11 16:44:21,527 - jobqueues.localqueue - INFO - Completed /home/sdoerr/Work/htmd/tutorials/adaptivemd/input/e2s2_e1s2p0f3\n",
      "2024-06-11 16:44:21,528 - jobqueues.localqueue - INFO - Running /home/sdoerr/Work/htmd/tutorials/adaptivemd/input/e3s1_e2s1p0f6 on device 0\n",
      "2024-06-11 16:45:06,528 - htmd.adaptive.adaptive - INFO - Processing epoch 3\n",
      "2024-06-11 16:45:06,529 - htmd.adaptive.adaptive - INFO - Retrieving simulations.\n",
      "2024-06-11 16:45:06,530 - htmd.adaptive.adaptive - INFO - 2 simulations in progress\n",
      "2024-06-11 16:45:06,531 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.\n",
      "2024-06-11 16:46:38,893 - jobqueues.localqueue - INFO - Completed /home/sdoerr/Work/htmd/tutorials/adaptivemd/input/e3s1_e2s1p0f6\n",
      "2024-06-11 16:46:38,894 - jobqueues.localqueue - INFO - Running /home/sdoerr/Work/htmd/tutorials/adaptivemd/input/e3s2_e1s1p0f1 on device 0\n",
      "2024-06-11 16:47:06,628 - htmd.adaptive.adaptive - INFO - Processing epoch 3\n",
      "2024-06-11 16:47:06,629 - htmd.adaptive.adaptive - INFO - Retrieving simulations.\n",
      "2024-06-11 16:47:06,629 - htmd.adaptive.adaptive - INFO - 1 simulations in progress\n",
      "2024-06-11 16:47:06,630 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.\n",
      "2024-06-11 16:48:56,028 - jobqueues.localqueue - INFO - Completed /home/sdoerr/Work/htmd/tutorials/adaptivemd/input/e3s2_e1s1p0f1\n",
      "2024-06-11 16:49:06,740 - htmd.adaptive.adaptive - INFO - Processing epoch 3\n",
      "2024-06-11 16:49:06,741 - htmd.adaptive.adaptive - INFO - Retrieving simulations.\n",
      "2024-06-11 16:49:06,741 - htmd.adaptive.adaptive - INFO - 0 simulations in progress\n",
      "2024-06-11 16:49:06,742 - htmd.adaptive.adaptive - INFO - Reached maximum number of epochs 3\n"
     ]
    }
   ],
   "source": [
    "ad.run()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## AdaptiveGoal"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's change to the `adaptivegoal` directory and work there instead:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "os.chdir('../adaptivegoal')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "* Most of the class arguments are identical to AdaptiveMD"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "adg = AdaptiveGoal()\n",
    "adg.app = queue\n",
    "adg.nmin = 1\n",
    "adg.nmax = 3\n",
    "adg.nepochs = 2\n",
    "adg.generatorspath = './generators'\n",
    "adg.projection = MetricSelfDistance('protein and name CA')\n",
    "adg.updateperiod = 120  # execute every 2 minutes\n",
    "adg.goalfunction = None  # set to None just as an example"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "* It requires the `goalfunction` argument which defines a goal\n",
    "* We can define a variety of different goal functions"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## The goal function"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The goal function will:\n",
    "* take as input a `Molecule` object of a simulation and \n",
    "* produce as output a score for each frame of that simulation. \n",
    "* The higher the score, the more desirable that simulation frame for being respawned."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### RMSD goal function"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For this goal function, we will use a crystal structure of NTL9.\n",
    "\n",
    "You can download the structure from the following link and save it on the `adaptivegoal` directory:\n",
    "\n",
    "* [NTL9 crystal structure](http://pub.htmd.org/tutorials/adaptive-sampling/ntl9_crystal.pdb).\n",
    "\n",
    "Alternatively, you can download the structure using `wget`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "assert os.system('wget -q http://pub.htmd.org/tutorials/adaptive-sampling/ntl9_crystal.pdb') == 0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "We can define a simple goal function that uses the RMSD between the conformation sampled and a reference (in this case, the crystal structure), and returns a score to be evaluated by the `AdaptiveGoal` algorithm:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "ref = Molecule('./ntl9_crystal.pdb')\n",
    "\n",
    "def mygoalfunction(mol):\n",
    "    rmsd = MetricRmsd(ref, 'protein and name CA').project(mol)\n",
    "    return -rmsd  # or even 1/rmsd\n",
    "\n",
    "adg.goalfunction = mygoalfunction"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`AdaptiveGoal` ranks conformations from a high to low score. For the case of RMSD, since we want lower RMSD to give higher score, the symetric value is returned instead (the inverse would also work)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Launch the `AdaptiveGoal` run:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2024-06-11 16:49:07,264 - htmd.adaptive.adaptive - INFO - Processing epoch 0\n",
      "2024-06-11 16:49:07,265 - htmd.adaptive.adaptive - INFO - Epoch 0, generating first batch\n",
      "2024-06-11 16:49:07,286 - jobqueues.localqueue - INFO - Queueing /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e1s1_ntl9_1ns_0\n",
      "2024-06-11 16:49:07,287 - jobqueues.localqueue - INFO - Queueing /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e1s2_ntl9_1ns_1\n",
      "2024-06-11 16:49:07,287 - jobqueues.localqueue - INFO - Running /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e1s1_ntl9_1ns_0 on device 0\n",
      "2024-06-11 16:49:07,288 - jobqueues.localqueue - INFO - Queueing /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e1s3_ntl9_1ns_2\n",
      "2024-06-11 16:49:07,289 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.\n",
      "2024-06-11 16:51:07,308 - htmd.adaptive.adaptive - INFO - Processing epoch 1\n",
      "2024-06-11 16:51:07,309 - htmd.adaptive.adaptive - INFO - Retrieving simulations.\n",
      "2024-06-11 16:51:07,309 - htmd.adaptive.adaptive - INFO - 3 simulations in progress\n",
      "2024-06-11 16:51:07,310 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.\n",
      "2024-06-11 16:51:18,374 - jobqueues.localqueue - INFO - Completed /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e1s1_ntl9_1ns_0\n",
      "2024-06-11 16:51:18,375 - jobqueues.localqueue - INFO - Running /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e1s2_ntl9_1ns_1 on device 0\n",
      "2024-06-11 16:53:07,408 - htmd.adaptive.adaptive - INFO - Processing epoch 1\n",
      "2024-06-11 16:53:07,409 - htmd.adaptive.adaptive - INFO - Retrieving simulations.\n",
      "2024-06-11 16:53:07,409 - htmd.adaptive.adaptive - INFO - 2 simulations in progress\n",
      "2024-06-11 16:53:07,410 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.\n",
      "2024-06-11 16:53:28,138 - jobqueues.localqueue - INFO - Completed /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e1s2_ntl9_1ns_1\n",
      "2024-06-11 16:53:28,139 - jobqueues.localqueue - INFO - Running /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e1s3_ntl9_1ns_2 on device 0\n",
      "2024-06-11 16:55:07,508 - htmd.adaptive.adaptive - INFO - Processing epoch 1\n",
      "2024-06-11 16:55:07,509 - htmd.adaptive.adaptive - INFO - Retrieving simulations.\n",
      "2024-06-11 16:55:07,510 - htmd.adaptive.adaptive - INFO - 1 simulations in progress\n",
      "2024-06-11 16:55:07,510 - htmd.adaptive.adaptiverun - INFO - Postprocessing new data\n",
      "Creating simlist: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 3/3 [00:00<00:00, 1872.46it/s]\n",
      "Filtering trajectories: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 2/2 [00:00<00:00,  3.81it/s]\n",
      "Projecting trajectories: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 2/2 [00:00<00:00, 409.90it/s]\n",
      "2024-06-11 16:55:08,491 - htmd.projections.metric - INFO - Frame step 0.1ns was read from the trajectories. If it looks wrong, redefine it by manually setting the MetricData.fstep property.\n",
      "2024-06-11 16:55:08,492 - htmd.metricdata - INFO - Dropped 0 trajectories from 2 resulting in 2\n",
      "2024-06-11 16:55:08,976 - htmd.metricdata - INFO - Dropped 0 trajectories from 2 resulting in 2\n",
      "Projecting trajectories: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 2/2 [00:00<00:00, 37.06it/s]\n",
      "2024-06-11 16:55:09,158 - htmd.projections.metric - INFO - Frame step 0.1ns was read from the trajectories. If it looks wrong, redefine it by manually setting the MetricData.fstep property.\n",
      "2024-06-11 16:55:09,160 - htmd.adaptive.adaptiverun - WARNING - Using less macrostates than requested due to lack of microstates. macronum = 3\n",
      "2024-06-11 16:55:09,299 - htmd.model - WARNING - PCCA returned empty macrostates. Reducing the number of macrostates to 2.\n",
      "2024-06-11 16:55:09,301 - htmd.model - INFO - 100.0% of the data was used\n",
      "2024-06-11 16:55:09,302 - htmd.model - INFO - Number of trajectories that visited each macrostate:\n",
      "2024-06-11 16:55:09,302 - htmd.model - INFO - [2 2]\n",
      "Writing inputs: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 2/2 [00:00<00:00,  6.22it/s]\n",
      "2024-06-11 16:55:09,630 - jobqueues.localqueue - INFO - Queueing /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e2s1_e1s2p0f3\n",
      "2024-06-11 16:55:09,630 - jobqueues.localqueue - INFO - Queueing /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e2s2_e1s2p0f4\n",
      "2024-06-11 16:55:09,631 - htmd.adaptive.adaptive - INFO - Finished submitting simulations.\n",
      "2024-06-11 16:55:09,631 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.\n",
      "2024-06-11 16:55:37,450 - jobqueues.localqueue - INFO - Completed /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e1s3_ntl9_1ns_2\n",
      "2024-06-11 16:55:37,451 - jobqueues.localqueue - INFO - Running /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e2s1_e1s2p0f3 on device 0\n",
      "2024-06-11 16:57:09,729 - htmd.adaptive.adaptive - INFO - Processing epoch 2\n",
      "2024-06-11 16:57:09,730 - htmd.adaptive.adaptive - INFO - Retrieving simulations.\n",
      "2024-06-11 16:57:09,731 - htmd.adaptive.adaptive - INFO - 2 simulations in progress\n",
      "2024-06-11 16:57:09,731 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.\n",
      "2024-06-11 16:57:48,099 - jobqueues.localqueue - INFO - Completed /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e2s1_e1s2p0f3\n",
      "2024-06-11 16:57:48,099 - jobqueues.localqueue - INFO - Running /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e2s2_e1s2p0f4 on device 0\n",
      "2024-06-11 16:59:09,829 - htmd.adaptive.adaptive - INFO - Processing epoch 2\n",
      "2024-06-11 16:59:09,830 - htmd.adaptive.adaptive - INFO - Retrieving simulations.\n",
      "2024-06-11 16:59:09,831 - htmd.adaptive.adaptive - INFO - 1 simulations in progress\n",
      "2024-06-11 16:59:09,832 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.\n",
      "2024-06-11 16:59:58,223 - jobqueues.localqueue - INFO - Completed /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e2s2_e1s2p0f4\n",
      "2024-06-11 17:01:09,870 - htmd.adaptive.adaptive - INFO - Processing epoch 2\n",
      "2024-06-11 17:01:09,878 - htmd.adaptive.adaptive - INFO - Retrieving simulations.\n",
      "2024-06-11 17:01:09,879 - htmd.adaptive.adaptive - INFO - 0 simulations in progress\n",
      "2024-06-11 17:01:09,880 - htmd.adaptive.adaptive - INFO - Reached maximum number of epochs 2\n"
     ]
    }
   ],
   "source": [
    "adg.run()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Functions with multiple arguments"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The goal function can also take multiple arguments. This allows flexibility and on-the-fly comparisons to non-static conformations (i.e. compare with different references as the run progresses). Here, we redefine the previous goal function with multiple arguments:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "def newgoalfunction(mol, crystal):\n",
    "    rmsd = MetricRmsd(crystal, 'protein and name CA').project(mol)\n",
    "    return -rmsd  # or even 1/rmsd"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Now we clean the previous `AdaptiveGoal` run, and start a new one with the new goal function:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "2024-06-11 17:01:09,923 - htmd.adaptive.adaptive - INFO - Processing epoch 0\n",
      "2024-06-11 17:01:09,924 - htmd.adaptive.adaptive - INFO - Epoch 0, generating first batch\n",
      "2024-06-11 17:01:09,935 - jobqueues.localqueue - INFO - Queueing /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e1s1_ntl9_1ns_0\n",
      "2024-06-11 17:01:09,935 - jobqueues.localqueue - INFO - Queueing /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e1s2_ntl9_1ns_1\n",
      "2024-06-11 17:01:09,935 - jobqueues.localqueue - INFO - Running /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e1s1_ntl9_1ns_0 on device 0\n",
      "2024-06-11 17:01:09,936 - jobqueues.localqueue - INFO - Queueing /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e1s3_ntl9_1ns_2\n",
      "2024-06-11 17:01:09,936 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.\n",
      "2024-06-11 17:03:10,040 - htmd.adaptive.adaptive - INFO - Processing epoch 1\n",
      "2024-06-11 17:03:10,049 - htmd.adaptive.adaptive - INFO - Retrieving simulations.\n",
      "2024-06-11 17:03:10,050 - htmd.adaptive.adaptive - INFO - 3 simulations in progress\n",
      "2024-06-11 17:03:10,050 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.\n",
      "2024-06-11 17:03:39,489 - jobqueues.localqueue - INFO - Completed /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e1s1_ntl9_1ns_0\n",
      "2024-06-11 17:03:39,495 - jobqueues.localqueue - INFO - Running /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e1s2_ntl9_1ns_1 on device 0\n",
      "2024-06-11 17:05:10,158 - htmd.adaptive.adaptive - INFO - Processing epoch 1\n",
      "2024-06-11 17:05:10,161 - htmd.adaptive.adaptive - INFO - Retrieving simulations.\n",
      "2024-06-11 17:05:10,161 - htmd.adaptive.adaptive - INFO - 2 simulations in progress\n",
      "2024-06-11 17:05:10,161 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.\n",
      "2024-06-11 17:05:55,679 - jobqueues.localqueue - INFO - Completed /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e1s2_ntl9_1ns_1\n",
      "2024-06-11 17:05:55,680 - jobqueues.localqueue - INFO - Running /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e1s3_ntl9_1ns_2 on device 0\n",
      "2024-06-11 17:07:10,260 - htmd.adaptive.adaptive - INFO - Processing epoch 1\n",
      "2024-06-11 17:07:10,261 - htmd.adaptive.adaptive - INFO - Retrieving simulations.\n",
      "2024-06-11 17:07:10,262 - htmd.adaptive.adaptive - INFO - 1 simulations in progress\n",
      "2024-06-11 17:07:10,262 - htmd.adaptive.adaptiverun - INFO - Postprocessing new data\n",
      "Creating simlist: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 3/3 [00:00<00:00, 889.50it/s]\n",
      "Filtering trajectories: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 2/2 [00:00<00:00,  3.99it/s]\n",
      "Projecting trajectories: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 2/2 [00:00<00:00, 438.14it/s]\n",
      "2024-06-11 17:07:11,184 - htmd.projections.metric - INFO - Frame step 0.1ns was read from the trajectories. If it looks wrong, redefine it by manually setting the MetricData.fstep property.\n",
      "2024-06-11 17:07:11,185 - htmd.metricdata - INFO - Dropped 0 trajectories from 2 resulting in 2\n",
      "2024-06-11 17:07:11,757 - htmd.metricdata - INFO - Dropped 0 trajectories from 2 resulting in 2\n",
      "Projecting trajectories: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 2/2 [00:00<00:00, 30.17it/s]\n",
      "2024-06-11 17:07:11,857 - htmd.projections.metric - INFO - Frame step 0.1ns was read from the trajectories. If it looks wrong, redefine it by manually setting the MetricData.fstep property.\n",
      "2024-06-11 17:07:11,859 - htmd.adaptive.adaptiverun - WARNING - Using less macrostates than requested due to lack of microstates. macronum = 3\n",
      "2024-06-11 17:07:12,042 - htmd.model - INFO - 95.0% of the data was used\n",
      "2024-06-11 17:07:12,043 - htmd.model - INFO - Number of trajectories that visited each macrostate:\n",
      "2024-06-11 17:07:12,043 - htmd.model - INFO - [2 2]\n",
      "Writing inputs: 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 2/2 [00:00<00:00,  5.80it/s]\n",
      "2024-06-11 17:07:12,394 - jobqueues.localqueue - INFO - Queueing /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e2s1_e1s1p0f5\n",
      "2024-06-11 17:07:12,395 - jobqueues.localqueue - INFO - Queueing /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e2s2_e1s1p0f5\n",
      "2024-06-11 17:07:12,395 - htmd.adaptive.adaptive - INFO - Finished submitting simulations.\n",
      "2024-06-11 17:07:12,395 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.\n",
      "2024-06-11 17:08:06,459 - jobqueues.localqueue - INFO - Completed /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e1s3_ntl9_1ns_2\n",
      "2024-06-11 17:08:06,460 - jobqueues.localqueue - INFO - Running /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e2s1_e1s1p0f5 on device 0\n",
      "2024-06-11 17:09:12,492 - htmd.adaptive.adaptive - INFO - Processing epoch 2\n",
      "2024-06-11 17:09:12,493 - htmd.adaptive.adaptive - INFO - Retrieving simulations.\n",
      "2024-06-11 17:09:12,493 - htmd.adaptive.adaptive - INFO - 2 simulations in progress\n",
      "2024-06-11 17:09:12,494 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.\n",
      "2024-06-11 17:10:16,325 - jobqueues.localqueue - INFO - Completed /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e2s1_e1s1p0f5\n",
      "2024-06-11 17:10:16,325 - jobqueues.localqueue - INFO - Running /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e2s2_e1s1p0f5 on device 0\n",
      "2024-06-11 17:11:12,593 - htmd.adaptive.adaptive - INFO - Processing epoch 2\n",
      "2024-06-11 17:11:12,595 - htmd.adaptive.adaptive - INFO - Retrieving simulations.\n",
      "2024-06-11 17:11:12,595 - htmd.adaptive.adaptive - INFO - 1 simulations in progress\n",
      "2024-06-11 17:11:12,596 - htmd.adaptive.adaptive - INFO - Sleeping for 120 seconds.\n",
      "2024-06-11 17:12:26,900 - jobqueues.localqueue - INFO - Completed /home/sdoerr/Work/htmd/tutorials/adaptivegoal/input/e2s2_e1s1p0f5\n",
      "2024-06-11 17:13:12,625 - htmd.adaptive.adaptive - INFO - Processing epoch 2\n",
      "2024-06-11 17:13:12,627 - htmd.adaptive.adaptive - INFO - Retrieving simulations.\n",
      "2024-06-11 17:13:12,628 - htmd.adaptive.adaptive - INFO - 0 simulations in progress\n",
      "2024-06-11 17:13:12,629 - htmd.adaptive.adaptive - INFO - Reached maximum number of epochs 2\n"
     ]
    }
   ],
   "source": [
    "# clean previous run\n",
    "shutil.rmtree('./input')\n",
    "shutil.rmtree('./data')\n",
    "shutil.rmtree('./filtered')\n",
    "\n",
    "# run with new goal\n",
    "ref = Molecule('./ntl9_crystal.pdb')\n",
    "adg.goalfunction = (newgoalfunction, (ref,))\n",
    "adg.run()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Other goal function examples"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "HTMD includes other two goal functions: The secondary structure goal function and the contacts goal function."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "#### Secondary structure goal function"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [],
   "source": [
    "ref = Molecule('./ntl9_crystal.pdb')\n",
    "\n",
    "def ssGoal(mol, crystal):\n",
    "    crystalSS = MetricSecondaryStructure().project(crystal)[0]\n",
    "    proj = MetricSecondaryStructure().project(mol)\n",
    "    # How many crystal SS match with simulation SS\n",
    "    ss_score = np.sum(proj == crystalSS, axis=1) / proj.shape[1]  \n",
    "    return ss_score\n",
    "\n",
    "adg.goalfunction = (ssGoal, (ref,))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "#### Contacts goal function"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [],
   "source": [
    "ref = Molecule('./ntl9_crystal.pdb')\n",
    "\n",
    "def contactGoal(mol, crystal):\n",
    "    crystalCO = MetricSelfDistance('protein and name CA', pbc=False,\n",
    "                                   metric='contacts', \n",
    "                                   threshold=10).project(crystal)\n",
    "    proj = MetricSelfDistance('protein and name CA', \n",
    "                              metric='contacts', \n",
    "                              threshold=10).project(mol)\n",
    "    # How many crystal contacts are seen?\n",
    "    co_score = np.sum(proj[:, crystalCO] == 1, axis=1)\n",
    "    co_score /= np.sum(crystalCO)\n",
    "    return ss_score\n",
    "\n",
    "adg.goalfunction = (contactGoal, (ref,))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Many more goal functions can be devised."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "celltoolbar": "Slideshow",
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}