Getting started with HTMD

Assuming that you have already downloaded and installed HTMD, this tutorial introduces you to the software, and especially to the Molecule class, whose features serve as a good introduction to the more complex parts of HTMD.

Let’s get started! The first thing you will have to get familiar with in HTMD is the Molecule class.

Molecule objects:

  • store structural information on molecules;
  • are not limited to a single molecule;
  • can contain a whole system including water, ions, proteins, ligands, lipids, etc., much as in VMD (a visualization program that HTMD also uses).

Think of these objects as containers of structural information.

First, we need to import HTMD, so that any class and function defined by HTMD is available in the workspace.

In HTMD, there are several submodules, and an easier way to import the most important ones is to use the following:

from htmd.ui import *
Please cite HTMD: Doerr et al.(2016)JCTC,12,1845.
https://dx.doi.org/10.1021/acs.jctc.6b00049
Documentation: http://software.acellera.com/
To update: conda update htmd -c acellera -c psi4

You are on the latest HTMD version (unpackaged : /home/joao/maindisk/software/repos/Acellera/htmd/htmd).

Reading files

The Molecule class provides file readers for various structure formats such as PDB, PRMTOP, PSF, GRO, MOL2, MAE and more. It can also read various MD trajectory and coordinate formats, including XTC, DCD, COOR, CRD, TRR and XYZ. The method for reading files is Molecule.read(). You can also pass a file name, or a PDB ID such as 3PTB, to the class constructor, which then calls read() automatically. Let’s see an example:

mol = Molecule('3PTB')
2018-03-08 13:39:38,044 - htmd.molecule.readers - INFO - Using local copy for 3PTB: /home/joao/maindisk/software/repos/Acellera/htmd/htmd/data/pdb/3ptb.pdb
2018-03-08 13:39:38,221 - moleculekit.molecule - WARNING - Residue insertions were detected in the Molecule. It is recommended to renumber the residues using the Molecule.renumberResidues() method.

or just use a local file:

mol = Molecule('yourprotein.pdb')

PDB files contain both atom information and coordinates. Some other formats separate the atom information from the coordinates. In that case you can, for example, read the atom information from a PSF file first and then read the atom coordinates with the read method of Molecule, as in the next example. The order does not matter: you could equally create the Molecule from the XTC file and then read the PSF.

mol = Molecule('yourstructure.psf')
mol.read('yourtrajectory.xtc')

Writing files

The Molecule class also provides file writers for multiple formats using the Molecule.write() method.

mol.write('yourtrajectory.dcd')
mol.write('yourstructure.prmtop')

Looking inside a Molecule

Printing the Molecule object shows its properties:

print(mol)
Molecule with 1701 atoms and 1 frames
Atom field - altloc shape: (1701,)
Atom field - atomtype shape: (1701,)
Atom field - beta shape: (1701,)
Atom field - chain shape: (1701,)
Atom field - charge shape: (1701,)
Atom field - coords shape: (1701, 3, 1)
Atom field - element shape: (1701,)
Atom field - insertion shape: (1701,)
Atom field - masses shape: (1701,)
Atom field - name shape: (1701,)
Atom field - occupancy shape: (1701,)
Atom field - record shape: (1701,)
Atom field - resid shape: (1701,)
Atom field - resname shape: (1701,)
Atom field - segid shape: (1701,)
Atom field - serial shape: (1701,)
angles shape: (0, 3)
bonds shape: (42, 2)
bondtype shape: (42,)
box shape: (3, 1)
boxangles shape: (3, 1)
crystalinfo: {'a': 54.890000000000001, 'b': 58.520000000000003, 'c': 67.629999999999995, 'alpha': 90.0, 'beta': 90.0, 'gamma': 90.0, 'sGroup': ['P', '21', '21', '21'], 'z': 4, 'numcopies': 4, 'rotations': array([[[ 1.,  0.,  0.],
        [ 0.,  1.,  0.],
        [ 0.,  0.,  1.]],

       [[-1.,  0.,  0.],
        [ 0., -1.,  0.],
        [ 0.,  0.,  1.]],

       [[-1.,  0.,  0.],
        [ 0.,  1.,  0.],
        [ 0.,  0., -1.]],

       [[ 1.,  0.,  0.],
        [ 0., -1.,  0.],
        [ 0.,  0., -1.]]]), 'translations': array([[  0.   ,   0.   ,   0.   ],
       [ 27.445,   0.   ,  33.815],
       [  0.   ,  29.26 ,  33.815],
       [ 27.445,  29.26 ,   0.   ]])}
dihedrals shape: (0, 4)
fileloc shape: (1, 2)
impropers shape: (0, 4)
reps:
ssbonds shape: (0,)
step shape: (1,)
time shape: (1,)
topoloc: 3PTB
viewname: 3PTB
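
The shapes in this listing follow one convention: each per-atom field holds one value per atom, and coords is indexed as (atom, xyz, frame). The following is a minimal NumPy sketch of that layout, using plain arrays with made-up values for a hypothetical 4-atom, 2-frame system. It does not use HTMD itself, and the variable names are chosen only for illustration.

```python
import numpy as np

n_atoms, n_frames = 4, 2

# Per-atom fields: one entry per atom, shape (n_atoms,)
name = np.array(["N", "CA", "C", "O"], dtype=object)
resid = np.array([16, 16, 16, 16])

# Coordinates: shape (n_atoms, 3, n_frames), i.e. atom, (x, y, z), frame
coords = np.zeros((n_atoms, 3, n_frames), dtype=np.float32)
coords[:, :, 1] = 1.5  # second frame: place every atom at (1.5, 1.5, 1.5)

first_frame = coords[:, :, 0]   # all atoms in frame 0, shape (n_atoms, 3)
atom0_traj = coords[0, :, :].T  # one atom across frames, shape (n_frames, 3)

print(name.shape, coords.shape, first_frame.shape, atom0_traj.shape)
```

Keeping the frame index last means a single-frame structure such as the one printed above simply has a trailing dimension of 1.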

Properties and methods of Molecule objects

Each Molecule object has a number of properties (data associated with the molecule) and methods (operations that you can perform on the molecule). Some of the properties correspond to data usually found in PDB files.

Properties    Methods
----------    ------------
record        read( )
serial        write( )
name          get( )
resname       set( )
chain         atomselect( )
resid         copy( )
segid         filter( )
coords        append( )
box           insert( )
reps          view( )
              moveBy( )
              rotateBy( )
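
Several of the properties listed above mirror the fixed-width columns of an ATOM record in the PDB format. As an illustration of that correspondence, here is a small standard-library sketch that slices one ATOM line into those fields. The helper parse_atom_line and the example atom are hypothetical and are not part of HTMD, whose own readers handle many more cases.

```python
# 1-based, inclusive column ranges of an ATOM record in the PDB format,
# keyed by the Molecule property each one corresponds to.
PDB_COLUMNS = {
    "record": (1, 6),
    "serial": (7, 11),
    "name": (13, 16),
    "resname": (18, 20),
    "chain": (22, 22),
    "resid": (23, 26),
}
COORD_COLUMNS = [(31, 38), (39, 46), (47, 54)]  # x, y, z


def parse_atom_line(line):
    """Split one ATOM line into named fields plus an (x, y, z) tuple."""
    atom = {
        key: line[start - 1:end].strip()
        for key, (start, end) in PDB_COLUMNS.items()
    }
    atom["serial"] = int(atom["serial"])
    atom["resid"] = int(atom["resid"])
    atom["coords"] = tuple(
        float(line[start - 1:end]) for start, end in COORD_COLUMNS
    )
    return atom


# Hypothetical example atom, formatted into the fixed-width layout.
line = "{:<6}{:>5} {:<4}{:1}{:>3} {:1}{:>4}{:1}   {:8.3f}{:8.3f}{:8.3f}".format(
    "ATOM", 1, " N", "", "ILE", "A", 16, "", -8.155, 9.648, 20.365
)
atom = parse_atom_line(line)
print(atom)
```

In a Molecule, the same information is stored column-wise: one array per property, with one entry per atom.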

Properties can be accessed,

  • either directly:

mol.serial
array([   1,    2,    3, ..., 1700, 1701, 1702])

or,

  • via the Molecule.get method:

mol.get('serial')
array([   1,    2,    3, ..., 1700, 1701, 1702])

To get help on a particular method of the Molecule class, use Python's built-in help function:

help(Molecule.get)
Help on function get in module moleculekit.molecule:

get(self, field, sel=None)
    Retrieve a specific PDB field based on the selection

    Parameters
    ----------
    field : str
        The PDB field we want to get
    sel : str
        Atom selection string for which atoms we want to get the field from. Default all.
        See more here

    Returns
    ------
    vals : np.ndarray
        Array of values of field for all atoms in the selection.

    Examples
    --------
    >>> mol=tryp.copy()
    >>> mol.get('resname')
    array(['ILE', 'ILE', 'ILE', ..., 'HOH', 'HOH', 'HOH'], dtype=object)
    >>> mol.get('resname', sel='resid 158')
    array(['LEU', 'LEU', 'LEU', 'LEU', 'LEU', 'LEU', 'LEU', 'LEU'], dtype=object)
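
Conceptually, a selection such as sel='resid 158' picks the atoms whose fields match, and get returns the requested field for those atoms only. The sketch below mimics this with a NumPy boolean mask over toy arrays. It only illustrates the idea, assuming made-up values, and is not how HTMD parses or evaluates selection strings.

```python
import numpy as np

# Toy per-atom fields for a hypothetical 5-atom system
resname = np.array(["ILE", "ILE", "LEU", "LEU", "HOH"], dtype=object)
resid = np.array([16, 16, 158, 158, 701])

# Rough equivalent of the selection 'resid 158': True for matching atoms
mask = resid == 158

# Rough equivalent of get('resname', sel='resid 158')
selected = resname[mask]
print(selected)
```

Because every per-atom field has the same length and order, one mask can be reused to pull matching values out of any of them.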