Analyzing the results!

What's in a trajectory and how to get at it...

The purpose of this tutorial is to provide an example of how one goes about analyzing the results from a molecular dynamics run using some of the tools supplied within AMBER and some available through other sources. Particular emphasis is placed on analyzing molecular dynamics trajectories that need to be "imaged" after running a particle mesh Ewald calculation; as you work through the demo, you will see why this is necessary...

As part of the analysis, we will feature small aspects of a series of programs. The highlighted "links" below will let you jump to particular sections in the tutorial...

To support the analysis, trajectories of the decamer poly(A)-poly(T) DNA structure we built and then equilibrated in previous sections of this tutorial are supplied. Rather than starting in a canonical B-DNA structure, this was originally started in a canonical A-DNA geometry. During the dynamics, it spontaneously (after ~200-500 ps) converted to an B-form geometry [See Cheatham & Kollman, 1996. J. Mol. Biol. 259, 434-444. for a discussion of this with a different sequence]. What you will be looking at here is the result of ~16 hours of running on the Cray T3D at PSC; the trajectory represents ~95 ps of a particle mesh Ewald trajectory from 1775 ps to ~1870 ps. Here are the files you will be looking at.

Calculating the RMS deviation to a starting structure

carnal is useful for calculating the RMS deviation of each frame in our trajectory to the first frame in the trajectory. This is useful to see how far we have drifted during the dynamics. We create a carnal input carnal_rms_start.in as follows:
FILES_IN
  PARM p1 atA.topo;
  STREAM s1 atA.traj.1775_1870ps;
FILES_OUT
  TABLE tab1 rms_to_start;
DECLARE
  GROUP gALL (RES 1 - 20);
  RMS r1 FIT gALL s1;
OUTPUT
  TABLE tab1 r1;
END
In the input file, we specifed our parm file and trajectory as input and the file rms_to_start as output. After carnal is run, this file will contain the "TABLE" of "RMS" (r1) values. In the "DECLARE" section we specified a "group" representing the atoms we would like to RMS fit. We are going to use all the DNA atoms, or residues 1 through 20. The key to this run of carnal is the RMS command which says to perform the RMS fit. [Note that you cannot have "-" in file names; this breaks carnal.] To run this example:
%  carnal < carnal_rms_start.in
This will generate the output as follows:

                          - CARNAL -
                          AMBER  4.1

                      COORDINATE  ANALYSIS


input stdin
> FILES_IN
>   PARM p1 atA.topo;
Reading parm file (atA.topo)
parm: opening atA.topo
atA.topo title:
PDB 10mer AT
>   STREAM s1 atA.traj.1775_1870ps;
Using default parm (atA.topo) for STREAM s1
 (box will be read from stream)
stream: opening atA.traj.1775_1870ps
> FILES_OUT
>   TABLE tab1 rms_to_start;
> DECLARE
>   GROUP gALL (RES 1 - 20);
** Group gALL: 638 atoms
>   RMS r1 FIT gALL s1;
RMS: positioning to initial set of stream
> OUTPUT
>   TABLE tab1 r1;
> END
(EOF reached on atA.traj.1775_1870ps)
STREAM s1: no more files/crds (iteration 95)
SUMMARY
--RMS r1: avg      16.947  dev     7.014  max   20.204  min    0.568
You will note that the RMSd reported above is tremendously high and ranges from 0.568 to 20.204. This seems like something weird is going on. Before getting into the details, let's first look at a graph of the data which will shed more insight into the time course of the deviation (i.e. is it moving uniformly away from the starting structure?) and then we will look at a movie of the trajectory.

It should be noted that the output file generated by carnal (rms_to_start) has a strange format and cannot be directly imported into graphing programs, such as xmgr. The top of the file looks like this:


L0 0.000000e+00
L0 5.675537e-01
L0 5.903252e-01
L0 5.699365e-01
L0 6.835873e-01
After running carnal, the first step is to process the output file into a form suitable for other programs. To support this goal, I developed a fairly general perl script to convert this to a format where the first column represents the time and the remaining columns represent the carnal data. This script, process_carnal.perl reads the carnal input from standard input and creates a file called "1.data". If you give an argument to this script, it will use this as the filename.

Advanced feature: If you want to specify the starting time, set the environment variable START_TIME. If you want to specify the interval between frames, set the environment variable INTERVAL.

% setenv START_TIME 0.0
% setenv INTERVAL 1.0
By default the START_TIME is zero and the INTERVAL is 1.0.
% process_carnal.perl rms_to_start.xmgr < rms_to_start

After running this, we can look at it with xmgr. xmgr a very general and powerful, mouse driven and fairly self-explanatory 2D graphing program. During my CTC session, I will give a brief tutorial on this program (if there is interest). It is available from ...

% xmgr rms_to_start.xmgr
It will look something like this:

(picture of graph of rms vs time, etc\)

Looking at this, you see that the RMS jumps from low values to near 20 angstroms and back in nearly 1 step. This is a sign of something seriously wrong with the trajectory since molecules cannot reasonably move 20 angstroms in 1 picosecond...

Looking at a movie of the trajectory

Looking at a movie will give a better idea of why the RMS jumps around. This is a sign of the imaging issue with the particle mesh Ewald code I brought up in my lecture on solvation and long ranged electrostatics.

Using moil-view to look at a movie

Moil-view is a freely available program for looking at Moil, AMBER and pdb files. In addition to displaying structures, one can also look at movies of a trajectory and perform some rudimentary trajectory analysis. For more information about it, see the WWW page maintained by the author, Carlos Simmerling.

This program is nice since not only is it freely available, but the viewing options are very nice. Models can be viewed as solid spheres, lines, mixed spheres and line, ball and cylinder, etc. The program is also set up to recognize AMBER. The drawback is that the user doesn't have much control over the movie.

To view a movie of this trajectory, start up moil-view

% moil-view
With the right button, chose the "file" menu and "Read 1st coordinates". Select "AMBER" format and then specify the file "atA.coords" at the dialog. When asked, say that "no", you do not want to use bonds as connectivity, but "yes" you would like to load a connectivity file. Again specify the "AMBER" format and load up "atA.topo". The structure should appear.

To look at a movie of the trajectory, use the right mouse button and chose the "Dynamics" menu and "View trajectory" submenu. The trajectory we are looking at was run with water, so chose "AMBER PBC" as the type and load up "atA.traj.1775_1870ps". Input the start and end frame for the next dialog that pops up...

To stop the dynamics, hit the right button.

Looking at the movie, it is very hard to see anything since all that water is in there. So, we could temporarily hide the water by going into the main menu (right button), drag "Color options" and choose "No waters". Additionally, those ions are kind of hard to visualize, so lets make them solid spheres. Chose main menu "Objects are" and submenu "Mix lines/spheres" which will pop up a dialog box which will prompt you for a selection of atoms which you want as spheres. We will choose all the atoms named NA+. The selection format is kind of tricky; turn help on for more information... Here is what it will look like and what the pick string looks like:

(moil-view pick format)

This will display our structure as follows (for one of the frames)...

(moil-view pick format)

Looking at the movie, the imaging problem is readily apparent. The DNA molecule is split across the box.

Using MD_display to look at a movie

Terry Lybrand's MD_display is actually two programs. preproc preprocesses the trajectory and display does the actual movie. Although display doesn't have all the fancy viewing features of moil-view, it does give the user control over the view of frames and can be used in stereo mode.

To preprocess the trajectory:

% preproc -c atA.traj.1775_1870ps -p atA.topo -n movie -f 95 -X
When running with trajectories with water, you need to specify the "-X" option. The "-n" gives the root name of the files to be created.

Here is an example loading 5 frames...

MD preprocessor version 1.1b
Please wait while the data is loaded.
   0.90000000E+02  0.61317974E+02  0.41857475E+02  0.41857475E+02

Box dimensions: 61.32  41.86  41.86
Constant pressure box detected...
Allocating 277380 bytes of memory for 9245 atoms, box, 5 frames
 0
 1
 2
 3
 4
5 frames read.

Writing bond file.
Writing coord file.
Done.
Now we are ready to view the trajectory. Except for one thing. One difficulty with this program is that it takes over the entire screen. This is a little annoying, since to terminate the program, you need to confirm exit by typing into the window where you started the program from. However, you can't get there since the window is obscured by the display program... Therefore, you need to use the special SGI key command to pop the current window into the background, "Alt-F3". "Alt-F1" pops a window to the front. If you aren't running the 4Dwm, or changed the button binding's, you are on your own!
% display -n movie
As you start this up, this will prompt you for a variety of information. If you are lazy, just keep hitting return until the window pops up. The startup text looks like this:





                   MD DISPLAY

                       by

            T. Callahan & T. Lybrand



Please wait while the data is loaded.
--loading connectivity information--
55662 bytes read.

--loading atom attributes--
170058 bytes read.
9275 bonds, 0 hydrogen bonds.


----Initial coloring of residues & atoms----

         0 - invisible (remove)
         1 - white
         2 - red
         3 - blue
         4 - aqua
         5 - green
         6 - magenta
         7 - orange
         8 - yellow
         9 - gray
        98 - each molecule a different color
        99 - automatic color by atom type

Color (0-99, negative or  to continue) :
Since we've got water, let's at least color the water blue. Choose "3", hit return, and enter ":WAT" (which means select residues named water). Then hit return a few times...

To exit, chose the right button and select exit; remember that "Alt-F3" pops the window back so you can type your confirmation.

To control the speed and direction of the movie, use the command bar on the bottom left. The panel on the right controls a series of other features which will be displayed in the CTC session.

Movies with vmd

Looking at sander movies with vmd is relatively straightforward. From the "mol" menu, load up files selecting "parm and crd" as the type. Load up both the prmtop and trajectory, and off you go with the movie. The "animate" menu can control the progress.

Imaging the trajectory

Now that we've seen that the molecule is spread all over the box, we need to image the trajectory to bring the molecule back together. To do this, we center one of the strands (the first molecule) to the origin, "image" or move every molecule outside the box boundaries back into the box, then move both strands to the center and image again.

There are some issues however with imaging since this implicity will center the solute to the middle of the box. This will remove all information about the center of mass motion of the solute, hence will make it impossible to measure diffusion.

With the released code, there are a few options available, specifically using ptraj or carnal. The ptraj imaging is more general and support any triclinic unit cell so this will be the one we discuss. To do this, we create a file called imageit that has the necessary commands. trajin atA.traj.1775_1870ps trajout atA.traj.1775_1870ps.imaged center :1-10 mass origin image origin center center :1-20 mass origin image origin center

% ptraj atA.topo < imageit
The output will look something like this:
Opened file atA.topo with mode (r)
Opened parm file atA.topo
Scanned in title...
Read in the control variables...
Read in atom names...
Read in charges...
Read in masses...
Read in IAC (atoms involved in L-J)...
Read in NUMEX (index to excl atom list)
Read in NNO (index for nonbond of @type)
Read in residue labels...
 A5   A    A    A    A    A    A    A    A    A3
 T5   T    T    T    T    T    T    T    T    T3
 CIO  CIO  CIO  CIO  CIO  CIO  CIO  CIO  CIO  CIO
 CIO  CIO  CIO  CIO  CIO  CIO  CIO  CIO  WAT  WAT
 WAT  WAT  WAT  WAT  WAT  WAT  WAT  WAT  WAT  WAT
 ...
 WAT
Read in the pointer list of residues...
Read in bond parameters RK and REQ...
Read in angle parameters TK and TEQ...
Read in dihedral parameters PK, PN and PHASE...
Read in L-J parameters CN1 and CN2...
Read in info for bonds w/ hydrogen...
Read in info for bonds w/out hydrogen...
Read in info for angles w/ hydrogen...
Read in info for angles w/out hydrogen...
Read in info for dihedrals w/ hydrogen...
Read in info for dihedrals w/out hydrogen...
Read in excluded atom list...
Read in h-bond parameters, AG, BG, and HBCUT...
Read in ISYMBL, ITREE, JOIN, ROTAT...
Scanning Box
Read in box information...
Successfully completed readParm.

   MAIN MENU.  Please enter commands.  Use "?" or "help"
   for more info.  "exit" or "quit" to leave program...


MAIN MENU:
At the main menu prompt, type "transform atA.traj.1775_1870ps" and the program will prompt you to answer a few questions. Answer no to every question except "Do you want the box imaged?". After answering the questions, you will need to supply a name of a new trajectory file to write and also how many molecules are important for imaging. The latter question can be ignored (just hit return) or you can specify the number of important molecules, i.e. 2 with DNA. A sample session follows:
MAIN MENU:  transform atA.traj.1775_1870ps
Opened file atA.traj.1775_1870ps with mode (r)

   Transformation will place either the center of mass
   or the center of geometry at the origin.
   Do you want to translate to center of mass?  [yes] no

   Do you want the box imaged? [yes] yes

   Do you want principal axis alignment of the trajectory? [yes] no

   Do you want to perform RMS fitting? [no] no
   Attempting to transform AMBER trajectory file atA.traj.1775_1870ps
   Translating center of geometry to origin.

Output transformed trajectory to file: atA.traj.1775_1870ps.imaged
Opened file atA.traj.1775_1870ps.imaged with mode (w)
  Output is directed to file atA.traj.1775_1870ps.imaged

   How many solute molecules are important for imaging? 2

Set      1 ........Molecule        1 (  320 to   638) translated by     0     1     0 boxes
.Molecule        1 (  320 to   638) translated by     0     1     0 boxes
.Molecule        1 (  320 to   638) translated by     0     1     0 boxes
Note that the above command will write a new trajectory which is the same size as the old one, therefore this method is very space intensive. Since most of our processing doesn't need water, we will also demonstrate how to strip the water with rdparm. To do this, use the command "processtrajectory" or "processt".

MAIN MENU: processt atA.traj.1775_1870ps.imaged
This routine will read in a trajectory file,
perform simple manipulations, and write out new or
append to existing topology files.

Opened file atA.traj.1775_1870ps.imaged with mode (r)
What is the name of the trajectory file to output? atA.traj.1775_1870ps.strip
After processing, the trajectory will be dumped to file atA.traj.1775_1870ps.strip

Set      1 ................................................
Set     50 ..............................................

WARNING in readAmberTrajectory(): EOF detected or set #96 is corrupted
Successfully read in a total of    95 coordinate sets

Do you want to output every coordinate set? [no] yes

Do you want to strip the waters? [no] yes
Start striping after atom number: [656]

Striping all atoms after 656

Starting with structure 1, ending with 95,
taking every 1 structures
Rereading and writing the coordinates to the trajectory...

Set      1 ................................................
Set     50 .............................................

Using carnal to image

In prerelease versions of carnal you can use the "IMAGE" command to process a trajectory. [I am still trying to get this to work.] See carnal_rms_start_image.in.

Measuring atomic positional fluctuations with mdanal

Note that carnal can also calculate atomic positional fluctuations, however (and even though this code may be removed in a latter release) I am going to demonstrate the use of mdanal.

mdanal is released with AMBER and does a number of useful things, including calculating the atomic positional fluctuations which are directly related to the atomic B-factors.

We will demonstrate the use of the program with the trajectory stripped of the waters. The input file is mdanal.in. To reference the trajectory, it needs to be named "fort.12" for some archaic reason. Therefore, link the stripped trajectory to "fort.12"...

% ln -s atA.traj.1775_1870ps.strip fort.12
Then run the command...
%  mdanal -i mdanal.in \
        -o mdanal.out \
        -p invacuo.topo \
        -a mdanal.pdb \
        -g mdanal.pltfil

To look at the atomic positional fluctuations you can view mdanal.pltfil with xmgr. Unless this file is editted (to strip the top of the file), xmgr will complain, but it will still work...
% xmgr mdanal.pltfil
The large fluctuations on the right are the counterions. Under the "Graph" menu, choose "World Scaling" and you can change the scale of the y-axis to 3, hit "Accept" and hit "Close" and you will see a graph like this:

(graph of fluctuations)

The larger peaks at the left, middle and right (except for the counterions on the far right) represent the beginning and ends of the helices. The periodic peaks represent the backbone phosphates. Larger peaks mean more motion (the y-axis is atomic postional fluctuations in angstroms).

rdparm: parameter topology and trajectory analysis

rdparm, is a program supplied with the AMBER 4.1 suite that allows users to check parm files (i.e. look at the list of bonds, angles, etc) and perform some rudimentary functions, such as fixing waters added with the ADD water option of edit, delete certain bonds, angles or dihedrals, etc...

To start the program, users need to specify a AMBER parameter/topology file. We will look at the poly(A)-poly(t) in vacuo prmtop, invacuo.topo.

% rdparm invacuo.topo
Opened file invacuo.topo with mode (r)
Opened parm file invacuo.topo
Scanned in title...
Read in the control variables...
Read in atom names...
Read in charges...
Read in masses...
Read in IAC (atoms involved in L-J)...
Read in NUMEX (index to excl atom list)
Read in NNO (index for nonbond of @type)
Read in residue labels...
 A5   A    A    A    A    A    A    A    A    A3
 T5   T    T    T    T    T    T    T    T    T3
 CIO  CIO  CIO  CIO  CIO  CIO  CIO  CIO  CIO  CIO
 CIO  CIO  CIO  CIO  CIO  CIO  CIO  CIO
Read in the pointer list of residues...
Read in bond parameters RK and REQ...
Read in angle parameters TK and TEQ...
Read in dihedral parameters PK, PN and PHASE...
Read in L-J parameters CN1 and CN2...
Read in info for bonds w/ hydrogen...
Read in info for bonds w/out hydrogen...
Read in info for angles w/ hydrogen...
Read in info for angles w/out hydrogen...
Read in info for dihedrals w/ hydrogen...
Read in info for dihedrals w/out hydrogen...
Read in excluded atom list...
Read in h-bond parameters, AG, BG, and HBCUT...
Read in ISYMBL, ITREE, JOIN, ROTAT...
Successfully completed readParm.

   MAIN MENU.  Please enter commands.  Use "?" or "help"
   for more info.  "exit" or "quit" to leave program...


MAIN MENU:
To see the help menu, type help or "?"
MAIN MENU:  ?

   The following commands are currently available:

   help, ?
   atoms, printAtoms
   bonds, printBonds
   angles, printAngles
   dihedrals, printDihedrals
   pertBbonds, perturbedBonds
   pertAngles, perturbedAngles
   pertDihedrals, perturbedDihedrals
   printExluded
   printLennardjones
   parmInfo
   verbose
   ddrive       
   delete        
   delperturbed  
   restrain     
   openParm     
   writeParm    
   system       
   mardi2sander 
   transform    
   translateBox 
   modifyBoxInfo
   modifyMoleculeInfo
   quit, exit
In general, you don't need to specify the whole command, just typically the first few letters of the command. You can look at the list of atoms with the command "atoms", bonds with the command "bonds", etc. The "restrain" command is used to add parm style contraints.

One thing this code can do is generate 2D RMS plots. This is a matrix of all pairwise RMSd values between a series of structures. Currently the choices are rather limited, but in future, the picking should be more general. The clustering option is currently weak. I don't advise it. While running, you will be prompted with a series of questions asking what you want to RMS and how many frames. Additionally, you can dump PlotMTV format or raw postscript. The latter is advised. Here is an example of its use for RMS fitting the DNA bases...

MAIN MENU:  rms atA.traj.1775_1870ps.strip
  This routine will compute the RMS by RMS plot for
  for a trajectory, in a format suitable for display
  by the PlotMtv plotting program.  See the routine in
  cluster.c to change anything...

  NOTE: this module is under constant modification
  and is currently being equiped with a crude
  clustering option...

Opened file atA.traj.1775_1870ps.strip with mode (r)
   Do you to want pick atom numbers for the rms? [yes] no
   Do you to want to RMS only the peptide backbone? [yes] no
   Do you to want to RMS only the CA atoms? [yes] no
   Do you to want to RMS only the :8-120 backbone atoms? [yes] no
   Do you to want only the DNA bases? [yes]
The title follows...
PDB 10mer AT

Loading the trajectory...

Set      1 ................................................
Set     50 ..............................................

WARNING in readAmberTrajectory(): EOF detected or set #96 is corrupted
  95 frames were successfully read in.
  The maximum recommended size for a 2D RMS plot via
  PlotMtv is on the order of 300 x 300.
  How many snapshots in each dimension for the 2D RMS grid?  [200] 100
  Calculating the RMS values between every 1 snapshots...
....   Do you to actually dump a contour plot? [yes]
   Do you to actually dump MTV format instead of PS? [yes] no
Input the name of a file to dump the RMS plot data (ps): vacuo.ps
Opened file vacuo.ps with mode (w)
   Do you want to attempt to cluster this trajectory? [yes] no

MAIN MENU:  q
Here is what the postscript file looks like (in postscript)

(graph of 2D RMS plot)

Along each of the axes is time and each square represents the RMSd value of the bases between the 2 time points. On the diagonal, the RMSd is zero (since the time is the same) and therefore the square is white. As the block gets darker, the RMSd value gets higher. Looking at this picture the lower square is whitish as is the upper square. Within each of these regions, the conformations are similar, yet the darker blocks show that these two regions are slightly different conformations, i.e. about a third of the way through the trajectory, it appears to move into a different "substate" of conformation.

This picture just gives the idea of what can be done here. In the meantime, before I fix my code, moil-view can be used to do the same type of calculation.


thomas <cheatham@cgl.ucsf.edu>
Last modified: February, 2002