The purpose of this tutorial is to provide an example of how one goes about analyzing the results from a molecular dynamics run using some of the tools supplied within AMBER and some available through other sources. Particular emphasis is placed on analyzing molecular dynamics trajectories that need to be "imaged" after running a particle mesh Ewald calculation; as you work through the demo, you will see why this is necessary...
As part of the analysis, we will feature small aspects of a series of programs. The highlighted "links" below will let you jump to particular sections in the tutorial...
To support the analysis, trajectories of the decamer poly(A)-poly(T) DNA structure we built and then equilibrated in previous sections of this tutorial are supplied. Rather than starting in a canonical B-DNA structure, this was originally started in a canonical A-DNA geometry. During the dynamics, it spontaneously (after ~200-500 ps) converted to an B-form geometry [See Cheatham & Kollman, 1996. J. Mol. Biol. 259, 434-444. for a discussion of this with a different sequence]. What you will be looking at here is the result of ~16 hours of running on the Cray T3D at PSC; the trajectory represents ~95 ps of a particle mesh Ewald trajectory from 1775 ps to ~1870 ps. Here are the files you will be looking at.
FILES_IN PARM p1 atA.topo; STREAM s1 atA.traj.1775_1870ps; FILES_OUT TABLE tab1 rms_to_start; DECLARE GROUP gALL (RES 1 - 20); RMS r1 FIT gALL s1; OUTPUT TABLE tab1 r1; ENDIn the input file, we specifed our parm file and trajectory as input and the file rms_to_start as output. After carnal is run, this file will contain the "TABLE" of "RMS" (r1) values. In the "DECLARE" section we specified a "group" representing the atoms we would like to RMS fit. We are going to use all the DNA atoms, or residues 1 through 20. The key to this run of carnal is the RMS command which says to perform the RMS fit. [Note that you cannot have "-" in file names; this breaks carnal.] To run this example:
% carnal < carnal_rms_start.inThis will generate the output as follows:
- CARNAL -
AMBER 4.1
COORDINATE ANALYSIS
input stdin
> FILES_IN
> PARM p1 atA.topo;
Reading parm file (atA.topo)
parm: opening atA.topo
atA.topo title:
PDB 10mer AT
> STREAM s1 atA.traj.1775_1870ps;
Using default parm (atA.topo) for STREAM s1
(box will be read from stream)
stream: opening atA.traj.1775_1870ps
> FILES_OUT
> TABLE tab1 rms_to_start;
> DECLARE
> GROUP gALL (RES 1 - 20);
** Group gALL: 638 atoms
> RMS r1 FIT gALL s1;
RMS: positioning to initial set of stream
> OUTPUT
> TABLE tab1 r1;
> END
(EOF reached on atA.traj.1775_1870ps)
STREAM s1: no more files/crds (iteration 95)
SUMMARY
--RMS r1: avg 16.947 dev 7.014 max 20.204 min 0.568
You will note that the RMSd reported above is tremendously high and ranges
from 0.568 to 20.204. This seems like something weird is going on.
Before getting into the details, let's first look at a graph of the data
which will shed more insight into the time course of the deviation
(i.e. is it moving uniformly away from the starting structure?) and
then we will look at a movie of the trajectory.
It should be noted that the output file generated by carnal (rms_to_start) has a strange format and cannot be directly imported into graphing programs, such as xmgr. The top of the file looks like this:
L0 0.000000e+00 L0 5.675537e-01 L0 5.903252e-01 L0 5.699365e-01 L0 6.835873e-01After running carnal, the first step is to process the output file into a form suitable for other programs. To support this goal, I developed a fairly general perl script to convert this to a format where the first column represents the time and the remaining columns represent the carnal data. This script, process_carnal.perl reads the carnal input from standard input and creates a file called "1.data". If you give an argument to this script, it will use this as the filename.
Advanced feature: If you want to specify the starting time, set the environment variable START_TIME. If you want to specify the interval between frames, set the environment variable INTERVAL.
% setenv START_TIME 0.0 % setenv INTERVAL 1.0By default the START_TIME is zero and the INTERVAL is 1.0.
% process_carnal.perl rms_to_start.xmgr < rms_to_startAfter running this, we can look at it with xmgr. xmgr a very general and powerful, mouse driven and fairly self-explanatory 2D graphing program. During my CTC session, I will give a brief tutorial on this program (if there is interest). It is available from ...
% xmgr rms_to_start.xmgrIt will look something like this:
Looking at this, you see that the RMS jumps from low values to near 20 angstroms and back in nearly 1 step. This is a sign of something seriously wrong with the trajectory since molecules cannot reasonably move 20 angstroms in 1 picosecond...
This program is nice since not only is it freely available, but the viewing options are very nice. Models can be viewed as solid spheres, lines, mixed spheres and line, ball and cylinder, etc. The program is also set up to recognize AMBER. The drawback is that the user doesn't have much control over the movie.
To view a movie of this trajectory, start up moil-view
% moil-viewWith the right button, chose the "file" menu and "Read 1st coordinates". Select "AMBER" format and then specify the file "atA.coords" at the dialog. When asked, say that "no", you do not want to use bonds as connectivity, but "yes" you would like to load a connectivity file. Again specify the "AMBER" format and load up "atA.topo". The structure should appear.
To look at a movie of the trajectory, use the right mouse button and chose the "Dynamics" menu and "View trajectory" submenu. The trajectory we are looking at was run with water, so chose "AMBER PBC" as the type and load up "atA.traj.1775_1870ps". Input the start and end frame for the next dialog that pops up...
To stop the dynamics, hit the right button.
Looking at the movie, it is very hard to see anything since all that water is in there. So, we could temporarily hide the water by going into the main menu (right button), drag "Color options" and choose "No waters". Additionally, those ions are kind of hard to visualize, so lets make them solid spheres. Chose main menu "Objects are" and submenu "Mix lines/spheres" which will pop up a dialog box which will prompt you for a selection of atoms which you want as spheres. We will choose all the atoms named NA+. The selection format is kind of tricky; turn help on for more information... Here is what it will look like and what the pick string looks like:
This will display our structure as follows (for one of the frames)...
Looking at the movie, the imaging problem is readily apparent.
The DNA molecule is split across the box.
Using MD_display to look at a movie
Terry Lybrand's MD_display is actually two programs. preproc
preprocesses the trajectory and display does the actual
movie. Although display doesn't have all the fancy
viewing features of moil-view, it does give the user control
over the view of frames and can be used in stereo mode.
To preprocess the trajectory:
% preproc -c atA.traj.1775_1870ps -p atA.topo -n movie -f 95 -XWhen running with trajectories with water, you need to specify the "-X" option. The "-n" gives the root name of the files to be created.
Here is an example loading 5 frames...
MD preprocessor version 1.1b Please wait while the data is loaded. 0.90000000E+02 0.61317974E+02 0.41857475E+02 0.41857475E+02 Box dimensions: 61.32 41.86 41.86 Constant pressure box detected... Allocating 277380 bytes of memory for 9245 atoms, box, 5 frames 0 1 2 3 4 5 frames read. Writing bond file. Writing coord file. Done.Now we are ready to view the trajectory. Except for one thing. One difficulty with this program is that it takes over the entire screen. This is a little annoying, since to terminate the program, you need to confirm exit by typing into the window where you started the program from. However, you can't get there since the window is obscured by the display program... Therefore, you need to use the special SGI key command to pop the current window into the background, "Alt-F3". "Alt-F1" pops a window to the front. If you aren't running the 4Dwm, or changed the button binding's, you are on your own!
% display -n movieAs you start this up, this will prompt you for a variety of information. If you are lazy, just keep hitting return until the window pops up. The startup text looks like this:
MD DISPLAY
by
T. Callahan & T. Lybrand
Please wait while the data is loaded.
--loading connectivity information--
55662 bytes read.
--loading atom attributes--
170058 bytes read.
9275 bonds, 0 hydrogen bonds.
----Initial coloring of residues & atoms----
0 - invisible (remove)
1 - white
2 - red
3 - blue
4 - aqua
5 - green
6 - magenta
7 - orange
8 - yellow
9 - gray
98 - each molecule a different color
99 - automatic color by atom type
Color (0-99, negative or to continue) :
Since we've got water, let's at least color the water
blue. Choose "3", hit return, and enter ":WAT" (which means
select residues named water). Then hit return a few times...
To exit, chose the right button and select exit; remember that "Alt-F3" pops the window back so you can type your confirmation.
To control the speed and direction of the movie, use the
command bar on the bottom left. The panel on the right controls
a series of other features which will be displayed in the CTC
session.
There are some issues however with imaging since this implicity will
center the solute to the middle of the box. This will remove all
information about the center of mass motion of the solute, hence will
make it impossible to measure diffusion.
With the released code, there are a few options available, specifically using ptraj or carnal. The ptraj imaging is more general and support any triclinic unit cell so this will be the one we discuss.
To do this, we create a file called imageit that has the necessary commands.
trajin atA.traj.1775_1870ps
trajout atA.traj.1775_1870ps.imaged
center :1-10 mass origin
image origin center
center :1-20 mass origin
image origin center
Movies with vmd
Looking at sander movies with vmd is relatively straightforward. From
the "mol" menu, load up files selecting "parm and crd" as the type.
Load up both the prmtop and trajectory, and off you go with the movie.
The "animate" menu can control the progress.
Imaging the trajectory
Now that we've seen that the molecule is spread all over the box,
we need to image the trajectory to bring the molecule back together.
To do this, we center one of the strands (the first molecule) to the
origin, "image" or move every molecule outside the box boundaries back
into the box, then move both strands to the center and image again.
% ptraj atA.topo < imageit
The output will look something like this:
Opened file atA.topo with mode (r)
Opened parm file atA.topo
Scanned in title...
Read in the control variables...
Read in atom names...
Read in charges...
Read in masses...
Read in IAC (atoms involved in L-J)...
Read in NUMEX (index to excl atom list)
Read in NNO (index for nonbond of @type)
Read in residue labels...
A5 A A A A A A A A A3
T5 T T T T T T T T T3
CIO CIO CIO CIO CIO CIO CIO CIO CIO CIO
CIO CIO CIO CIO CIO CIO CIO CIO WAT WAT
WAT WAT WAT WAT WAT WAT WAT WAT WAT WAT
...
WAT
Read in the pointer list of residues...
Read in bond parameters RK and REQ...
Read in angle parameters TK and TEQ...
Read in dihedral parameters PK, PN and PHASE...
Read in L-J parameters CN1 and CN2...
Read in info for bonds w/ hydrogen...
Read in info for bonds w/out hydrogen...
Read in info for angles w/ hydrogen...
Read in info for angles w/out hydrogen...
Read in info for dihedrals w/ hydrogen...
Read in info for dihedrals w/out hydrogen...
Read in excluded atom list...
Read in h-bond parameters, AG, BG, and HBCUT...
Read in ISYMBL, ITREE, JOIN, ROTAT...
Scanning Box
Read in box information...
Successfully completed readParm.
MAIN MENU. Please enter commands. Use "?" or "help"
for more info. "exit" or "quit" to leave program...
MAIN MENU:
At the main menu prompt, type "transform atA.traj.1775_1870ps"
and the program will prompt you to answer a few questions.
Answer no to every question except "Do you want the box imaged?".
After answering the questions, you will need to supply a name of a
new trajectory file to write and also how many molecules are important
for imaging. The latter question can be ignored (just hit return)
or you can specify the number of important molecules, i.e. 2 with
DNA. A sample session follows:
MAIN MENU: transform atA.traj.1775_1870ps
Opened file atA.traj.1775_1870ps with mode (r)
Transformation will place either the center of mass
or the center of geometry at the origin.
Do you want to translate to center of mass? [yes] no
Do you want the box imaged? [yes] yes
Do you want principal axis alignment of the trajectory? [yes] no
Do you want to perform RMS fitting? [no] no
Attempting to transform AMBER trajectory file atA.traj.1775_1870ps
Translating center of geometry to origin.
Output transformed trajectory to file: atA.traj.1775_1870ps.imaged
Opened file atA.traj.1775_1870ps.imaged with mode (w)
Output is directed to file atA.traj.1775_1870ps.imaged
How many solute molecules are important for imaging? 2
Set 1 ........Molecule 1 ( 320 to 638) translated by 0 1 0 boxes
.Molecule 1 ( 320 to 638) translated by 0 1 0 boxes
.Molecule 1 ( 320 to 638) translated by 0 1 0 boxes
Note that the above command will write a new trajectory which is
the same size as the old one, therefore this method is very space
intensive. Since most of our processing doesn't need water, we will
also demonstrate how to strip the water with rdparm.
To do this, use the command "processtrajectory" or "processt".
MAIN MENU: processt atA.traj.1775_1870ps.imaged
This routine will read in a trajectory file,
perform simple manipulations, and write out new or
append to existing topology files.
Opened file atA.traj.1775_1870ps.imaged with mode (r)
What is the name of the trajectory file to output? atA.traj.1775_1870ps.strip
After processing, the trajectory will be dumped to file atA.traj.1775_1870ps.strip
Set 1 ................................................
Set 50 ..............................................
WARNING in readAmberTrajectory(): EOF detected or set #96 is corrupted
Successfully read in a total of 95 coordinate sets
Do you want to output every coordinate set? [no] yes
Do you want to strip the waters? [no] yes
Start striping after atom number: [656]
Striping all atoms after 656
Starting with structure 1, ending with 95,
taking every 1 structures
Rereading and writing the coordinates to the trajectory...
Set 1 ................................................
Set 50 .............................................
Using carnal to image
In prerelease versions of carnal you can use the "IMAGE"
command to process a trajectory. [I am still trying to get this
to work.]
See carnal_rms_start_image.in.
Measuring atomic positional fluctuations with mdanal
Note that carnal can also calculate atomic positional fluctuations,
however (and even though this code may be removed in a latter
release) I am going to demonstrate the use of mdanal.
mdanal is released with AMBER and does a number of useful things, including calculating the atomic positional fluctuations which are directly related to the atomic B-factors.
We will demonstrate the use of the program with the trajectory stripped of the waters. The input file is mdanal.in. To reference the trajectory, it needs to be named "fort.12" for some archaic reason. Therefore, link the stripped trajectory to "fort.12"...
% ln -s atA.traj.1775_1870ps.strip fort.12Then run the command...
% mdanal -i mdanal.in \
-o mdanal.out \
-p invacuo.topo \
-a mdanal.pdb \
-g mdanal.pltfil
To look at the atomic positional fluctuations you can view mdanal.pltfil
with xmgr. Unless this file is editted (to strip the top of the file),
xmgr will complain, but it will still work...
% xmgr mdanal.pltfilThe large fluctuations on the right are the counterions. Under the "Graph" menu, choose "World Scaling" and you can change the scale of the y-axis to 3, hit "Accept" and hit "Close" and you will see a graph like this:
The larger peaks at the left, middle and right (except for the counterions on the far right) represent the beginning and ends of the helices. The periodic peaks represent the backbone phosphates. Larger peaks mean more motion (the y-axis is atomic postional fluctuations in angstroms).
rdparm: parameter topology and trajectory analysis
rdparm, is a program supplied with the AMBER 4.1 suite that
allows users to check parm files (i.e. look at the list of bonds,
angles, etc) and perform some rudimentary functions, such as
fixing waters added with the ADD water option of edit, delete
certain bonds, angles or dihedrals, etc...
To start the program, users need to specify a AMBER parameter/topology file. We will look at the poly(A)-poly(t) in vacuo prmtop, invacuo.topo.
% rdparm invacuo.topo Opened file invacuo.topo with mode (r) Opened parm file invacuo.topo Scanned in title... Read in the control variables... Read in atom names... Read in charges... Read in masses... Read in IAC (atoms involved in L-J)... Read in NUMEX (index to excl atom list) Read in NNO (index for nonbond of @type) Read in residue labels... A5 A A A A A A A A A3 T5 T T T T T T T T T3 CIO CIO CIO CIO CIO CIO CIO CIO CIO CIO CIO CIO CIO CIO CIO CIO CIO CIO Read in the pointer list of residues... Read in bond parameters RK and REQ... Read in angle parameters TK and TEQ... Read in dihedral parameters PK, PN and PHASE... Read in L-J parameters CN1 and CN2... Read in info for bonds w/ hydrogen... Read in info for bonds w/out hydrogen... Read in info for angles w/ hydrogen... Read in info for angles w/out hydrogen... Read in info for dihedrals w/ hydrogen... Read in info for dihedrals w/out hydrogen... Read in excluded atom list... Read in h-bond parameters, AG, BG, and HBCUT... Read in ISYMBL, ITREE, JOIN, ROTAT... Successfully completed readParm. MAIN MENU. Please enter commands. Use "?" or "help" for more info. "exit" or "quit" to leave program... MAIN MENU:To see the help menu, type help or "?"
MAIN MENU: ? The following commands are currently available: help, ? atoms, printAtoms bonds, printBonds angles, printAngles dihedrals, printDihedrals pertBbonds, perturbedBonds pertAngles, perturbedAngles pertDihedrals, perturbedDihedrals printExluded printLennardjones parmInfo verbose ddriveIn general, you don't need to specify the whole command, just typically the first few letters of the command. You can look at the list of atoms with the command "atoms", bonds with the command "bonds", etc. The "restrain" command is used to add parm style contraints.delete delperturbed restrain openParm writeParm system mardi2sander transform translateBox modifyBoxInfo modifyMoleculeInfo quit, exit
One thing this code can do is generate 2D RMS plots. This is a matrix of all pairwise RMSd values between a series of structures. Currently the choices are rather limited, but in future, the picking should be more general. The clustering option is currently weak. I don't advise it. While running, you will be prompted with a series of questions asking what you want to RMS and how many frames. Additionally, you can dump PlotMTV format or raw postscript. The latter is advised. Here is an example of its use for RMS fitting the DNA bases...
MAIN MENU: rms atA.traj.1775_1870ps.strip This routine will compute the RMS by RMS plot for for a trajectory, in a format suitable for display by the PlotMtv plotting program. See the routine in cluster.c to change anything... NOTE: this module is under constant modification and is currently being equiped with a crude clustering option... Opened file atA.traj.1775_1870ps.strip with mode (r) Do you to want pick atom numbers for the rms? [yes] no Do you to want to RMS only the peptide backbone? [yes] no Do you to want to RMS only the CA atoms? [yes] no Do you to want to RMS only the :8-120 backbone atoms? [yes] no Do you to want only the DNA bases? [yes] The title follows... PDB 10mer AT Loading the trajectory... Set 1 ................................................ Set 50 .............................................. WARNING in readAmberTrajectory(): EOF detected or set #96 is corrupted 95 frames were successfully read in. The maximum recommended size for a 2D RMS plot via PlotMtv is on the order of 300 x 300. How many snapshots in each dimension for the 2D RMS grid? [200] 100 Calculating the RMS values between every 1 snapshots... .... Do you to actually dump a contour plot? [yes] Do you to actually dump MTV format instead of PS? [yes] no Input the name of a file to dump the RMS plot data (ps): vacuo.ps Opened file vacuo.ps with mode (w) Do you want to attempt to cluster this trajectory? [yes] no MAIN MENU: qHere is what the postscript file looks like (in postscript)
Along each of the axes is time and each square represents the RMSd value of the bases between the 2 time points. On the diagonal, the RMSd is zero (since the time is the same) and therefore the square is white. As the block gets darker, the RMSd value gets higher. Looking at this picture the lower square is whitish as is the upper square. Within each of these regions, the conformations are similar, yet the darker blocks show that these two regions are slightly different conformations, i.e. about a third of the way through the trajectory, it appears to move into a different "substate" of conformation.
This picture just gives the idea of what can be done here. In the meantime, before I fix my code, moil-view can be used to do the same type of calculation.