Lesson 8: Part 2 Running a Molecular Dynamics Simulation in Amber - joslynnlee/CHEM-454 GitHub Wiki

Run Amber MD sander

Now that we have all the ingredients: the parameter and topology file parm7, the coordinate file rst7, and the input files 01_Min.in, 02_Heat.in, 03_Prod.in, we are ready to run the actual minimization, heating, and production MD.

To do this, we will use the program sander, the general purpose MD engine of Amber (there is also a high performance version, termed pmemd, which is part of the commercial version of AMBER and the optimum choice of MD engine but for purposes of the tutorial sander will suffice). sander is run from the command line. On the command line, we can specify several more options and choose which files are to be used for input.

  1. Let's go navigate back into your directory where your files are located. Open the terminal app. Using ls and cd, change into the Amber directory on your Desktop.
ls
cd Desktop/Amber
pwd
  1. Now you must initialize the Amber environment with Conda.
$ conda activate AmberTools21

Run minimization

  1. Run the minimization of alanine dipeptide with sander.
options/commands Description
-O Overwrite the output files if they already exist
-i 01_Min.in Choose input file (default mdin)
-o 01_Min.out Write output file (default mdout)
-p parm7 Choose parameter and topology file parm7
-c rst7 Choose coordinate file rst7
-r 01_Min.ncrst Write output restart file with coordinates and velocities (default restrt)
-inf 01_Min.mdinfo Write MD info file with simulation status (default mdinfo)
$AMBERHOME/bin/sander -O -i 01_Min.in -o 01_Min.out -p parm7 -c rst7 -r 01_Min.ncrst -inf 01_Min.mdinfo

or if you get a -inf error use

$AMBERHOME/bin/sander -O -i 01_Min.in -o 01_Min.out -p parm7 -c rst7 -r 01_Min.ncrst -inf mdinfo
  1. View output files of minimization
ls

After sander completes, there should be an output file 01_Min.out, a restart file 01_Min.ncrst, and a MD info file 01_Min.mdinfo. You will use the restart file 01_Min.ncrst for the heating of the system.

In a text editor, open the 01_Min.out file. you will find the details of your minimization.


Compare the energy between the first step NSTEP 1 to the last step NSTEP 2000. What is the difference?

You should be able to see the system energy ENERGY decrease throughout the minimization. Downstream, you can compare the root-mean-squared deviations (RMSD) to see the difference between the structures.

  1. Let's create a .pdb file from the parm7 and rst7 coordinate files.
$AMBERHOME/bin/ambpdb -p parm7 -c 01_Min.ncrst > 01_Min_.pdb

You can open this in CHIMERA or PyMol.

Run heating MD

Now, using the restart file from the initial minimization, we will heat the system.

  1. Run the heating of alanine dipeptide with sander.
$AMBERHOME/bin/sander -O -i 02_Heat.in -o 02_Heat.out -p parm7 -c 01_Min.ncrst -r 02_Heat.ncrst -x 02_Heat.nc -inf 02_Heat.mdinfo

Here is a summary of the command line options for sander:

options/commands Description
-c 01_Min.ncrst Now for the input coordinates we choose the restart file from minimization
-x 02_Heat.nc Output trajectory file for MD simulation (default nc)

sander should complete the heating in a moderate amount of time (~ 2.5 mins) depending on your computer specifications.

  1. View output files of heating output

Some files are compressed and need to be unzipped. The heating output files. 02_Heat.out 02_Heat.ncrst 02_Heat.nc

  1. Open the output file 02_Heat.out to view the system output.

In the 02_Heat.out file you will find the output from the heating MD. You should be able to see system information including timestep energies, and temperature. For example on the 1000 time step:

NSTEP =     1000   TIME(PS) =       2.000  TEMP(K) =    29.48  PRESS =     0.0
 Etot   =     -6944.9552  EKtot   =       112.3015  EPtot      =     -7057.2567
 BOND   =         1.0442  ANGLE   =         1.7653  DIHED      =         9.4906
 1-4 NB =         2.6284  1-4 EEL =        46.3073  VDWAALS    =      1448.7074
 EELEC  =     -8567.1999  EHBOND  =         0.0000  RESTRAINT  =         0.0000
 Ewald error estimate:   0.4641E-03
 ------------------------------------------------------------------------------

 NMR restraints: Bond =    0.000   Angle =     0.000   Torsion =     0.000
===============================================================================

Some of the important values include:

Title Description
NSTEP The time step that the MD simulation is at
TIME The total time of the simulation (including restarts)
TEMP System temperature
PRESS System pressure
Etot Total energy of the system
EKtot Total kinetic energy of the system
EPtot Total potential energy of the system

Note that the pressure is 0.0 because the barostat (pressure control) is not being used in the heating.

Run production MD

Now that minimization and heating are complete. We move on to the actual production MD.

  1. Run the production MD of alanine dipeptide with sander.
$AMBERHOME/bin/sander -O -i 03_Prod.in -o 03_Prod.out -p parm7 -c 02_Heat.ncrst -r 03_Prod.ncrst -x 03_Prod.nc -inf 03_Prod.info &

Note: With the & at the end of the command, sander now runs in the background

  1. Now sander is running in the background. It will take some time to run the production MD. You can monitor the status of the production MD. So we will monitor the sander output file to check the status of the run. The Linux program tail will print the end of the output file.
tail -f 03_Prod.out

This prints the output file as sander is running. It's useful to keep track of the job. You can also monitor the mdinfo file (using cat 03_Prod.info) which provides detailed performance data as well as estimated time to completion.

To exit tail hold CTRL-C.

Using top command will also show what processors are being used along with memory. To quit top type q

top

Complete the MD simulation

It should take a while to complete the simulation [~ 10 minutes].

  1. View output files of Production output

The production MD output is: 03_Prod.out 03_Prod.ncrst 03_Prod.nc

Once it's complete open the output file to check that the simulation completed normally.

  1. Open the output file 03_Prod.out with gedit to view the output of the MD simulation.

Visualize the results

You've now run an MD simulation. In order to visualize the results, we will now use a program called VMD (Visual Molecular Dynamics). This is a molecular graphics program that can render molecular structures in 3D. VMD not only loads Protein Database (PDB) structure files, but also MD trajectories from many programs.

  1. To start VMD, you must be in your tutorial files in the Tutorial directory, and then run vmd.
vmd

VMD is a very useful tool to visualize protein, nucleic acid, and other biomolecular atomic structures. One of the most common formats is the PDB biomolecular structure format. To load a PDB, got to File -> New Molecule.... Then load files for a New Molecule and choose the PDB file. VMD should determine the file type Automatically.

However, we would like to visualize the alanine dipeptide trajectory. Now we will load our MD trajectory to look at the dynamics of alanine dipeptide.

  1. Create a new molecule with File -> New Molecule...

  2. Load files for New Molecule. Then choose the Amber parameter and topology file parm7. Set the file type to AMBER7 Parm. Click Load.

  3. Load files for 0: parm7. Then choose the Amber trajectory file 03_Prod.nc. Set the file type to NetCDF (AMBER, MMTK). Click Load.


If you are using VMD on Mac, there will be a file format error (Could not read file) when loading the 03_Prod.nc output. You will need to convert the .nc file to a .dcd format using the CPPTRAJ program in AmberLink:

cpptraj -p parm7 -y 03_Prod.nc -x 03_Prod.dcd

You can now load the 0: parm7' trajectory file by choosing the 03_Prod.dcdfile. VMD will read this as aCHARMM,NAMD,EXPLOR DCD Trajectory` file type. Click Load.

  1. VMD now loads your trajectory to be visualized. The main VMD window can be used to control playback. Move the speed bar to adjust slower and hit the play button.

You should be able to see the alanine dipeptide molecule as well as the many water molecules in the display. You can rotate, zoom and pan the molecules around the display with the mouse.

Many different visualization options can be changed in the Graphics -> Representations window.

Your visualization can be restricted to the alanine dipeptide as well.

Let's save the second to the last frame. This will be the final confirmation of the structure. In the VMD Main window, find the value of Frames and subtract 1. You will now find the box on the lower left corner. Here you can enter that value.This will collect the 3D coordinates of that run.

Now click on the Molecule Name in the VMD Window. Right-click to 'Save Coordinates.

Next a window will pop-up and you can change Selected atoms to all. The File Type can be .PDB. Be sure to change the Frames First and Last to NUMBER ENTERED ABOVE. You will be prompted to name the file.

  1. Change the Selected Atoms to all not water. You can change the drawing method for the molecule to a more interesting model.

  2. Change the Drawing Method to Licorice. Alanine dipeptide will look something like this:

  3. Click on the "Molecule" name

Analyze the MD results

Amber includes a suite of tools to examine and analyze MD trajectories. In this tutorial, we will do a simple analysis with several Amber programs and plot the results. The analysis will primarily be done from the command line in the terminal.

Within your folder, you can create a new directory called Analysis.

  1. Now we will use an analysis script process_mdout.perl to analyze the MD output files. This script will extract the energies, temperature, pressure, density, and volume from the MD output files and save them to individual data files.
$AMBERHOME/bin/process_mdout.perl ../02_Heat.out ../03_Prod.out

Use cpptraj to analyze the RMSD

The root mean squared deviation (RMSD) value is a measurement of how similar a structure's internal atomic coordinates are relative to some reference molecule coordinates. For this example, we will measure how the internal atomic coordinates change relative to the minimized structure. Specifically, we will analyze the alanine atoms (residue 2).

To do this analysis, we will use cpptraj, a fairly comprehensive analysis program for processing MD trajectories. This program runs simple user-written scripts that choose what trajectories to load, what analysis to run, and what processed trajectories or structure to save.

First, we will need to write a simple cpptraj script to do this analysis.

  1. Use gedit to create a cpptraj script called rmsd.cpptraj.
trajin 02_Heat.nc
trajin 03_Prod.nc
reference 01_Min.ncrst
autoimage
rms reference mass out 02_03.rms time 2.0 :2

This is a brief summary of the cpptraj script:

Title Description
trajin 02_Heat.nc Load the trajectory 02_Heat.nc
reference 01_Min.ncrst Define the structure 01_Min.ncrst as the reference structure
center :1-3 mass origin Center the residues 1-3 by mass to the system origin
image origin center Image the atoms to the origin using the center of mass of the molecule
rms reference mass out 02_03.rms time 2.0 :2 Calculate the mass weighted RMSD using the reference and output to 02_03.rms

Run cpptraj

  1. To actually run cpptraj, we must run it again from the terminal in the directory where the parm7, nc, and reference ncrst file is located.
$AMBERHOME/bin/cpptraj -p parm7 -i rmsd.cpptraj &> cpptraj.log
  1. Now our RMSD data is stored in the file 02_03.rms. Dr. Lee will plot the RMSD with xmgrace and have you investigate the output for deeper analysis.

  2. It is now quite simple to plot the data saved in the output files. A simple plotting program called xmgrace to automatically generate plots for the following MD simulation properties throughout the simulation. We use this for our convenience and the plots are found below for:

MD simulation temperature MD simulation density MD simulation total, potential, and kinetic energies.

Analysis data files

In the Alanine dipeptide MD temperature, you will see the linear increase in temperature during heating (0-20 ns). This is followed by the relatively stable temperature fluctuations about 300 K during the production simulation.

In the Alanine dipeptide MD density, during the 20-80ps, the density equilibrates to approximately 1 g/cm^3. This corresponds to a change in periodic box dimensions as the density of the system converges.

In the Alanine dipeptide MD total, potential, and kinetic energy, this plot shows the total system energy which can be decomposed to the total potential energy and the total kinetic energy.