Lesson 8: Part 2 Running a Molecular Dynamics Simulation in Amber - joslynnlee/CHEM-454 GitHub Wiki
Run Amber MD sander
Now that we have all the ingredients: the parameter and topology file parm7
, the coordinate file rst7
, and the input files 01_Min.in
, 02_Heat.in
, 03_Prod.in
, we are ready to run the actual minimization, heating, and production MD.
To do this, we will use the program sander
, the general purpose MD engine of Amber (there is also a high performance version, termed pmemd
, which is part of the commercial version of AMBER and the optimum choice of MD engine but for purposes of the tutorial sander will suffice). sander
is run from the command line. On the command line, we can specify several more options and choose which files are to be used for input.
- Let's go navigate back into your directory where your files are located. Open the
terminal
app. Usingls
andcd
, change into the Amber directory on your Desktop.
ls
cd Desktop/Amber
pwd
- Now you must initialize the Amber environment with Conda.
$ conda activate AmberTools21
Run minimization
- Run the minimization of alanine dipeptide with sander.
options/commands | Description |
---|---|
-O | Overwrite the output files if they already exist |
-i 01_Min.in | Choose input file (default mdin) |
-o 01_Min.out | Write output file (default mdout) |
-p parm7 | Choose parameter and topology file parm7 |
-c rst7 | Choose coordinate file rst7 |
-r 01_Min.ncrst | Write output restart file with coordinates and velocities (default restrt) |
-inf 01_Min.mdinfo | Write MD info file with simulation status (default mdinfo) |
$AMBERHOME/bin/sander -O -i 01_Min.in -o 01_Min.out -p parm7 -c rst7 -r 01_Min.ncrst -inf 01_Min.mdinfo
or if you get a -inf
error use
$AMBERHOME/bin/sander -O -i 01_Min.in -o 01_Min.out -p parm7 -c rst7 -r 01_Min.ncrst -inf mdinfo
- View output files of minimization
ls
After sander completes, there should be an output file 01_Min.out
, a restart file 01_Min.ncrst
, and a MD info file 01_Min.mdinfo
. You will use the restart file 01_Min.ncrst
for the heating of the system.
In a text editor, open the 01_Min.out
file. you will find the details of your minimization.
Compare the energy between the first step NSTEP 1
to the last step NSTEP 2000
. What is the difference?
You should be able to see the system energy ENERGY decrease throughout the minimization. Downstream, you can compare the root-mean-squared deviations (RMSD) to see the difference between the structures.
- Let's create a
.pdb
file from theparm7
andrst7
coordinate files.
$AMBERHOME/bin/ambpdb -p parm7 -c 01_Min.ncrst > 01_Min_.pdb
You can open this in CHIMERA or PyMol.
Run heating MD
Now, using the restart file from the initial minimization, we will heat the system.
- Run the heating of alanine dipeptide with sander.
$AMBERHOME/bin/sander -O -i 02_Heat.in -o 02_Heat.out -p parm7 -c 01_Min.ncrst -r 02_Heat.ncrst -x 02_Heat.nc -inf 02_Heat.mdinfo
Here is a summary of the command line options for sander:
options/commands | Description |
---|---|
-c 01_Min.ncrst | Now for the input coordinates we choose the restart file from minimization |
-x 02_Heat.nc | Output trajectory file for MD simulation (default nc) |
sander should complete the heating in a moderate amount of time (~ 2.5 mins) depending on your computer specifications.
- View output files of heating output
Some files are compressed and need to be unzipped. The heating output files.
02_Heat.out
02_Heat.ncrst
02_Heat.nc
- Open the output file
02_Heat.out
to view the system output.
In the 02_Heat.out
file you will find the output from the heating MD. You should be able to see system information including timestep energies, and temperature. For example on the 1000 time step:
NSTEP = 1000 TIME(PS) = 2.000 TEMP(K) = 29.48 PRESS = 0.0
Etot = -6944.9552 EKtot = 112.3015 EPtot = -7057.2567
BOND = 1.0442 ANGLE = 1.7653 DIHED = 9.4906
1-4 NB = 2.6284 1-4 EEL = 46.3073 VDWAALS = 1448.7074
EELEC = -8567.1999 EHBOND = 0.0000 RESTRAINT = 0.0000
Ewald error estimate: 0.4641E-03
------------------------------------------------------------------------------
NMR restraints: Bond = 0.000 Angle = 0.000 Torsion = 0.000
===============================================================================
Some of the important values include:
Title | Description |
---|---|
NSTEP | The time step that the MD simulation is at |
TIME | The total time of the simulation (including restarts) |
TEMP | System temperature |
PRESS | System pressure |
Etot | Total energy of the system |
EKtot | Total kinetic energy of the system |
EPtot | Total potential energy of the system |
Note that the pressure is 0.0 because the barostat (pressure control) is not being used in the heating.
Run production MD
Now that minimization and heating are complete. We move on to the actual production MD.
- Run the production MD of alanine dipeptide with sander.
$AMBERHOME/bin/sander -O -i 03_Prod.in -o 03_Prod.out -p parm7 -c 02_Heat.ncrst -r 03_Prod.ncrst -x 03_Prod.nc -inf 03_Prod.info &
Note: With the &
at the end of the command, sander
now runs in the background
- Now
sander
is running in the background. It will take some time to run the production MD. You can monitor the status of the production MD. So we will monitor the sander output file to check the status of the run. The Linux program tail will print the end of the output file.
tail -f 03_Prod.out
This prints the output file as sander is running. It's useful to keep track of the job. You can also monitor the mdinfo file
(using cat 03_Prod.info
) which provides detailed performance data as well as estimated time to completion.
To exit tail
hold CTRL-C
.
Using top
command will also show what processors are being used along with memory. To quit top
type q
top
Complete the MD simulation
It should take a while to complete the simulation [~ 10 minutes].
- View output files of Production output
The production MD output is:
03_Prod.out
03_Prod.ncrst
03_Prod.nc
Once it's complete open the output file to check that the simulation completed normally.
- Open the output file
03_Prod.out
withgedit
to view the output of the MD simulation.
Visualize the results
You've now run an MD simulation. In order to visualize the results, we will now use a program called VMD
(Visual Molecular Dynamics). This is a molecular graphics program that can render molecular structures in 3D. VMD not only loads Protein Database (PDB) structure files, but also MD trajectories from many programs.
- To start
VMD
, you must be in your tutorial files in the Tutorial directory, and then runvmd
.
vmd
VMD is a very useful tool to visualize protein, nucleic acid, and other biomolecular atomic structures. One of the most common formats is the PDB biomolecular structure format. To load a PDB, got to File -> New Molecule.... Then load files for a New Molecule and choose the PDB file. VMD should determine the file type Automatically.
However, we would like to visualize the alanine dipeptide trajectory. Now we will load our MD trajectory to look at the dynamics of alanine dipeptide.
-
Create a new molecule with File -> New Molecule...
-
Load files for New Molecule. Then choose the Amber parameter and topology file
parm7
. Set the file type toAMBER7 Parm
. Click Load. -
Load files for
0: parm7
. Then choose the Amber trajectory file03_Prod.nc
. Set the file type toNetCDF (AMBER, MMTK)
. Click Load.
If you are using VMD on Mac, there will be a file format error (Could not read file) when loading the 03_Prod.nc
output. You will need to convert the .nc
file to a .dcd
format using the CPPTRAJ program in AmberLink:
cpptraj -p parm7 -y 03_Prod.nc -x 03_Prod.dcd
You can now load the 0: parm7' trajectory file by choosing the
03_Prod.dcdfile. VMD will read this as a
CHARMM,NAMD,EXPLOR DCD Trajectory` file type. Click Load.
- VMD now loads your trajectory to be visualized. The main VMD window can be used to control playback. Move the speed bar to adjust slower and hit the play button.
You should be able to see the alanine dipeptide molecule as well as the many water molecules in the display. You can rotate, zoom and pan the molecules around the display with the mouse.
Many different visualization options can be changed in the Graphics -> Representations
window.
Your visualization can be restricted to the alanine dipeptide as well.
Let's save the second to the last frame. This will be the final confirmation of the structure. In the VMD Main window, find the value of Frames and subtract 1. You will now find the box on the lower left corner. Here you can enter that value.This will collect the 3D coordinates of that run.
Now click on the Molecule Name in the VMD Window. Right-click to 'Save Coordinates.
Next a window will pop-up and you can change Selected atoms
to all. The File Type
can be .PDB
. Be sure to change the Frames First and Last to NUMBER ENTERED ABOVE. You will be prompted to name the file.
-
Change the Selected Atoms to all not water. You can change the drawing method for the molecule to a more interesting model.
-
Change the Drawing Method to Licorice. Alanine dipeptide will look something like this:
-
Click on the "Molecule" name
Analyze the MD results
Amber includes a suite of tools to examine and analyze MD trajectories. In this tutorial, we will do a simple analysis with several Amber programs and plot the results. The analysis will primarily be done from the command line in the terminal.
Within your folder, you can create a new directory called Analysis
.
- Now we will use an analysis script
process_mdout.perl
to analyze the MD output files. This script will extract the energies, temperature, pressure, density, and volume from the MD output files and save them to individual data files.
$AMBERHOME/bin/process_mdout.perl ../02_Heat.out ../03_Prod.out
Use cpptraj to analyze the RMSD
The root mean squared deviation (RMSD) value is a measurement of how similar a structure's internal atomic coordinates are relative to some reference molecule coordinates. For this example, we will measure how the internal atomic coordinates change relative to the minimized structure. Specifically, we will analyze the alanine atoms (residue 2).
To do this analysis, we will use cpptraj, a fairly comprehensive analysis program for processing MD trajectories. This program runs simple user-written scripts that choose what trajectories to load, what analysis to run, and what processed trajectories or structure to save.
First, we will need to write a simple cpptraj script to do this analysis.
- Use gedit to create a cpptraj script called rmsd.cpptraj.
trajin 02_Heat.nc
trajin 03_Prod.nc
reference 01_Min.ncrst
autoimage
rms reference mass out 02_03.rms time 2.0 :2
This is a brief summary of the cpptraj script:
Title | Description |
---|---|
trajin 02_Heat.nc | Load the trajectory 02_Heat.nc |
reference 01_Min.ncrst | Define the structure 01_Min.ncrst as the reference structure |
center :1-3 mass origin | Center the residues 1-3 by mass to the system origin |
image origin center | Image the atoms to the origin using the center of mass of the molecule |
rms reference mass out 02_03.rms time 2.0 :2 | Calculate the mass weighted RMSD using the reference and output to 02_03.rms |
Run cpptraj
- To actually run
cpptraj
, we must run it again from the terminal in the directory where theparm7
,nc
, and referencencrst
file is located.
$AMBERHOME/bin/cpptraj -p parm7 -i rmsd.cpptraj &> cpptraj.log
-
Now our RMSD data is stored in the file
02_03.rms.
Dr. Lee will plot the RMSD with xmgrace and have you investigate the output for deeper analysis. -
It is now quite simple to plot the data saved in the output files. A simple plotting program called
xmgrace
to automatically generate plots for the following MD simulation properties throughout the simulation. We use this for our convenience and the plots are found below for:
MD simulation temperature MD simulation density MD simulation total, potential, and kinetic energies.
Analysis data files
In the Alanine dipeptide MD temperature, you will see the linear increase in temperature during heating (0-20 ns). This is followed by the relatively stable temperature fluctuations about 300 K during the production simulation.
In the Alanine dipeptide MD density, during the 20-80ps, the density equilibrates to approximately 1 g/cm^3. This corresponds to a change in periodic box dimensions as the density of the system converges.
In the Alanine dipeptide MD total, potential, and kinetic energy, this plot shows the total system energy which can be decomposed to the total potential energy and the total kinetic energy.