Lesson 8: Introduction to Force Fields with Amber - joslynnlee/CHEM-454 GitHub Wiki

What is Amber?

Amber is a suite of biomolecular simulation programs. It began in the late 1970's, and is maintained by an active development community. (see website: http://ambermd.org/)

AMBER stands for Assisted Model Building and Energy Refinement. It refers not only to the molecular dynamics programs, but also a set of force fields that describe the potential energy function and parameters of the interactions of biomolecules.

In order to run a Molecular Dynamics simulation in Amber, each molecule's interactions are described by a molecular force field. The force field has specific parameters defined for each molecule.

Amber is designed to work with several simple types of force fields, although it is most commonly used with parametrization developed by Peter Kollman, his co-workers and “descendents”. The traditional parametrization uses fixed partial charges, centered on atoms.

Molecule/Ion Type Force Field
protein ff19SB
DNA OL15
RNA OL3
carbohydrates GLYCAM_06j
lipids lipids17
organic molecules (usually ligands) gaff2
ions match water
water model tip3p, spc/e, tip4pew, and OPC

More information on force fields (http://ambermd.org/AmberModels.php)

Starting in Amber

This tutorial is designed to provide an introduction to molecular dynamics (MD) simulations with Amber. It is designed around AMBER Tools v14 and assumes that you have not used Linux or Amber before. It is designed for new users who want to learn about how to run Molecular Dynamics simulations. It does however assume that you have a machine with AmberTools, VMD and xmgrace correctly installed.

sander is the basic MD engine of Amber. pmemd is the high performance implementation of the MD engine that contains a subset of features of sander. pmemd can also be run with acceleration from graphics processing units (GPU).

In order to run an MD simulation with sander or pmemd, three key files are needed:

  • parm7 - The file that describes the parameter and topology of the molecules in the system
  • rst7 - The file that describes the initial molecular coordinates of the system
  • mdin - The file that describes the settings for the Amber MD engine

Loading AMBER structures is slightly different to how you load a .pdb file in other modeling software. There are actually two files: parm7 and rst7. The first file is the parm7 topology file. This says nothing about the locations of the atoms in a molecule‒it simply defines what each atom is, what it is bonded to, and the parameters for each atom type. The rst7 file on the other hand simply lists a set of coordinates, but says nothing about the atoms. Thus, we need both these files to load the structure.

Running Amber

  1. Open the Terminal (in Mac or Linux).
  2. Let's create a directory (folder) in the Desktop Directory Amber. First, see where you are in your directory:
pwd
  1. You may be in the home directory. You will need to navigate to the Desktop and into `Amber:
cd Desktop/Amber
  1. Now make a new directory called Practice-yourinitials-todaysdate. I usually list the year-month-date when writing files as the year changes the last verses month and day.
mkdir Practice-JL-200927
  1. Type ls to see if the new directory has been created.
ls 
  1. Now change in into your new directory:
cd  Practice-JL-200927

Prepare topology and coordinate files

  1. We must start the program Amber. To see what environment we can run Amber, we can type
conda env list
  1. To activate the new environment, type in
conda activate AmberTools21

For this first practice, we can build an alanine dipeptide as an alanine amino acid capped with an acetyl group on the N-terminus and n-methylamide on the C-terminus.

alanine dipeptide

In order to build and solvate this molecule, you will need to start xLEaP. xLEaP has another command line interface and simple molecular graphics for building the system topology and define parameters for the molecules.

  1. Start xLEaP now with the xleap command.
xleap
  1. You should see a new window like this:

Warning: Do NOT click the "X" on any LEaP window. It will quit LEaP entirely.

Note: At this point it's a good idea to turn Num Lock off for the menus to work.

Load a protein and nucleic acid force field

A MD force field is the the Hamiltonian (potential energy function) and the related parameters that describe the intra- and intermolecular interactions between the molecules in the system. In MD, the Hamiltonian is integrated to describe the forces and velocities of the molecules.

The basic form of the Amber Hamiltonian is:

In order to run a molecular dynamics simulation, we need to load a force field to describe the potential energy of alanine dipeptide.

We will use the AMBER force field FF14SB for proteins and nucleic acids. FF14SB is based off FF12SB, an updated version of FF99SB, which in turn was based on the original Amber Cornell et al (1995) [ff94] force field. The most significant changes to the FF14SB force field included updated torsion terms for the protein Phi-Psi angles and refits of the torsion terms for side chains. Together these improved the estimation of alpha helices in these molecules.

  1. Now load the FF14SB force field in the xLEaP window:
> source leaprc.protein.ff14SB

Build alanine dipeptide

We can build an alanine dipeptide as an alanine amino acid capped with an acetyl group on the N-terminus and n-methylamide on the C-terminus. After we loaded the force field ff14SB, xLEaP now has these "units" available to build into a molecule. The sequence command will create a new unit from the available units and connect them together.

  1. Use sequence to create a new unit called foo out of the ACE, ALA, and NME units. Note that the ">" indicates the command line in xLEaP.
> foo = sequence { ACE ALA NME }
  1. Now you have a single alanine dipeptide molecule stored in the unit foo. xLEaP provides a very basic editor to examine and change units and molecules. To examine the structure of the alanine dipeptide molecule. Use the edit command to view the structure.
> edit foo

From here, you can examine the topology, structure, atom names, atom types, and partial charges of the molecule. Basic editing of the molecule is also possible. Warning: Do NOT click the "X" to close this window. It will exit xLEaP. To close this window, use Unit -> Close.

Solvate alanine dipeptide

The next step to prepare this alanine dipeptide system is to solvate the molecule with explicit water molecules. In this simulation we will use add TIP3P water molecules to the system. In this type of simulation, the system has periodic boundary conditions, meaning that molecules that exit one side of the system will wrap to the other side of the system. It is important that the periodic box is large enough, i.e. there is enough water surrounding alanine dipeptide, so that the alanine dipeptide molecule does not interact with its periodic images.

There are many water models available for MD simulations. However, for this tutorial we will use the TIP3P water model for this simulation.

  1. Solvate the system with the solvatebox command.
> source leaprc.water.tip3p

TIP3PBOX specifies the type of water box to solvate with. 10.0 indicates that the molecule should have a buffer of at least 10 Angstroms between alanine dipeptide and the periodic box wall.

More info on T1P3P

> solvatebox foo TIP3PBOX 10.0

Save the Amber parm7 and rst7 input files

Now we will save the parm7 and rst7 files to the current working directory. The unit foo now contains the alanine dipeptide molecule, water molecules, and the periodic box information necessary for simulation. The parameters will be assigned from the ff99SB force field.

  1. To save the parm7 and rst7 file use the saveamberparm command.
> saveamberparm foo parm7 rst7
  1. Warning: Pay close attention to the output of this command for any warnings or errors that your parm7 and rst7 file did not build correctly. Compare your output:
Checking Unit.
Building topology.
Building atom parameters.
Building bond parameters.
Building angle parameters.
Building proper torsion parameters.
Building improper torsion parameters.
 total 4 improper torsions applied
Building H-Bond parameters.
Incorporating Non-Bonded adjustments.
Not Marking per-residue atom chain types.
Marking per-residue atom chain types.
  (Residues lacking connect0/connect1 -
   these don't have chain types marked:

        res     total affected

        WAT     630
  )
 (no restraints)
  1. We are done working in xLEap, to quit, use quit.
> quit

Prepare Amber MD sander input files

The last components needed are the input files that define the program settings for each MD run. For this system, we will perform an energy minimization on the system, then slowly heat the system, and then do production MD at the desired temperature and pressure.

  1. Minimization
  2. Heating with constant volume and temperature (NVT) for 20ps from 0K to 300K
  3. Production MD with constant pressure and temperature (NPT) at 300K and 1atm for 60ps

We will save the trajectory and write to the output file every 2ps. The Langevin thermostat will be used to control the temperature. The random number generator will be initialized with a random seed.

To control all these settings, we will write a simple input file in a text editor. Linux and Mac have many text editors available. The text editor for Linux is gedit and for Mac is BBedit. Open the program for what computer you are using.

Minimization input

  1. In the editor, copy the following lines below:
Minimize
 &cntrl
  imin=1,
  ntx=1,
  irest=0,
  maxcyc=2000,
  ncyc=1000,
  ntpr=100,
  ntwx=0,
  cut=8.0,
 /
  1. Save the file as 01_Min.in, this code includes settings for minimization. The settings can be summarized as follows:

The settings can be summarized as follows:

Setting Explanation
imin=1 Choose a minimization run
ntx=1 Read coordinates but not velocities from ASCII formatted rst7 coordinate file
irest=0 Do not restart simulation. (not applicable to minimization)
maxcyc=2000 Maximum minimization cycles
ncyc=1000 The steepest descent algorithm for the first 0-ncyc cycles, then switches the conjugate gradient algorithm for ncyc-maxcyc cycles
ntpr=100 Print to the Amber mdout output file every ntpr cycles
ntwx=0 No Amber mdcrd trajectory file written (not applicable to minimization)
cut=8.0 Nonbonded cutoff distance in Angstroms (for PME, limit of the direct space sum - do NOT reduce this below 8.0. Higher numbers give slightly better accuracy but at vastly increased computational cost.)

Heating input

  1. In the editor, in a new file, copy the following lines below:
Heat
 &cntrl
  imin=0,
  ntx=1,
  irest=0,
  nstlim=10000,
  dt=0.002,
  ntf=2,
  ntc=2,
  tempi=0.0,
  temp0=300.0,
  ntpr=100,
  ntwx=100,
  cut=8.0,
  ntb=1,
  ntp=0,
  ntt=3,
  gamma_ln=2.0,
  nmropt=1,
  ig=-1,
 /
&wt type='TEMP0', istep1=0, istep2=9000, value1=0.0, value2=300.0 /
&wt type='TEMP0', istep1=9001, istep2=10000, value1=300.0, value2=300.0 /
&wt type='END' /
  1. Save the file as 02_Heat.in, this code includes settings for heating. The settings can be summarized as follows:
Setting Explanation
imin=0 Choose a molecular dynamics (MD) run [no minimization]
nstlim=10000 Number of MD steps in run (nstlim * dt = run length in ps)
dt=0.002 Time step in picoseconds (ps). The time length of each MD step
ntf=2 Setting to not calculate force for SHAKE constrained bonds
ntc=2 Enable SHAKE to constrain all bonds involving hydrogen
tempi=0.0 Initial thermostat temperature in K (see NMROPT section)
temp0=300.0 Final thermostat temperature in K (see NMROPT section)
ntwx=1000 Write Amber trajectory file mdcrd every ntwx steps
ntb=1 Periodic boundaries for constant volume
ntp=0 No pressure control
ntt=3 Temperature control with Langevin thermostat
gamma_ln=2.0 Langevin thermostat collision frequency
nmropt=1 NMR restraints and weight changes read (see NMROPT section)
ig=-1 Randomize the seed for the pseudo-random number generator [always a good idea unless you are debugging a simulation problem]

The final three lines allow the thermostat to change its target temperature throughout the simulation. For the first 9000 steps, the temperature will increase from 0K to 300K. For steps 9001 to 10000, the temperature will remain at 300K.

Production input

NTPR and NTWX are set very low so that it is possible to analyze this short simulation. Using these settings for longer MD simulations will create very large output and trajectory files and will be slower than regular MD. For real production MD, you'll need to increase NTPR and NTWX.

The production time of this simulation is only 60ps. Ideally, we would run this simulation for much longer, but in the interest of time for this tutorial, we have limited the production simulation time.

  1. In the editor, in a new file, copy the following lines below:
Production
 &cntrl
  imin=0,
  ntx=5,
  irest=1,
  nstlim=30000,
  dt=0.002,
  ntf=2,
  ntc=2,
  temp0=300.0,
  ntpr=100,
  ntwx=100,
  cut=8.0,
  ntb=2,
  ntp=1,
  ntt=3,
  gamma_ln=2.0,
  ig=-1,
 /
  1. Save the file as 03_Prod.in, this code includes settings for heating. The settings can be summarized as follows:
Setting Explanation
ntx=5 Read coordinates and velocities from unformatted rst7 coordinate file
irest=1 Restart previous MD run [This means velocities are expected in the rst7 file and will be used to provide initial atom velocities]
temp0=300.0 Thermostat temperature. Run at 300K
ntb=2 Use periodic boundary conditions with constant pressure
ntp=1 Use the Berendsen barostat for constant pressure simulation
  1. Check if these files have been created in your directory using Terminal.
pwd
ls

Run Amber MD sander

Now that we have all the ingredients: the parameter and topology file parm7, the coordinate file rst7, and the input files 01_Min.in, 02_Heat.in, 03_Prod.in, we are ready to run the actual minimization, heating, and production MD.

To do this, we will use the program sander, the general purpose MD engine of Amber (there is also a high performance version, termed pmemd, which is part of the commercial version of AMBER and the optimum choice of MD engine but for purposes of the tutorial sander will suffice). sander is run from the command line. On the command line, we can specify several more options and choose which files are to be used for input.