Protein w Solute: Protein Preparation - mms-fcul/CpHMD-container GitHub Wiki
The first step of every CpHMD preparation is fairly similar to that of a simple protein in solution. Most of the complexity of running CpHMD lies in preparing the protein we are going to simulate.
In this tutorial, we start from the structure of a beta-glucosidase (Bglu) with an inhibitor (PNW), taken from PDB ID 3AI0.
Since the inhibitor will be treated as an include and will not be added natively to the force field, we will prepare the protein separately from the solute.
The first step of the CpHMD workflow is to download the container into your home folder. The CpHMD container is distributed as GitHub releases, so you can either download it from the Download link or run:
wget https://github.com/mms-fcul/CpHMD-container/releases/latest/download/CpHMD.sif
With the container ready in your folder, the next step is to place the protein structure in the same folder as a .pdb file.
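One way to fetch the structure from the command line is sketched below. The helper names pdb_url and fetch_pdb are hypothetical, and the sketch assumes the entry is served by the RCSB file server under files.rcsb.org/download; downloading the file through the PDB website works just as well.

```shell
# Build the RCSB download URL for a PDB ID (assumed URL scheme; ID is upper-cased).
pdb_url() {
  id=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
  printf 'https://files.rcsb.org/download/%s.pdb\n' "$id"
}

# Download the entry into the current folder; -nc skips the download if the file exists.
fetch_pdb() {
  wget -nc "$(pdb_url "$1")"
}
```

Calling `fetch_pdb 3ai0` should then leave a file named 3AI0.pdb next to CpHMD.sif.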
Next, open this pdb in your preferred editor and save a first clean pdb with only the Bglu protein (Bglu-start.pdb) and a second clean pdb with only the inhibitor molecule PNW (PNW.pdb).
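If you prefer the command line over an editor, the split can be sketched with awk. This is only a rough approach: the function names are hypothetical, and it assumes a standard fixed-column PDB file in which the protein is stored as ATOM records and the inhibitor as HETATM records with residue name PNW (columns 18-20). Structures with several chains, alternate locations, or multiple inhibitor copies still need manual inspection.

```shell
# Keep only protein ATOM records (and TER lines) from a PDB file.
extract_protein() {
  awk 'substr($0,1,4)=="ATOM" || substr($0,1,3)=="TER"' "$1"
}

# Keep only HETATM records whose residue name (columns 18-20) equals $2.
extract_ligand() {
  awk -v res="$2" 'substr($0,1,6)=="HETATM" && substr($0,18,3)==res' "$1"
}
```

For this tutorial the calls would be `extract_protein 3AI0.pdb > Bglu-start.pdb` and `extract_ligand 3AI0.pdb PNW > PNW.pdb`, where 3AI0.pdb is the assumed name of the downloaded file.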
Warning
For proteins with uncapped termini (N-ter and C-ter exposed), the current code requires the termini to always be titrating, so they must be prepared accordingly.
The C-termini need to have any O2, OXT, or OT2 atoms removed from the starting pdb. These atoms will be recreated in the pdb2gmx step with the correct CpHMD termini block.
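The removal of these atoms can be sketched with a small awk filter. The function name strip_cter_oxygens is hypothetical, and the filter assumes a protein-only file in fixed-column PDB format, where the atom name sits in columns 13-16. It drops every ATOM record with one of the three names, which is only safe when those names occur at the C-termini alone.

```shell
# Drop ATOM records named OXT, O2 or OT2 (atom name in columns 13-16).
strip_cter_oxygens() {
  awk '{
    name = substr($0, 13, 4)
    gsub(/ /, "", name)                      # trim padding around the atom name
    is_atom = (substr($0, 1, 4) == "ATOM")
    if (is_atom && (name == "OXT" || name == "O2" || name == "OT2")) next
    print
  }' "$1"
}
```

A possible usage is `strip_cter_oxygens Bglu-start.pdb > tmp.pdb && mv tmp.pdb Bglu-start.pdb`, run before the renaming step.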
In this first step, we focus on the protein alone. We need to convert the amino acids from their regular names into the CpHMD-compliant names.
CpHMD includes tautomers for each state of the titratable amino acids. For example, an aspartic acid is not called ASP as in regular force fields; it keeps the first two letters, and the third character is the tautomer number, giving the set AS0, AS1, AS2, AS3 and AS4.
This comes from the titratable carboxylic acid of the side chain having 4 different positions where protonation can occur (2 per oxygen atom, with the hydrogen facing forward or backward: AS0-3) and one state with no protonation (AS4).
To prepare your pdb, the container already has an app that reads your pdb file and converts the amino acids of your choice to the CpHMD nomenclature. This tool is called pdb2cphmd and can be called like so:
singularity run --app pdb2cphmd CpHMD.sif [-h for help] -p <pdb file> \
-f <force-field (GROMOS;CHARMM;Amber)> \
-r <residue names to change for CpHMD - ASP GLU HIS, etc>
In this tutorial we used:
singularity run --app pdb2cphmd CpHMD.sif -p Bglu-start.pdb -f GROMOS -r ASP GLU LYS TYR HIS
This bash script goes through your starting pdb structure and changes the given amino acid names to their CpHMD-compatible form. It outputs a file named after your initial file with the suffix "_CpHMD.pdb".
Additionally, the script will copy the chosen force field from the container into your base directory. This will enable the future treatment of the molecules with the CpHMD compatible force field.
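A quick way to check the renaming is to count residues per residue name before and after running pdb2cphmd. The helper below is a hypothetical sketch that assumes both files follow the fixed-column PDB format (residue name in columns 18-20, chain and residue number in columns 21-27).

```shell
# Count distinct residues per residue name among ATOM records.
count_residues() {
  awk 'substr($0,1,4)=="ATOM" {
    res = substr($0, 18, 3)
    key = res substr($0, 21, 7)              # name + chain + residue number
    if (!(key in seen)) { seen[key] = 1; count[res]++ }
  }
  END { for (r in count) print r, count[r] }' "$1" | sort
}
```

Running `count_residues Bglu-start.pdb` shows which titratable residues are present, and `count_residues Bglu-start_CpHMD.pdb` should list the CpHMD names in place of the residues passed to -r.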
Note
Any user-modified or user-created residue in the pdb will not be present in the force field. To add a custom residue/molecule to the CpHMD simulation, one must add that residue block to the force field provided by the pdb2cphmd app.
To complete the setup of the protein system, we need to generate the topology that will be used for our system.
In our case, we will use the GROMACS tool pdb2gmx. You can run this command with your own GROMACS installation or, if preferred, use the GROMACS 2024.3 provided within the container, without installing any other software.
In this tutorial we will always use the GROMACS provided within the container:
singularity exec CpHMD.sif gmx pdb2gmx -f ./<your pdb file.pdb> -p <output-name>.top -o <output-name>.gro -ignh -renum -ter
Used in the tutorial:
singularity exec CpHMD.sif gmx pdb2gmx -f ./Bglu-start_CpHMD.pdb -p Bglu.top -o Bglu.gro -ignh -renum -ter
The program will ask you to select the force field to be used, which should be the first option (1) since the CpHMD compatible FF is present in the folder.
Alternatively, you can specify the force field with the -ff option, giving the specific force field folder name: G54a7pH for GROMOS, CHARMM36pH for CHARMM36, and Amber14SBpH for AMBER.
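As a sketch of the -ff route, the snippet below maps the force field family given to pdb2cphmd onto the folder names listed above and prints the resulting pdb2gmx command as a dry run. The function name ff_folder is hypothetical; the folder names and the remaining flags are the ones used in this tutorial.

```shell
# Map the pdb2cphmd force field family to its CpHMD force field folder name.
ff_folder() {
  case "$1" in
    GROMOS) echo "G54a7pH" ;;
    CHARMM) echo "CHARMM36pH" ;;
    Amber)  echo "Amber14SBpH" ;;
    *) echo "unknown force field: $1" >&2; return 1 ;;
  esac
}

# Dry run: print the command; remove the leading "echo" to execute it.
ff=$(ff_folder GROMOS) || exit 1
echo singularity exec CpHMD.sif gmx pdb2gmx -f ./Bglu-start_CpHMD.pdb \
  -p Bglu.top -o Bglu.gro -ignh -renum -ter -ff "$ff"
```

With -ff given, pdb2gmx skips the force field menu but still asks for the water type and the termini.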
Then you will select both the water type and the termini of your system.
Important
If your protein system has uncapped termini, take note of which option to give for the termini construction. If the protein termini are uncapped: when pdb2gmx asks for the termini, select the option CpH+ (3) for the N-terminus and CpH- (4) for the C-terminus.
If the protein termini are capped: you won't need to treat these residues, so when asked for the termini construction select None.