Using mdpocket fpocket - dkoes/docs GitHub Wiki

mdpocket/fpocket

mdpocket and its associated programs are used by our group to find and analyze likely binding pockets in MD simulations. This page is an abbreviated version of the full documentation, which can be found here. The general workflow is: 1) prepare snapshots, 2) identify pockets, 3) manually extract those pockets, 4) run the scoring on the extracted pockets. If you get stuck, consult the full documentation or ask in the group's help channel on Slack.

Step 1 -- Preparing snapshots

The first thing you must do is align your simulation snapshots. This is done with the cpptraj tool (on the cluster, run module load amber to get it in your path). The commands below assume that everything you need is in the current working directory. First, make a snapshots directory:

mkdir snapshots

Then create the input file for cpptraj, called traj_input.txt:

parm foo.prmtop
trajin foo_md3.nc 1 last 10
strip :WAT,Cl-,Na+ outprefix stripped
autoimage
rms first @CA
trajout snapshots/snap.pdb pdb multi
run

foo.prmtop is the topology file that you used for the MD simulation

foo_md3.nc is the output trajectory file from the MD simulation (the 1 last 10 part will load every 10th frame). If you have multiple trajectories you can provide multiple trajin commands.

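strip is a cpptraj command that removes atoms. Here it removes the waters and the chloride and sodium ions. The outprefix stripped option also writes a matching stripped topology file whose name starts with stripped.

autoimage re-centers the system and re-images molecules that crossed the periodic box boundary, so the protein is not split across the box in the snapshots.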

rms first @CA is the command to align the trajectories to the first frame by the alpha carbons.

trajout snapshots/snap.pdb pdb multi writes each frame as an individual PDB file in the snapshots directory. These files are the input to the rest of the mdpocket/fpocket scripts.

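If you have three replicates, use the input below instead. It loads all three trajectories and aligns every frame to a reference structure rather than to the first frame: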

parm foo.prmtop
trajin foo_md3_1.nc 1 last 10
trajin foo_md3_2.nc 1 last 10
trajin foo_md3_3.nc 1 last 10
reference foo.rst7
strip :WAT,Cl-,Na+ outprefix stripped
autoimage
rms reference @CA    #Or a subset of stable residues for example :1-9,31-334@CA
trajout snapshots/snap.pdb pdb multi
run

Run whichever input file you created with cpptraj -i traj_input.txt
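When the number of replicates changes from system to system, it can be easier to generate traj_input.txt than to edit it by hand. The following is a minimal Python sketch that writes the same cpptraj commands shown above. The function name build_cpptraj_input and its parameters are hypothetical and not part of cpptraj or mdpocket; only the cpptraj commands themselves come from this page.

```python
def build_cpptraj_input(topology, trajectories, reference=None, stride=10,
                        strip_mask=":WAT,Cl-,Na+", align_mask="@CA",
                        out_prefix="snapshots/snap.pdb"):
    """Return cpptraj input text mirroring the commands shown on this page.

    Hypothetical helper: the argument names are illustrative only.
    """
    lines = [f"parm {topology}"]
    # One trajin line per replicate; "1 last N" keeps every Nth frame.
    lines += [f"trajin {traj} 1 last {stride}" for traj in trajectories]
    if reference is not None:
        lines.append(f"reference {reference}")
    lines.append(f"strip {strip_mask} outprefix stripped")
    lines.append("autoimage")
    # Align to the reference structure if one was given, else to the first frame.
    target = "reference" if reference is not None else "first"
    lines.append(f"rms {target} {align_mask}")
    lines.append(f"trajout {out_prefix} pdb multi")
    lines.append("run")
    return "\n".join(lines) + "\n"
```

For example, build_cpptraj_input("foo.prmtop", ["foo_md3_1.nc", "foo_md3_2.nc", "foo_md3_3.nc"], reference="foo.rst7") returns the text of the three-replicate input, which you can then write to traj_input.txt.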

Running cpptraj creates one file per frame in the snapshots directory, but unfortunately the file names are not very friendly. To give the files more appropriate names, run the following commands:

cd snapshots
for f in snap.pdb.*; do mv "$f" "snap_${f##*.}.pdb"; done
cd ..

This renames the files from snap.pdb.# to snap_#.pdb and moves you back into the starting directory. You will not need to be in the snapshots directory again.
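If you prefer to do the renaming from Python, the sketch below performs the same snap.pdb.# to snap_#.pdb rename. It is only an illustration: the function name rename_snapshots is hypothetical, and it assumes the cpptraj output names look exactly like snap.pdb.1, snap.pdb.2, and so on.

```python
import re
from pathlib import Path

# Matches the cpptraj "multi" output names, e.g. snap.pdb.12
_SNAPSHOT_NAME = re.compile(r"^snap\.pdb\.(\d+)$")

def rename_snapshots(directory="snapshots"):
    """Rename snap.pdb.N to snap_N.pdb and return the new paths in frame order."""
    renamed = []
    # Take a copy of the listing so renames do not disturb the iteration.
    for path in list(Path(directory).iterdir()):
        match = _SNAPSHOT_NAME.match(path.name)
        if match is None:
            continue  # leave unrelated files alone
        target = path.with_name(f"snap_{match.group(1)}.pdb")
        if target.exists():
            raise FileExistsError(f"refusing to overwrite {target}")
        path.rename(target)
        renamed.append((int(match.group(1)), target))
    # Sort numerically so that frame 10 comes after frame 2.
    return [target for _, target in sorted(renamed)]
```

Calling rename_snapshots() from the root directory renames everything in snapshots and leaves any other files there untouched.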

Step 2 -- Identifying pockets

Now that the snapshot files are created, we must create an input file for mdpocket/fpocket that lists each snapshot. Remember that you need to be in the root directory, NOT snapshots! Luckily, there is a built-in script that generates this file for you. The path below assumes you are running on the cluster. If you are not, the script comes with the mdpocket distribution and will be present wherever you installed the package.

python /net/pulsar/home/koes/dkoes/build/fpocket2/scripts/createMDPocketInputFile.py snapshots mdpocket_input.txt

This will create the mdpocket_input.txt file in the current working directory.
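Before starting a long mdpocket run, it can be worth confirming that every snapshot named in mdpocket_input.txt actually exists. The sketch below does this under the assumption that the file holds one snapshot path per line, relative to the directory you run it from. That layout is an assumption here, so check your own file first. The function name check_snapshot_list is hypothetical.

```python
from pathlib import Path

def check_snapshot_list(list_file="mdpocket_input.txt"):
    """Return (paths, missing) for the snapshot list.

    Assumes one snapshot path per line; blank lines are ignored.
    """
    text = Path(list_file).read_text()
    paths = [line.strip() for line in text.splitlines() if line.strip()]
    # Paths are resolved against the current working directory.
    missing = [p for p in paths if not Path(p).is_file()]
    return paths, missing
```

Run it from the root directory. An empty missing list means every listed snapshot was found.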

Now we are ready to identify some pockets!

/net/pulsar/home/koes/dkoes/build/fpocket2/bin/mdpocket -L mdpocket_input.txt

This creates a variety of output files. The one we are most interested in is mdpout_freq_grid.dx, a grid file that records how frequently each grid point is part of an open pocket across the snapshots.

Step 3 -- Manually extracting pockets

Load the mdpout_freq_grid.dx file into VMD for visualization. Once it is loaded, use the Graphics -> Representations window to change the Drawing Method to "Isosurface" and use the slider to adjust the "Isovalue". The isovalue is the frequency with which a given point is available across your snapshots (so 0.5 means open half the time). It is generally helpful to also load one of the snapshots so you can see where the identified pockets sit on the protein.

In PyMOL, you can load the mdpout_freq_grid.dx file along with one of the snapshots and use the command below with different isovalues:

isosurface surf, mdpout_freq_grid, <different isovalues>

Sadly, there is no great advice available here. You want pockets that are not too large and that also occur frequently. Once you have an isovalue that you are happy with, you can move on.
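One rough way to narrow down candidate isovalues before opening a viewer is to count how many grid points survive each threshold. The sketch below is an illustration under two assumptions: that the .dx file uses the usual OpenDX layout, with a header line ending in "items N data follows" followed by whitespace-separated values, and that a point counts as inside when its value is at or above the isovalue. The function names are hypothetical.

```python
import re

def read_dx_values(path):
    """Return the grid values of an OpenDX file as a flat list of floats."""
    values = []
    expected = None
    with open(path) as handle:
        for line in handle:
            if expected is None:
                # Skip header lines until the one that announces the data.
                match = re.search(r"items\s+(\d+)\s+data follows", line)
                if match:
                    expected = int(match.group(1))
                continue
            try:
                values.extend(float(token) for token in line.split())
            except ValueError:
                break  # reached trailing non-numeric lines
            if len(values) >= expected:
                break
    if expected is None:
        raise ValueError("no 'data follows' line found")
    if len(values) != expected:
        raise ValueError(f"expected {expected} values, read {len(values)}")
    return values

def count_above(values, isovalues):
    """Map each isovalue to the number of grid points at or above it."""
    return {iso: sum(1 for v in values if v >= iso) for iso in isovalues}
```

For example, count_above(read_dx_values("mdpout_freq_grid.dx"), [0.25, 0.5, 0.75]) shows how quickly the selected region shrinks as the isovalue rises. It does not replace looking at the isosurface on the protein.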

Now we need to extract those specific pockets. This is done with the extractISOPdb.py script, passing the isovalue you chose as the last argument.

python /net/pulsar/home/koes/dkoes/build/fpocket2/scripts/extractISOPdb.py mdpout_freq_grid.dx allpockets.pdb <isovalue>

Then load allpockets.pdb in PyMOL, split it into separate pockets if more than one is present, and save each pocket as its own PDB file (e.g. my_pocket.pdb).

Step 4 -- Run the scoring

Now you are finally ready to actually run the scoring!

/net/pulsar/home/koes/dkoes/build/fpocket2/bin/mdpocket -L mdpocket_input.txt -f my_pocket.pdb -v 10000

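Here -f my_pocket.pdb selects the pocket you extracted in Step 3, and -v sets the number of Monte Carlo iterations used for the pocket volume calculation. The mdpout_descriptors.txt output file contains all of the descriptors for each snapshot.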
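To get a quick feel for how a descriptor varies over the trajectory, you can summarize one column of mdpout_descriptors.txt. The sketch below assumes the file is a whitespace-separated table whose first line is a header of column names. Check the header of your own file for the exact names: the column name pock_volume used in the usage note is an assumption, and summarize_column is a hypothetical helper.

```python
import statistics

def summarize_column(path, column):
    """Return count, mean, min and max of one descriptor column.

    Assumes a whitespace-separated table with a header row.
    """
    with open(path) as handle:
        header = handle.readline().split()
        if column not in header:
            raise KeyError(f"{column!r} not found; available: {header}")
        index = header.index(column)
        values = []
        for line in handle:
            fields = line.split()
            if len(fields) <= index:
                continue  # skip blank or truncated lines
            values.append(float(fields[index]))
    if not values:
        raise ValueError("no data rows found")
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
    }
```

For example, summarize_column("mdpout_descriptors.txt", "pock_volume") would report the spread of that column across snapshots, provided the column exists under that name.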

Note: if you also want the druggability score, run fpocket on every snapshot:

for i in snapshots/*.pdb
do
  /net/pulsar/home/koes/dkoes/build/fpocket2/bin/fpocket -f "$i"
done

This will perform a full analysis of each snapshot, identifying all the pockets in that snapshot. To pull out the druggability score for only the pocket you are interested in, use the script:

python3 /net/pulsar/home/koes/dkoes/git/scripts/extract_pockets.py my_pocket.pdb snapshots > druggability.txt

This writes the descriptors for just my_pocket.pdb, including the druggability score, into druggability.txt.
