3. AMDock WORKFLOW and GUI - Valdes-Tresanco-MS/AMDock-win GitHub Wiki

To understand how AMDock works and what kind of information is needed, it is necessary to know which programs are included in the AMDock environment and which procedures are performed.

The AMDock main window has five tabs: 1) Home, 2) Docking Options, 3) Results Analysis, 4) Configuration and 5) Info.

3.1 Main tab

AMDock assists docking runs with AutoDock Vina and AutoDock4 (including the AutoDock4Zn force field for zinc-containing metalloproteins). Once the docking engine has been selected, you can perform a Simple Docking, Off-Target Docking or Scoring procedure.

AMDock proceeds by following 5 steps:

  1. Defining a working directory
  2. Preparing input files
  3. Defining a search space
  4. Run docking
  5. Analysis of results

3.1.1 Defining a working directory

First, a new directory is created (default name: Docking_Project) in a location specified by the user. This directory contains two folders: i) input, where the protein (.pdb, .ent, .pdbqt) and ligand (.pdb, .mol2, .pdbqt) files are stored, and ii) results, where the docking results are stored. The docking results depend on the selected procedure (Simple Docking, Off-Target Docking or Scoring); please see the TUTORIALS section for more information. A *.amdock file containing the summarized data of the docking procedure is generated and placed in the working directory. A log file containing the output of all the programs and algorithms used can also be generated.
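
The folder layout described above can be sketched in a few lines of Python. This is only an illustration of the resulting structure, not AMDock's own code; the function name create_project is hypothetical.

```python
from pathlib import Path


def create_project(location, name="Docking_Project"):
    """Create <location>/<name> with 'input' and 'results' subfolders.

    Hypothetical helper: it mirrors the layout described in the text,
    not the actual AMDock implementation.
    """
    project = Path(location) / name
    for sub in ("input", "results"):
        # parents=True also creates the project folder itself if needed
        (project / sub).mkdir(parents=True, exist_ok=True)
    return project
```

Calling create_project("some/location") returns the path of the new project folder; the protein and ligand files would then go into its input subfolder.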

3.1.2 Preparing input files

For Protein: The Cartesian coordinates of the protein are needed. These can be taken from a protein file or a protein-ligand complex. Coordinates usually come from X-ray crystallography, NMR spectroscopy, cryo-EM or model-building. AMDock accepts any of the three formats listed below:

.pdbqt - This means you have prepared the protein input file yourself and AMDock will not modify it. This is recommended only if you know the minimum procedure to generate a proper .pdbqt file.
.pdb or .ent - Files in these formats are processed by AMDock in order to generate a proper .pdbqt file (prepare_receptor4.py). AMDock will perform several tasks for you:

  1. remove heteroatoms, including water molecules, ions, crystallization reagents and ligands.
    Important! Even though ligand atoms are removed from the protein-ligand file, the ligand coordinates are stored and can be used to define a search space (see TUTORIALS section). If a cofactor or ligand must be kept at the active site because of its importance for the docking process, it is recommended to prepare the .pdbqt file in an external program, since AMDock removes all ligand heteroatoms present in the receptor (see TUTORIALS section).
  2. delete alternates (prepare_receptor4.py)
  3. determine protonation states (PDB2PQR v. 2.01)
  4. complete missing side chains (PDB2PQR v. 2.01) (Important! This software cannot model missing residues. These residues can be modeled with external software, e.g. MODELLER)
  5. merge charges and remove non-polar hydrogens (prepare_receptor4.py)
  6. align proteins (if “off-target” docking is selected)
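
Step 1 above (removing heteroatoms while keeping the ligand coordinates) can be illustrated with a minimal Python sketch. It assumes standard fixed-column PDB records and treats only residues named HOH as water; the function name split_hetero is hypothetical. AMDock itself relies on prepare_receptor4.py, so this only shows the idea.

```python
def split_hetero(pdb_text):
    """Separate protein ATOM records from HETATM records.

    Returns (protein_lines, hetero_coords). hetero_coords maps each
    non-water residue name to a list of (x, y, z) tuples.
    Assumption: fixed-column PDB format, water named HOH.
    """
    protein_lines = []
    hetero_coords = {}
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "ATOM":
            protein_lines.append(line)
        elif record == "HETATM":
            resname = line[17:20].strip()
            if resname == "HOH":
                continue  # water is dropped and its coordinates are not kept
            # PDB columns 31-38, 39-46 and 47-54 hold x, y and z
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            hetero_coords.setdefault(resname, []).append(xyz)
    return protein_lines, hetero_coords
```

The stored coordinates are what later allows a search box to be centered on the original ligand even though the ligand is no longer part of the receptor.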

For Ligand: The Cartesian coordinates of the ligand are needed. These usually come from X-ray crystallography, NMR spectroscopy or model-building. AMDock accepts any of the three formats listed below:

.pdbqt - This means that you have prepared the ligand input file yourself and AMDock will not modify it. This is recommended only if you know the minimum procedure to generate a proper .pdbqt file.
.pdb or .mol2 - Files in these formats are processed by AMDock in order to generate a proper .pdbqt file (prepare_ligand4.py). AMDock will perform several tasks for you:

  1. determine protonation states (OpenBabel v. 2.4.1 or ADT)
  2. merge charges and remove non-polar hydrogens (prepare_ligand4.py)

pH: A pH value can be set for determining the protonation states of both the receptor and the ligand.

As can be seen, AMDock should be able to handle most protein and ligand files successfully. Even so, it is always advisable to check the input files: remove unnecessary ions, solvent and cofactors, and check the connectivity (“connects”) records in ligand files.

3.1.3 Defining a search space

A search space is defined by the coordinates of the geometric center of the box and by the box dimensions. AMDock offers several options for defining the search space. These options are listed below and range from fully automatic to fully user-defined.

Automatic (known as “blind docking”): potential ligand-binding sites are identified and characterized by using the AutoLigand tool. The AutoLigand code generates objects (“FILL”) with the ligand’s dimensions. Since the binding site is unknown, boxes with optimal dimensions (see Optimal Box Size 1.1) are placed on the geometric center of each of the generated objects and an independent docking run is performed for each of the predicted binding sites. At the end, the binding sites are sorted according to the results of the docking runs. This option is recommended only when the location of the binding site is unknown.

Center on Residue(s): Centering the box directly on residues usually leads to a search space that is too small or too large, which compromises the reproducibility of the results. Therefore, instead of centering directly on the selected residues, AutoLigand is used to generate an object located at the geometric center of the selected residues. A box with optimal dimensions (see Optimal Box Size 1.1) is then placed at the geometric center of the generated object. This way, the box keeps its optimal size and does not need to be enlarged. This option is recommended when you know which residue(s) are located at the binding site. This information usually comes from mutagenesis experiments, comparison of proteins belonging to the same family, in silico prediction of binding sites, etc.

Center on Hetero: A box with optimal dimensions (see Optimal Box Size 1.1) is placed on the geometric center of the ligand. This option is available only if a protein-ligand complex was used as the protein input file. It is recommended for redocking experiments and for docking ligands that belong to the same family or share the same binding mode.
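
The geometric center referred to in these options is the unweighted mean of the atomic coordinates. A minimal Python sketch follows; the function name geometric_center is hypothetical and the box dimensions (Optimal Box Size 1.1) are not computed here.

```python
def geometric_center(coords):
    """Return the unweighted mean (x, y, z) of a list of coordinates.

    coords is a sequence of (x, y, z) tuples, for example the stored
    ligand atoms or the atoms of the selected residues.
    """
    n = len(coords)
    if n == 0:
        raise ValueError("no coordinates given")
    # zip(*coords) groups all x values, all y values and all z values
    return tuple(sum(axis) / n for axis in zip(*coords))
```

For a ligand with atoms at (0, 0, 0) and (2, 4, 6), the center is the midpoint of the two atoms.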

Box: The box’s center and dimensions are defined by the user. This option is usually preferred by expert users. Remember that the box’s center and dimensions should be chosen carefully to ensure a suitable search space. The search space can be visualized and modified at the user’s convenience using PyMOL. A convenient representation is shown in PyMOL, highlighting the most important elements for each method. This is an advantage over traditional programs with fewer options.
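
For AutoDock Vina, a user-defined box is expressed through the center_x, center_y, center_z, size_x, size_y and size_z options of its configuration file. The sketch below formats such a fragment in Python; vina_box_config is a hypothetical helper, and how AMDock itself writes this file is not described here.

```python
def vina_box_config(center, size):
    """Format the search-box lines of an AutoDock Vina config file.

    center and size are (x, y, z) tuples in Angstrom.
    Hypothetical helper for illustration only.
    """
    keys = ("center_x", "center_y", "center_z", "size_x", "size_y", "size_z")
    values = tuple(center) + tuple(size)
    return "\n".join(f"{key} = {value:.3f}" for key, value in zip(keys, values))
```

The returned text can be appended to a configuration file that also names the receptor and ligand files.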

3.1.4 Run docking

The running time depends on several factors, such as the number of poses and the number of rotatable bonds of the ligand. In general, AutoDock Vina tends to be faster than AutoDock4 by orders of magnitude. The whole process can be followed on the right side of the screen.

3.1.5 Analysis of results

When the docking run ends, you are taken automatically to the “Results Analysis” tab. There, a summary table shows the Binding Energies, Estimated Ki values and Ligand Efficiencies.
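
In docking programs these three quantities are commonly related as follows: the estimated Ki is exp(ΔG / RT), and the ligand efficiency is the binding energy divided by the number of heavy (non-hydrogen) atoms of the ligand. The Python sketch below uses these conventional expressions; it is an assumption about the usual definitions, not AMDock's documented implementation, and the temperature of 298.15 K and the function names are illustrative choices.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def estimated_ki(delta_g, temperature=298.15):
    """Estimated inhibition constant (mol/L) from a binding energy in kcal/mol.

    Uses Ki = exp(dG / (R*T)); the default temperature is an assumption.
    """
    return math.exp(delta_g / (R_KCAL * temperature))


def ligand_efficiency(delta_g, heavy_atoms):
    """Binding energy per heavy atom, in kcal/mol per atom."""
    if heavy_atoms <= 0:
        raise ValueError("heavy_atoms must be positive")
    return delta_g / heavy_atoms
```

With these definitions, a more negative binding energy gives a smaller estimated Ki, and a binding energy of -9.0 kcal/mol corresponds to a Ki in the sub-micromolar range.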

3.2 Results Analysis tab

The Results Analysis tab contains the final results of the docking run: score values, predicted Ki values and Ligand Efficiencies. Pressing the “Show in PyMOL” button in the bottom-left corner displays in PyMOL the complex between the receptor and the lowest-energy ligand pose. When performing “off-target” docking, both receptors can be visualized at the same time, which allows a better comparison between the two predicted complexes. Polar contacts are shown as dashed yellow lines, and a publication-quality image can be generated in a few clicks (see APPENDIX). Keep in mind that the view is created automatically by a PyMOL script, so it may not show the most informative angle.

3.3 Configuration and Info tabs

In the Configuration tab, it is possible to change several parameters of the programs and algorithms in AMDock, such as the number of processors used for the docking calculation, the number of poses, etc.

The info tab contains all the references to the documentation and original publications of the programs and algorithms included in AMDock.