2023.05.08.6GJ8_ChemGrid_Score - tbalius/teb_docking_test_sets GitHub Wiki
Here we describe docking with the ChemGrid scoring function.
This is automated with scripts.
The scripts are made available in this repository.
Additional scripts are available in the teb_scripts_programs repository.
There are the following software requirements:
- The user will need a Python 2 environment that has the following packages: Numpy.
- The user will need a Python 3 environment that has the following packages: Numpy, RDKit.
- The user will need AmberTools.
- The user will need ChemAxon, Corina, and Amsol7.1 for database generation.
- The user will also need a copy of UCSF Chimera for visualization and running on the command line.
- The user will need DOCK 3.7 to use blastermaster (a python 2 program) for preparing the receptor.
- The user will need DOCK 6.12. This is not released yet.
Recommended directory structure:
6GJ8_ChemGrid_tutorial
|-- 01.dockprep
|-- 02.docking
+-- 03.enrichment
Prepare the system for docking in the 01.dockprep directory.
Scripts are available here.
Download the Dock Prep scripts from GitHub by running the following wget commands:
wget https://raw.githubusercontent.com/tbalius/teb_docking_test_sets/main/scripts_for_tutorial/scripts_for_2023.05.08.6GJ8_ChemGrid_Score/0001.processing.csh
wget https://raw.githubusercontent.com/tbalius/teb_docking_test_sets/main/scripts_for_tutorial/scripts_for_2023.05.08.6GJ8_ChemGrid_Score/0002.run_chimera_dockprep.csh
wget https://raw.githubusercontent.com/tbalius/teb_docking_test_sets/main/scripts_for_tutorial/scripts_for_2023.05.08.6GJ8_ChemGrid_Score/0002p5.fix_broken_h.csh
wget https://raw.githubusercontent.com/tbalius/teb_docking_test_sets/main/scripts_for_tutorial/scripts_for_2023.05.08.6GJ8_ChemGrid_Score/0003.run.antechamber.csh
wget https://raw.githubusercontent.com/tbalius/teb_docking_test_sets/main/scripts_for_tutorial/scripts_for_2023.05.08.6GJ8_ChemGrid_Score/0004.blastermaster.csh
wget https://raw.githubusercontent.com/tbalius/teb_docking_test_sets/main/scripts_for_tutorial/scripts_for_2023.05.08.6GJ8_ChemGrid_Score/0004.blastermaster_manualProt.csh
wget https://raw.githubusercontent.com/tbalius/teb_docking_test_sets/main/scripts_for_tutorial/scripts_for_2023.05.08.6GJ8_ChemGrid_Score/0005.convert_dock3_to_dock6.csh
wget https://raw.githubusercontent.com/tbalius/teb_docking_test_sets/main/scripts_for_tutorial/scripts_for_2023.05.08.6GJ8_ChemGrid_Score/0006.run_chimera_dockprep_lig.csh
wget https://raw.githubusercontent.com/tbalius/teb_docking_test_sets/main/scripts_for_tutorial/scripts_for_2023.05.08.6GJ8_ChemGrid_Score/0007.run_ligand_building.csh
wget https://raw.githubusercontent.com/tbalius/teb_docking_test_sets/main/scripts_for_tutorial/scripts_for_2023.05.08.6GJ8_ChemGrid_Score/0007.run_ligand_building_from_smi.csh
wget https://raw.githubusercontent.com/tbalius/teb_scripts_programs/master/zzz.scripts/mol2toDOCK37type.py
wget https://raw.githubusercontent.com/tbalius/teb_scripts_programs/master/zzz.scripts/chimera_dockprep.py
wget https://raw.githubusercontent.com/tbalius/teb_scripts_programs/master/zzz.scripts/chimera_addh.py
wget https://raw.githubusercontent.com/tbalius/teb_scripts_programs/master/zzz.scripts/replace_atom_ele.py
Copy and paste the code block into a C shell script and run the script:
>> csh download_scripts.csh
You will need to modify several of the scripts, go through the scripts and change any hardcoded paths.
If there is a call to a script you do not have, it is most likely available in the teb_scripts_programs repository.
To prepare the receptor and cofactors files, run the following scripts in order:
csh 0001.processing.csh
csh 0002.run_chimera_dockprep.csh
csh 0002p5.fix_broken_h.csh
csh 0003.run.antechamber.csh
The 0001.processing.csh script downloads the 6GJ8.pdb file from the web and splits the file into receptor (with magnesium ion), ligand, and cofactor files called rec.pdb, lig.pdb, and cof.pdb, respectively. The 0002.run_chimera_dockprep.csh script first calls the 0002.replace_atom_ele.py script, which converts the carbon to an oxygen in the cofactor file to change the cofactor from GCP to GTP, then calls Chimera's Dock Prep and AddH functions on the receptor and cofactor files. The 0002p5.fix_broken_h.csh script corrects the incorrectly protonated cofactor file by replacing the name and coordinates of the misplaced hydrogen with the name and coordinates of the correctly placed hydrogen. The 0003.run.antechamber.csh script runs Amber's AnteChamber.
To generate spheres and grids for docking, run the following scripts:
- Note that you will need to wait for the blastermaster jobs to finish before proceeding to the next step.
csh 0004.blastermaster.csh
csh 0004.blastermaster_manualProt.csh
The 0004.blastermaster.csh script runs blastermaster on the receptor without the cofactor. The 0004.blastermaster_manualProt.csh script runs blastermaster on the receptor with the cofactor.
Blastermaster generates grids for DOCK 3. To convert the grids to DOCK 6 readable grids, run the 0005.convert_dock3_to_dock6.csh script:
csh 0005.convert_dock3_to_dock6.csh
To prepare the ligand for docking, run the following scripts in order:
csh 0006.run_chimera_dockprep_lig.csh
csh 0007.run_ligand_building.csh
csh 0007.run_ligand_building_from_smi.csh
The 0006.run_chimera_dockprep_lig.csh will run Chimera's Dock Prep function on the ligand file, lig.pdb. The 0007.run_ligand_building.csh and 0007.run_ligand_building_from_smi.csh scripts generate the ligand database for docking. The scripts called in the 0007.run_ligand_building.csh and 0007.run_ligand_building_from_smi.csh are distributed with DOCK 6.
The database generation scripts use the following programs: ChemAxon, Corina, and Amsol7.1. As an alternative, the tldr web server can be used to generate the ligand database using the DOCK 3.8 pipeline. A walkthrough of the using tldr is given in the main tutorial.
Change directories to the 02.docking directory.
This calculation performs an energy evaluation on the crystallographic ligand conformation without any movement.
Create the input file for the single point calculation called single_point.in and format it as follows:
conformer_search_type rigid
use_internal_energy yes
internal_energy_rep_exp 12
internal_energy_cutoff 100.0
ligand_atom_file ../01.dockprep/chimera_lig/db2_build/lig_solv.mol2
limit_max_ligands no
skip_molecule no
read_mol_solvation yes
calculate_rmsd yes
use_rmsd_reference_mol yes
rmsd_reference_filename ../01.dockprep/chimera_lig/db2_build/lig_solv.mol2
use_database_filter no
orient_ligand no
bump_filter no
score_molecules yes
contact_score_primary no
grid_score_primary no
gist_score_primary no
multigrid_score_primary no
dock3.5_score_primary yes
dock3.5_vdw_score yes
dock3.5_grd_prefix ../01.dockprep/blastermaster_cof/dock6files/chem52
dock3.5_electrostatic_score yes
dock3.5_ligand_desolvation_score volume
dock3.5_solvent_occlusion_file ../01.dockprep/blastermaster_cof/dock6files/ligand.desolv.heavy
dock3.5_redistribute_positive_desolvation no
dock3.5_hydrogen_desolvation_grid yes
dock3.5_hydrogen_solvent_occlusion_file ../01.dockprep/blastermaster_cof/dock6files/ligand.desolv.hydrogen
dock3.5_receptor_desolvation_score no
dock3.5_write_atomic_energy_contrib no
dock3.5_score_vdw_scale 1
dock3.5_score_es_scale 1
minimize_ligand no
simplex_final_min no
simplex_random_seed 0
simplex_restraint_min no
atom_model all
vdw_defn_file ${dock6_path}/parameters/vdw_AMBER_parm94.dock3_7.defn
flex_defn_file ${dock6_path}/parameters/flex.defn
flex_drive_file ${dock6_path}/parameters/flex_drive.tbl
ligand_outfile_prefix single_point.out
write_mol_solvation no
write_orientations no
num_scored_conformers 1
score_threshold 100.0
rank_ligands no
To run the single point calculation:
>> ${dock6_path}/bin/dock6 -i single_point.in -o single_point.out
- ${dock6_path} is a variable (placeholder) for the DOCK 6 installation path (e.g. /home/baliuste/zzz.github/dock6)
This calculation performs a minimization run on the crystallographic ligand conformation, allowing the ligand to subtly adjust its conformation (internal degrees of freedom) and orientation (3 translational and 3 orientational degrees of freedom).
Create the input file for energy minimization called min.in and format it as follows:
conformer_search_type rigid
use_internal_energy yes
internal_energy_rep_exp 12
internal_energy_cutoff 100.0
ligand_atom_file ../01.dockprep/chimera_lig/db2_build/lig_solv.mol2
limit_max_ligands no
skip_molecule no
read_mol_solvation yes
calculate_rmsd yes
use_rmsd_reference_mol yes
rmsd_reference_filename ../01.dockprep/chimera_lig/db2_build/lig_solv.mol2
use_database_filter no
orient_ligand no
bump_filter no
score_molecules yes
contact_score_primary no
grid_score_primary no
gist_score_primary no
multigrid_score_primary no
dock3.5_score_primary yes
dock3.5_vdw_score yes
dock3.5_grd_prefix ../01.dockprep/blastermaster_cof/dock6files/chem52
dock3.5_electrostatic_score yes
dock3.5_ligand_desolvation_score volume
dock3.5_solvent_occlusion_file ../01.dockprep/blastermaster_cof/dock6files/ligand.desolv.heavy
dock3.5_redistribute_positive_desolvation no
dock3.5_hydrogen_desolvation_grid yes
dock3.5_hydrogen_solvent_occlusion_file ../01.dockprep/blastermaster_cof/dock6files/ligand.desolv.hydrogen
dock3.5_receptor_desolvation_score no
dock3.5_write_atomic_energy_contrib no
dock3.5_score_vdw_scale 1
dock3.5_score_es_scale 1
minimize_ligand yes
simplex_max_iterations 1000
simplex_tors_premin_iterations 0
simplex_max_cycles 1
simplex_score_converge 0.1
simplex_cycle_converge 1.0
simplex_trans_step 1.0
simplex_rot_step 0.1
simplex_tors_step 10.0
simplex_final_min no
simplex_random_seed 0
simplex_restraint_min no
atom_model all
vdw_defn_file ${dock6_path}/parameters/vdw_AMBER_parm94.dock3_7.defn
flex_defn_file ${dock6_path}/parameters/flex.defn
flex_drive_file ${dock6_path}/parameters/flex_drive.tbl
ligand_outfile_prefix min.out
write_mol_solvation no
write_orientations no
num_scored_conformers 1
score_threshold 100.0
rank_ligands no
Run energy minimization:
>> ${dock6_path}/bin/dock6 -i min.in -o min.out
To view the RMSD of the minimized ligand conformation, head the generated MOL2 file:
>> head min.out_scored.mol2
The minimized ligand conformation can be visualized in Chimera by opening the output MOL2 file with ViewDock.
To test orienting without considering ligand flexibility, we perform rigid docking. We use the crystallographic ligand conformation for this test. This is primarily used for testing purposes only, not for screening.
Create the input file for rigid docking called rigid.in and format it as follows:
conformer_search_type rigid
use_internal_energy yes
internal_energy_rep_exp 12
internal_energy_cutoff 100.0
ligand_atom_file ../01.dockprep/chimera_lig/db2_build/lig_solv.mol2
limit_max_ligands no
skip_molecule no
read_mol_solvation yes
calculate_rmsd yes
use_rmsd_reference_mol yes
rmsd_reference_filename ../01.dockprep/chimera_lig/db2_build/lig_solv.mol2
use_database_filter no
orient_ligand yes
automated_matching no
automated_matching_iteration no
distance_tolerance 0.25
distance_minimum 0.0
nodes_minimum 3
nodes_maximum 5
receptor_site_file ../01.dockprep/blastermaster_cof/dock6files/matching_spheres.sph
max_orientations 1000
critical_points no
chemical_matching no
use_ligand_spheres no
bump_filter no
score_molecules yes
contact_score_primary no
grid_score_primary no
gist_score_primary no
multigrid_score_primary no
dock3.5_score_primary yes
dock3.5_vdw_score yes
dock3.5_grd_prefix ../01.dockprep/blastermaster_cof/dock6files/chem52
dock3.5_electrostatic_score yes
dock3.5_ligand_desolvation_score volume
dock3.5_solvent_occlusion_file ../01.dockprep/blastermaster_cof/dock6files/ligand.desolv.heavy
dock3.5_redistribute_positive_desolvation no
dock3.5_hydrogen_desolvation_grid yes
dock3.5_hydrogen_solvent_occlusion_file ../01.dockprep/blastermaster_cof/dock6files/ligand.desolv.hydrogen
dock3.5_receptor_desolvation_score no
dock3.5_write_atomic_energy_contrib no
dock3.5_score_vdw_scale 1
dock3.5_score_es_scale 1
minimize_ligand yes
simplex_max_iterations 1000
simplex_tors_premin_iterations 0
simplex_max_cycles 1
simplex_score_converge 0.1
simplex_cycle_converge 1.0
simplex_trans_step 1.0
simplex_rot_step 0.1
simplex_tors_step 10.0
simplex_final_min no
simplex_random_seed 0
simplex_restraint_min no
atom_model all
vdw_defn_file ${dock6_path}/parameters/vdw_AMBER_parm94.dock3_7.defn
flex_defn_file ${dock6_path}/parameters/flex.defn
flex_drive_file ${dock6_path}/parameters/flex_drive.tbl
ligand_outfile_prefix rigid.out
write_mol_solvation no
write_orientations no
num_scored_conformers 1
score_threshold 100.0
rank_ligands no
Run rigid docking:
>> ${dock6_path}/bin/dock6 -i rigid.in -o rigid.out
To view the RMSD of the docked ligand conformation, head the generated MOL2 file:
>> head rigid.out_scored.mol2
The rigid.out_scored.mol2 file can be visualized using Chimera's ViewDock:
This test combines orienting and ligand flexibility. We will dock using both the crystallographic ligand conformation and the ligand database built from SMILES for this test.
Flexible docking with crystallographic ligand conformations
Create the input file for flexible docking with the crystallographic ligand conformation called flex.in and format it as follows:
conformer_search_type flex
write_fragment_libraries no
user_specified_anchor no
limit_max_anchors no
min_anchor_size 5
pruning_use_clustering yes
pruning_max_orients 1000
pruning_clustering_cutoff 100
pruning_conformer_score_cutoff 100.0
pruning_conformer_score_scaling_factor 1.0
use_clash_overlap no
write_growth_tree no
use_internal_energy yes
internal_energy_rep_exp 12
internal_energy_cutoff 100.0
ligand_atom_file ../01.dockprep/chimera_lig/db2_build/lig_solv.mol2
limit_max_ligands no
skip_molecule no
read_mol_solvation yes
calculate_rmsd yes
use_rmsd_reference_mol yes
rmsd_reference_filename ../01.dockprep/chimera_lig/db2_build/lig_solv.mol2
use_database_filter no
orient_ligand yes
automated_matching yes
receptor_site_file ../01.dockprep/blastermaster_cof/dock6files/matching_spheres.sph
max_orientations 1000
critical_points no
chemical_matching no
use_ligand_spheres no
bump_filter no
score_molecules yes
contact_score_primary no
grid_score_primary no
gist_score_primary no
multigrid_score_primary no
dock3.5_score_primary yes
dock3.5_vdw_score yes
dock3.5_grd_prefix ../01.dockprep/blastermaster_cof/dock6files/chem52
dock3.5_electrostatic_score yes
dock3.5_ligand_desolvation_score volume
dock3.5_solvent_occlusion_file ../01.dockprep/blastermaster_cof/dock6files/ligand.desolv.heavy
dock3.5_redistribute_positive_desolvation no
dock3.5_hydrogen_desolvation_grid yes
dock3.5_hydrogen_solvent_occlusion_file ../01.dockprep/blastermaster_cof/dock6files/ligand.desolv.hydrogen
dock3.5_receptor_desolvation_score no
dock3.5_write_atomic_energy_contrib no
dock3.5_score_vdw_scale 1
dock3.5_score_es_scale 1
minimize_ligand yes
minimize_anchor yes
minimize_flexible_growth yes
use_advanced_simplex_parameters no
minimize_flexible_growth_ramp no
simplex_max_cycles 1
simplex_score_converge 0.1
simplex_cycle_converge 1.0
simplex_trans_step 1.0
simplex_rot_step 0.1
simplex_tors_step 10.0
simplex_anchor_max_iterations 500
simplex_grow_max_iterations 250
simplex_grow_tors_premin_iterations 0
simplex_final_min no
simplex_random_seed 0
simplex_restraint_min no
atom_model all
vdw_defn_file ${dock6_path}/parameters/vdw_AMBER_parm94.dock3_7.defn
flex_defn_file ${dock6_path}/parameters/flex.defn
flex_drive_file ${dock6_path}/parameters/flex_drive.tbl
ligand_outfile_prefix flex.out
write_mol_solvation no
write_orientations no
num_scored_conformers 1
score_threshold 100.0
rank_ligands no
Run flexible docking:
>> ${dock6_path}/bin/dock6 -i flex.in -o flex.out
To view the RMSD of the top scoring docked ligand conformation, head the generated MOL2 file:
>> head flex.out_scored.mol2
The flex.out_scored.mol2 file can be visualized using Chimera's ViewDock:
In the example image, the crystallographic ligand is shown in blue and the docked pose is shown in green.
Flexible docking with ligand database built from SMILES
Create the input file for flexible docking with the ligand database generated from SMILES called flex_from_smi.in and format it as follows:
conformer_search_type flex
write_fragment_libraries no
user_specified_anchor no
limit_max_anchors no
min_anchor_size 5
pruning_use_clustering yes
pruning_max_orients 1000
pruning_clustering_cutoff 100
pruning_conformer_score_cutoff 100.0
pruning_conformer_score_scaling_factor 1.0
use_clash_overlap no
write_growth_tree no
use_internal_energy yes
internal_energy_rep_exp 12
internal_energy_cutoff 100.0
ligand_atom_file ../01.dockprep/db2_build_from_smi/db_build_working/F0K/F0K_0_output_lig_solv.mol2
limit_max_ligands no
skip_molecule no
read_mol_solvation yes
calculate_rmsd no
use_database_filter no
orient_ligand yes
automated_matching yes
receptor_site_file ../01.dockprep/blastermaster_cof/dock6files/matching_spheres.sph
max_orientations 1000
critical_points no
chemical_matching no
use_ligand_spheres no
bump_filter no
score_molecules yes
contact_score_primary no
grid_score_primary no
gist_score_primary no
multigrid_score_primary no
dock3.5_score_primary yes
dock3.5_vdw_score yes
dock3.5_grd_prefix ../01.dockprep/blastermaster_cof/dock6files/chem52
dock3.5_electrostatic_score yes
dock3.5_ligand_desolvation_score volume
dock3.5_solvent_occlusion_file ../01.dockprep/blastermaster_cof/dock6files/ligand.desolv.heavy
dock3.5_redistribute_positive_desolvation no
dock3.5_hydrogen_desolvation_grid yes
dock3.5_hydrogen_solvent_occlusion_file ../01.dockprep/blastermaster_cof/dock6files/ligand.desolv.hydrogen
dock3.5_receptor_desolvation_score no
dock3.5_write_atomic_energy_contrib no
dock3.5_score_vdw_scale 1
dock3.5_score_es_scale 1
minimize_ligand yes
minimize_anchor yes
minimize_flexible_growth yes
use_advanced_simplex_parameters no
minimize_flexible_growth_ramp no
simplex_max_cycles 1
simplex_score_converge 0.1
simplex_cycle_converge 1.0
simplex_trans_step 1.0
simplex_rot_step 0.1
simplex_tors_step 10.0
simplex_anchor_max_iterations 500
simplex_grow_max_iterations 250
simplex_grow_tors_premin_iterations 0
simplex_final_min no
simplex_random_seed 0
simplex_restraint_min no
atom_model all
vdw_defn_file ${dock6_path}/parameters/vdw_AMBER_parm94.dock3_7.defn
flex_defn_file ${dock6_path}/parameters/flex.defn
flex_drive_file ${dock6_path}/parameters/flex_drive.tbl
ligand_outfile_prefix flex_from_smi.out
write_mol_solvation no
write_orientations no
num_scored_conformers 1
score_threshold 100.0
rank_ligands no
Run flexible docking using the new input file:
>> ${dock6_path}/bin/dock6 -i flex_from_smi.in -o flex_from_smi.out
To view the RMSD of the top scoring docked ligand conformation, head the generated MOL2 file:
>> head flex_from_smi.out_scored.mol2
The flex_from_smi.out_scored.mol2 file can be visualized using Chimera's ViewDock:
In the example image, the crystallographic ligand is shown in blue and the docked pose is shown in red.
In both cases, flexible docking with the crystallographic ligand conformation and with the ligand database built from SMILES, the docked pose only partially matches the crystallographic ligand. While DOCK is able to position the pocket-facing portion of the ligand correctly, DOCK is unable to do so for the solvent exposed portion of the ligand ending with the methylimidazole ring. When docking to the dimer, the additional interactions from the second protein allow DOCK to find poses that better match the crystallographic ligand pose as seen here.
To dock the ligand database built from SMILES using the hierarchical database (HDB) sampling routine, first create a file called sdi.txt that lists the .db2.gz files created by the 0007.run_ligand_building_from_smi.csh script.
>> ls ../01.dockprep/db2_build_from_smi/db_build_working/F0K/F0K_*_db2/*.db2.gz > sdi.txt
Create the input file for docking with HDB called hdb.in and format it as follows:
conformer_search_type HDB
num_per_search 10
skip_broken no
hdb_db2_input_file sdi.txt
hdb_db2_search_score_threshold 10.0
use_internal_energy yes
internal_energy_rep_exp 12
internal_energy_cutoff 100.0
use_database_filter no
orient_ligand yes
automated_matching yes
receptor_site_file ../01.dockprep/blastermaster_cof/dock6files/matching_spheres.sph
max_orientations 1000
critical_points no
chemical_matching no
use_ligand_spheres no
bump_filter no
score_molecules yes
contact_score_primary no
grid_score_primary no
gist_score_primary no
multigrid_score_primary no
dock3.5_score_primary yes
dock3.5_vdw_score yes
dock3.5_grd_prefix ../01.dockprep/blastermaster_cof/dock6files/chem52
dock3.5_electrostatic_score yes
dock3.5_ligand_desolvation_score volume
dock3.5_solvent_occlusion_file ../01.dockprep/blastermaster_cof/dock6files/ligand.desolv.heavy
dock3.5_redistribute_positive_desolvation no
dock3.5_hydrogen_desolvation_grid yes
dock3.5_hydrogen_solvent_occlusion_file ../01.dockprep/blastermaster_cof/dock6files/ligand.desolv.hydrogen
dock3.5_receptor_desolvation_score no
dock3.5_write_atomic_energy_contrib no
dock3.5_score_vdw_scale 1
dock3.5_score_es_scale 1
minimize_ligand yes
simplex_max_iterations 1000
simplex_tors_premin_iterations 0
simplex_max_cycles 1
simplex_score_converge 0.1
simplex_cycle_converge 1.0
simplex_trans_step 1.0
simplex_rot_step 0.1
simplex_tors_step 10.0
simplex_final_min no
simplex_random_seed 0
simplex_restraint_min no
atom_model all
vdw_defn_file ${dock6_path}/parameters/vdw_AMBER_parm94.dock3_7.defn
flex_defn_file ${dock6_path}/parameters/flex.defn
flex_drive_file ${dock6_path}/parameters/flex_drive.tbl
ligand_outfile_prefix hdb.out
write_mol_solvation no
write_orientations no
num_scored_conformers 1
score_threshold 100.0
rank_ligands no
Run docking using the hdb.in input file:
>> ${dock6_path}/bin/dock6 -i hdb.in -o hdb.out
In enrichment calculations, a set of known binders, actives, and a set of presumed non-binders, decoys, are docked to test the ability of the docking software to distinguish between actives and decoys. In this section, an example enrichment calculation is performed using DOCK 6 and a sample list of 5 ligands.
Change directories to the 03.enrichment directory.
This tutorial will use the same 5 ligands from the PDB that bind in the Switch I/II (SOS) Pocket given in the Grid Score tutorial.
Make a subdirectory called databases.
>> mkdir databases
In the databases directory, create a ligand SMILES file called actives.smi and format it as follows:
Cn1cc(nc1)Cn2ccc3c2cc(cc3)CNCc4c(c5ccccc5[nH]4)[C@@H]6c7cc(ccc7C(=O)N6)O F0K
C[NH+](C)Cc1ccc(cc1)Nc2ccc(c(c2OC)F)c3cccc(c3)OC F8Q
c1cc2c(cc1Cl)OC[C@H](O2)CN 9R5
c1ccc(c(c1)C(=S)N2CCCC2)O 0QW
CN(C)Cc1ccc(cc1)Nc2ccc(nc2OC)c3cccc4c3OCCO4 D1W
Each line gives a ligand SMILES string and identifier, separated by whitespace. The PDB can be searched with the three-character ligand identifier to view the ligand summary page and PDB entries in which the ligand appears. The BI-2852 molecule has the id F0K.
The ligand database can be prepared using tldr as in the main tutorial (no requirements), or by using the build_ligand_simple_with_dock6.csh script (provided the user has the required programs).
To build the ligand database using tldr refer to the Grid Score tutorial.
To build the ligand database using the build_ligand_simple_with_dock6.csh script, call the script with the ligand SMILES file as the argument.
>> csh ${dock6_path}/src/hdb_lig_gen/generate/build_ligand_simple_with_dock6.csh actives.smi
To generate property-matched decoys and prepare them for docking, refer to the Grid Score tutorial.
Run DOCK
In the 03.enrichment directory, get the shell script to run DOCK by running the wget command as shown to download it directly from this repository, or by copy and pasting the script into a file.
wget https://raw.githubusercontent.com/tbalius/teb_docking_test_sets/main/scripts_for_tutorial/scripts_for_2023.05.08.6GJ8_ChemGrid_Score/run_dock6_chemscore.csh
# set directories
set pwd = `pwd`
set filedir = "${pwd}/databases"
set mountdir = "${pwd}/.."
set dockfilesdir = "${mountdir}/01.dockprep/blastermaster_cof/dock6files"
set workdir = "${pwd}/6GJ8_docking_chem"
if (-e $workdir ) then
echo "$workdir exists."
exit
endif
mkdir $workdir
cd $workdir
ls -l ${filedir}/actives.smi ${filedir}/decoys.smi
# generate sdi.txt
rm -f sdi.txt
touch sdi.txt
# for ligands
foreach id (`cat ${filedir}/actives.smi | awk '{print $2}'`)
echo $id
ls ${filedir}/ligands/1/${id}*.db2.gz >> sdi.txt
end
# UNCOMMENT ONE OF THE SECTIONS BELOW
## for DUD-E decoys
#foreach id (`cat ${filedir}/decoys.smi | awk '{print $2}'`)
# echo $id
# ls ${filedir}/decoys/*/${id}*.db2.gz >> sdi.txt
#end
## for DUDE-Z (TLDR) decoys
#foreach id (`cat ${filedir}/decoys.smi | awk '{print $2}'`)
# echo $id
# ls ${filedir}/newdecoys/decoys/${id}*.db2.gz >> sdi.txt
#end
# generate dock.in
cat << EOF > dock.in
conformer_search_type HDB
num_per_search 1
skip_broken yes
hdb_db2_input_file sdi.txt
hdb_db2_search_score_threshold 100.0
use_internal_energy no
use_database_filter no
orient_ligand yes
automated_matching no
automated_matching_iteration yes
distance_tolerance 0.2
distance_minimum 0.0
nodes_minimum 4
nodes_maximum 4
receptor_site_file dockfiles/matching_spheres.sph
max_orientations 1000
critical_points no
chemical_matching no
use_ligand_spheres no
bump_filter no
score_molecules yes
contact_score_primary no
grid_score_primary no
gist_score_primary no
multigrid_score_primary no
dock3.5_score_primary yes
dock3.5_vdw_score yes
dock3.5_grd_prefix dockfiles/chem52
dock3.5_electrostatic_score yes
dock3.5_ligand_desolvation_score volume
dock3.5_solvent_occlusion_file dockfiles/ligand.desolv.heavy
dock3.5_redistribute_positive_desolvation no
dock3.5_hydrogen_desolvation_grid yes
dock3.5_hydrogen_solvent_occlusion_file dockfiles/ligand.desolv.hydrogen
dock3.5_receptor_desolvation_score yes
dock3.5_write_atomic_energy_contrib no
dock3.5_score_vdw_scale 1
dock3.5_score_es_scale 1
minimize_ligand yes
minimize_anchor no
minimize_flexible_growth no
use_advanced_simplex_parameters no
simplex_max_iterations 500
simplex_tors_premin_iterations 0
simplex_max_cycles 1
simplex_score_converge 0.1
simplex_cycle_converge 1.0
simplex_trans_step 1.0
simplex_rot_step 0.1
simplex_tors_step 10.0
simplex_random_seed 0
simplex_restraint_min no
atom_model all
vdw_defn_file ${dock6_path}/parameters/vdw_AMBER_parm94.dock3_7.defn
flex_defn_file ${dock6_path}/parameters/flex.defn
flex_drive_file ${dock6_path}/parameters/flex_drive.tbl
ligand_outfile_prefix output
write_mol_solvation no
write_orientations no
num_scored_conformers 1
score_threshold 100.0
rank_ligands no
EOF
ln -s ${dockfilesdir} dockfiles
${dock6_path}/bin/dock6 -i dock.in -o dock.out -v
This script saves a list of the ligand and decoy DB2 files in a file called sdi.txt, creates a DOCK input file called dock.in, and then runs DOCK using the DB2 files listed in sdi.txt and the parameters given in dock.in. Depending on which decoy generation method you used, uncomment the correct code block under the "generate sdi.txt" section in the script so that the decoy DB2 files are written to sdi.txt.
Run the script:
>> csh run_dock6_gridscore.csh
Enrichment calculation
Change directories to the created 6GJ8_docking_grid directory. Get the python script called make_extract_all_file_for_all_dock_out.py that reads the dock.out files by copy and pasting the script into a file.
import sys, os
def open_dock_out(dict_zinc,filename,score_txt):
fh = open(filename,'r')
name = ''
score = 100000.0
for line in fh:
splitline = line.split()
if len(splitline) < 2:
continue
if splitline[0] == "Molecule:" :
name = splitline[1]
if splitline[0] == score_txt:
score = float(splitline[1])
if name in dict_zinc:
if (dict_zinc[name]>score):
dict_zinc[name] = score
else:
dict_zinc[name] = score
print (name, score)
fh.close()
def mySortFunc(e):
return e[0]
def write_extract(dict_zinc,file1,file2):
fho = open(file1,'w')
data = []
for key in dict_zinc:
name = key
score = dict_zinc[name]
ostring = "../AB/ABaaaaaa/t00003/ABaaaaaa0025 27991 %16s 1 2702 2702 0.10 11 1 1612 1 1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 %6.2f\n"%(name, score)
fho.write(ostring)
data.append([score,name])
fho.close()
data.sort(key=mySortFunc)
fho = open(file2,'w')
for ele in data:
name = ele[1]
score = ele[0]
ostring = "../AB/ABaaaaaa/t00003/ABaaaaaa0025 27991 %16s 1 2702 2702 0.10 11 1 1612 1 1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 %6.2f\n"%(name, score)
fho.write(ostring)
fho.close()
dict_zinc1 = {}
dirpath1 = sys.argv[1]
score_txt1 = sys.argv[2]
if (score_txt1 != "Chemgrid_Score:" or score_txt1 != "Grid_Score:" or score_txt1 != "Descriptor_Score:"):
print ("score_txt1 must be Chemgrid_Score: or Grid_Score: or Descriptor_Score:")
exit
filename2 = "extract_all.uniq.txt"
filename3 = "extract_all.sort.uniq.txt"
isdir = os.path.isdir(dirpath1)
if not (isdir):
print("Error %s is not a dir"%dirpath1)
exit()
openhandel = os.popen('ls %s/dock.out'%dirpath1)
for line in openhandel:
filename1 = line.strip()
open_dock_out(dict_zinc1,filename1,score_txt1)
write_extract(dict_zinc1,filename2,filename3)
This script reads the dock.out files, extracts the names and scores of the docked molecules, and writes the names and scores to two files: extract_all.uniq.txt and extract_all.sort.uniq.txt. In both files, each line contains the name and score of a unique molecule as determined by the name. The molecule names are written to the third column and the corresponding scores are written to the last column. The second file, extract_all.sort.uniq.txt, is sorted by score, with the best scoring (lowest energy) molecule at the top, and the worst scoring (highest energy) molecule at the bottom. The script takes two arguments, the directory path containing the dock.out files and the scoring function.
Run the script:
>> python make_extract_all_file_for_all_dock_out.py . Chemgrid_Score:
Create two files containing lists of the names of the ligands and decoys by running the following commands:
>> awk '{print $2}' ../databases/actives.smi > ligands.name
>> awk '{print $2}' ../databases/decoys.smi > decoys.name
Calculate the AUC and logAUC and plot the ROC curve by calling the following scripts:
>> ${DOCKBASE}/analysis/enrich.py -i . -l ligands.name -d decoys.name
>> ${DOCKBASE}/analysis/plots.py -i . -l ligands.name -d decoys.name
[ these scripts are distributed with DOCK 3.8 ]
The example enrichment results below were generated using the 5 ligands listed above and a background of DUDE-E decoys.
The AUC and logAUC values are given in the first line of the roc_own.txt file created by enrich.py. Head the roc_own.txt file to view the AUC and logAUC.
>> head -1 roc_own.txt
#AUC 63.60 LogAUC 17.01
The roc_own.png file created by plots.py is the plot of the ROC curve. The x-axis is plotted with a log scale of base 10 so as to emphasize early enrichment.
The output_scored.mol2 file can be visualized in Chimera using ViewDock.