Roadmap - ReactionMechanismGenerator/ARC GitHub Wiki

General

  1. output citations of all relevant external code used per species/rxn (see CiteSoft Py)
  2. save S2 and T1 in the final project info file.
  3. If S2 is high, use ROHF (in that case, don't use SCF=QC in Gaussian for troubleshooting; use USE=L506 instead: http://gaussian.com/scf/)
  4. How to troubleshoot a wrong number of imaginary frequencies in a TS? Just try a different guess?
  5. Add an optional stability test to Gaussian jobs: http://gaussian.com/stable/
  6. Limit the number of times a job type is spawned per species
  7. Visualize the imaginary frequency of TSs
  8. If an opt job required troubleshooting, apply the same trsh for the fine opt as well
  9. Relax TS troubleshooting, check if the diameter is now larger
  10. Find the lowest barrier of a molecule (helpful for judging its stability and most likely degradation pathway). See Maeda's GRRM work (Adeel's mentor, Japan)
  11. Organize scheduler and Job into smaller functions with tests.
  12. Make a separate troubleshooting module, relocate to Job
  13. Transform the xyz attributes to an array form, also consider isotopes
  14. A species should have a .rmg_mol attribute which is a list with b_mol, s_mol, and mol_list, and a .rd_mol
  15. Implement an algorithm that uses the information of unsuccessful conformers (when a rotor scan finds a lower conformer) and comes up with guesses of what the global minimum could be
  16. Support force fields and semi-empirical methods in Gaussian (these shouldn't be confused with composite methods, although here, too, there's no basis set)
  17. zmat conformer improvement: zmat comparisons are probably unnecessary. Criteria for consolidating conformers: the FF energy is identical and either the dmat or the atomType dmat is identical.
  18. XTB for conformers, see CREST that does conformational searching (https://xtb-docs.readthedocs.io/en/latest/contents.html#). XTB is also implemented in Entos QM which should be free for academic use, but perhaps it's better to use the standalone version.
  19. Use joblib for parallelization: https://joblib.readthedocs.io/en/latest/
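
Item 6 above (limiting how many times a job type is spawned per species) could be sketched roughly as follows. This is only an illustration: `SpawnLimiter`, its method names, and the default limit are hypothetical and are not part of ARC.

```python
from collections import Counter


class SpawnLimiter:
    """Track how many jobs of each type were spawned per species (hypothetical helper)."""

    def __init__(self, max_per_type=5):
        # max_per_type is an assumed default, not an existing ARC setting.
        self.max_per_type = max_per_type
        self._counts = Counter()

    def can_spawn(self, species_label, job_type):
        """Return True if another job of this type may still be spawned."""
        return self._counts[(species_label, job_type)] < self.max_per_type

    def register(self, species_label, job_type):
        """Record a spawned job; raise if the limit was already reached."""
        if not self.can_spawn(species_label, job_type):
            raise RuntimeError(
                f"Limit of {self.max_per_type} '{job_type}' jobs reached for {species_label}"
            )
        self._counts[(species_label, job_type)] += 1


limiter = SpawnLimiter(max_per_type=2)
limiter.register("ethanol", "opt")
limiter.register("ethanol", "opt")
print(limiter.can_spawn("ethanol", "opt"))   # False: the limit of 2 was reached
print(limiter.can_spawn("ethanol", "freq"))  # True: a different job type
```

Keying the counter on the (species, job type) pair keeps the limit independent for each species and for each job type.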

Documentation

  1. Add tutorials (https://www.divio.com/blog/documentation/)
  2. Add ARC's logic with a workflow image of calculation order

Rotors

  1. Implement torsiondrive: https://github.com/lpwgroup/torsiondrive, https://pypi.org/project/torsiondrive/
  2. How to DETERMINE rotors for a TS w/o 2D geometry?
  3. Implement (execute) 2D rotors using Q2DTor (QChem example: https://manual.q-chem.com/5.2/Ch10.S4.html)
  4. TS's and rotor scans: when does a rotor invalidate a TS? Determine which rotors to scan in a TS. Maybe everything 1st degree in distance from an atom that has a large displacement for the imaginary freq shouldn't be a pivotal atom. Determine which rotors break the TS and invalidate them.
  5. Even if a rotor is invalidated, still change the dihedral if the initial conformer isn't the minimum.
  6. Visualize 2-D rotors: https://www.oreilly.com/library/view/python-data-science/9781491912126/ch04.html (Figure 4-34. Label contours on top of an image)
  7. Wait for all rotor scan jobs to terminate before troubleshooting rotors, so if two rotor scans find a lower conformer, the one which is lowest will be tried first (and all info will be saved for the global minimum algorithm)
  8. Rotors in Orca: https://www.youtube.com/watch?v=EKJRaC240vg
  9. Plotting 2D rotors should look like this: https://link.springer.com/article/10.1007/s00894-017-3508-4
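
The ordering described in item 7 above (wait for all scans, then try the lowest conformer first) might look like the sketch below. The dictionary keys, the status strings, and the use of energies relative to the original conformer are assumptions made for illustration, not ARC's actual data structures.

```python
def rank_lower_conformers(scan_results, original_energy):
    """
    Rank the rotor scans that found a conformer lower than the original one.

    scan_results: list of dicts with the hypothetical keys
        'pivots' (tuple of atom indices), 'status' ('running', 'done', or 'errored'),
        and 'min_energy' (lowest energy along the scan, or None if unavailable).
    Returns None while any scan is still running, so that troubleshooting is deferred;
    otherwise returns the successful scans with a lower minimum, lowest energy first.
    """
    if any(r["status"] == "running" for r in scan_results):
        return None
    lower = [
        r for r in scan_results
        if r["status"] == "done" and r["min_energy"] < original_energy
    ]
    return sorted(lower, key=lambda r: r["min_energy"])


scans = [
    {"pivots": (1, 2), "status": "done", "min_energy": -3.1},
    {"pivots": (2, 5), "status": "done", "min_energy": -7.4},
    {"pivots": (5, 6), "status": "errored", "min_energy": None},
]
ranked = rank_lower_conformers(scans, original_energy=0.0)
print([r["pivots"] for r in ranked])  # [(2, 5), (1, 2)]
```

The full ranked list is returned rather than only the best entry, so the remaining lower conformers stay available for a global-minimum search.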

Tests

  1. Processor
  2. input and submit files
  3. Functional server tests

New job order

  1. conformers FF -> opt conf level -> sp conf level (DLPNO)
  2. A wiser selection of lowest confs by energy (imipramine case)
  3. For TSs, start w/ the lowest conf selected at the sp level. If a site isn't accessible (no rate could be calculated), try another conf, ranked by energy.
  4. Ideally, we'd like all(!) confs up to a certain E_threshold for the stereo algorithm

Reactions

  1. Add dummy RMG families in ARC with templates only (for atom mapping)
  2. Reaction atom mapping: https://chemrxiv.org/articles/Unsupervised_Attention-Guided_Atom-Mapping/12298559
  3. Implement QST2 in Gaussian, DE-GSM, NEB, KinBot, Copenhagen (https://chemrxiv.org/articles/Fast_and_Automatic_Estimation_of_Transition_State_Structures_Using_Tight_Binding_Quantum_Chemical_Calculations/12600443/1)
  4. Barrierless CVTST
  5. https://onlinelibrary.wiley.com/doi/epdf/10.1002/jcc.25370
  6. QChem FSM: http://www.q-chem.com/qchem-website/manual/qchem43_manual/sect-approx_hess.html
  7. QChem IRC: http://www.q-chem.com/qchem-website/manual/qchem44_manual/sec-IRC.html
  8. Auto-generate all H-abs conformational guesses for a TS using brute force
  9. NEB: https://www.scm.com/doc/Tutorials/ADF/Transition_State_with_ASE.html, https://wiki.fysik.dtu.dk/ase/ase/neb.html, https://wiki.fysik.dtu.dk/ase/tutorials/neb/idpp.html#idpp-tutorial. ts in TeraChem calls NEB
  10. TS: check normal displacement modes (code by Colin)
  11. IRC check
  12. Clone and compile GSM automatically in ARC on a selected server, or implement pyGSM.
  13. If a species rotor scan finds a better conformer, make new TS guesses from it (keep old ones for comparison)
  14. Preserve non-reacting sites' chirality in a reaction (this might mean creating a duplicate species, since it could also be the product in a different reaction with different chirality). Important for double-ended TS guess methods. See for example: CNC1C2CCN(C1C)CC2 + HO2 <=> CNC1C2CCN([C]1C)CC2 + H2O2
  15. Add VTST, see e.g., http://akrmys.com/gpop/refmVtst.html
  16. Use the official version of AutoTST (some tests will fail)
  17. NEB GPR: https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.122.156001
  18. pyGSM: https://github.com/ZimmermanGroup/molecularGSM/wiki
  19. ASE autoNEB: https://wiki.fysik.dtu.dk/ase/dev/_modules/ase/autoneb.html
  20. NEBTERPOLATION using MD to find TSs: https://pubs.acs.org/doi/full/10.1021/acs.jctc.5b00830?src=recsys
  21. https://chemrxiv.org/articles/preprint/Fast_and_Automatic_Estimation_of_Transition_State_Structures_Using_Tight_Binding_Quantum_Chemical_Calculations/12600443
  22. (growing string method [10.1021/ct400319w, 10.1063/1.4804162], nudged elastic band [10.1063/1.1329672, 10.1063/1.1323224], synchronous transit and quasi-Newton methods [https://onlinelibrary.wiley.com/doi/epdf/10.1002/ijch.199300051], KinBot [10.1016/j.cpc.2019.106947], AutoTST [10.1021/acs.jpca.7b07361, 10.26434/chemrxiv.13277870.v2], tight binding reaction path [10.26434/chemrxiv.12600443.v1], freezing string method [10.1021/acs.jctc.5b00407, 10.1063/1.3664901], and deep learning [10.26434/chemrxiv.12302084.v2])
  23. Intersystem crossing: https://aip.scitation.org/doi/10.1063/1.4936864
  24. nebterpolation: https://pubs.acs.org/doi/10.1021/acs.jctc.5b00830
  25. GANs: https://aip.scitation.org/doi/full/10.1063/5.0055094
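
Double-ended methods such as NEB (items 3 and 9 above) start from a chain of images between the reactant and product geometries. The simplest initial chain is a linear interpolation of Cartesian coordinates, sketched below in plain Python. The function name is hypothetical, and the sketch assumes that both geometries list their atoms in the same order, which is why atom mapping (items 1 and 2) matters here.

```python
def interpolate_images(reactant, product, n_images=5):
    """
    Linearly interpolate Cartesian coordinates between two geometries.

    reactant, product: lists of (x, y, z) tuples with atoms in the same order (assumed).
    Returns n_images intermediate geometries; the two end points are not included.
    """
    if len(reactant) != len(product):
        raise ValueError("Both geometries must have the same number of atoms")
    images = []
    for k in range(1, n_images + 1):
        t = k / (n_images + 1)  # fraction of the way from reactant to product
        images.append([
            tuple(r + t * (p - r) for r, p in zip(atom_r, atom_p))
            for atom_r, atom_p in zip(reactant, product)
        ])
    return images


# A two-atom toy example: the second atom moves from x = 1.0 to x = 3.0
start = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
end = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
print(interpolate_images(start, end, n_images=1))
```

Linear interpolation can produce unphysical images (for example, atoms passing too close to each other); the IDPP tutorial linked in item 9 addresses that issue.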

ESS

  1. MRCI (Automated Selection of Active Orbital Spaces: https://pubs.acs.org/doi/10.1021/acs.jctc.6b00156)
  2. check and report spin contamination (in molpro, this is in the output.log file(!) as "Spin contamination <S2-Sz2-Sz> 0.00000000")
  3. check and report T1 and leading C1 coefficient if MRCI is run
  4. Add Orca, TeraChem, psi4, Torsion-drive
  5. Implement Mark Payne's suggestion of first running freq and reading the Hess before optimizing a TS in QChem
  6. Certain functionals in QChem don't have parallelized Hessians, so freq jobs are slow. We might be able to speed them up by setting IDERIV to 1 (instead of 2)
  7. CBS extrapolation, user script, eg here
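
Reading the spin contamination from a Molpro log (item 2 above) could be done with a regular expression along these lines. The pattern follows the line quoted in item 2; accepting `**` exponent markers as well is an assumption about how the raw log may spell the same label, and the function name is hypothetical.

```python
import re

# Matches the label as quoted above; the optional '**' is an assumed variant.
SPIN_CONTAMINATION_RE = re.compile(
    r"Spin contamination\s+<S(?:\*\*)?2-Sz(?:\*\*)?2-Sz>\s+(-?\d+\.\d+)"
)


def parse_spin_contamination(log_text):
    """Return the last spin contamination value reported in log_text, or None if absent."""
    matches = SPIN_CONTAMINATION_RE.findall(log_text)
    return float(matches[-1]) if matches else None


example = " Spin contamination <S2-Sz2-Sz>     0.00000000\n"
print(parse_spin_contamination(example))  # 0.0
```

Taking the last match is a design choice for logs that report the value more than once, on the assumption that the final occurrence belongs to the converged result.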

Servers

  1. trsh rmg server
  2. job arrays (Colin/Brian)
  3. How to use Job Arrays: https://arc-ts.umich.edu/software/torque/job-arrays/
  4. Solve Eqw status on server: qmod -cj jobid if Eqw is detected (first wait 30 sec)
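
The Eqw handling in item 4 above might be sketched as follows. The function name and the `get_status` callable are hypothetical, re-checking the status after the wait is an added assumption, and running the command locally through `subprocess` stands in for however commands actually reach the server.

```python
import subprocess
import time


def clear_eqw(job_id, get_status, wait_seconds=30, run=subprocess.run, sleep=time.sleep):
    """
    Clear the error state of a job that is stuck in Eqw.

    get_status: hypothetical callable that returns the scheduler state string of job_id.
    run, sleep: injectable so the logic can be exercised without a real scheduler.
    Returns the command that was issued, or None if no action was needed.
    """
    if get_status(job_id) != "Eqw":
        return None
    sleep(wait_seconds)  # first wait, as noted in the item above
    if get_status(job_id) != "Eqw":
        return None  # the job left Eqw while waiting (assumed re-check)
    command = ["qmod", "-cj", str(job_id)]
    run(command, check=True)
    return command


# Dry run with stand-ins, so nothing is sent to a real scheduler
issued = clear_eqw(
    "12345",
    get_status=lambda _job_id: "Eqw",
    run=lambda cmd, check: print("would run:", " ".join(cmd)),
    sleep=lambda _seconds: None,
)
```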

Database

  1. mongodb (https://github.com/PACChem/QTC/blob/master/qtc/dbtools.py)
  2. see https://pubs.acs.org/doi/abs/10.1021/acs.jpca.5b05448, https://pubs.acs.org/doi/abs/10.1021/jp511403a
  3. GDB: http://gdb.unibe.ch/downloads/, https://datarepository.wolframcloud.com/resources/GDB9-Database
  4. BigQuery, https://en.wikipedia.org/wiki/BigQuery
  5. pandas can be used for JSON and SQL: https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html
  6. HDF5, https://www.hdfgroup.org/solutions/hdf5

Nice-to-have

  1. Output a reaction pathway (E0's) from Arkane
  2. Software Management Plan (https://github.com/softwaresaved/software-management-plans)
  3. NetworkX: https://networkx.github.io/
  4. pygal: http://www.pygal.org/en/stable/
  5. Add binders once psi4 is integrated
  6. Consider MultirefPredict, https://github.com/hjkgrp/MultirefPredict
  7. Train Force Fields, see https://github.com/andrewkleinschmidt/AROMODEL
  8. Analyze IRC: https://github.com/xiaoruiDong/RDMC/blob/main/ipython/Analyze%20scan%26irc%20jobs.ipynb
  9. Support HEAT: https://aip.scitation.org/doi/full/10.1063/1.5095937
  10. Additional method for atom mapping: DockRMSD - https://doi.org/10.1186/s13321-019-0362-7