Scoping through TPS Corpus:
-
**Date 2/8/2021**
I queried https://europepmc.org/ with the following searches and got these results:
| Query | Number of hits |
| --- | --- |
| terpene synthase | 4308 |
| terpene synthase plant | 3447 |
| terpene synthase plant volatile | 1200 |
| terpene synthase plant TPS | 650 |
| terpene synthase TPS plant volatile | 376 |
| terpene synthase TPS plant volatile compounds | 355 (312 research articles) |

Of the 312 research articles, only 188 mention both TPS and compounds.
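For reproducibility, here is a minimal sketch that fetches these hit counts from the public Europe PMC REST search API (my own script, not part of the original workflow):

```python
# Sketch: fetch Europe PMC hit counts for the scoping queries above.
import requests

QUERIES = [
    "terpene synthase",
    "terpene synthase plant",
    "terpene synthase plant volatile",
    "terpene synthase plant TPS",
    "terpene synthase TPS plant volatile",
    "terpene synthase TPS plant volatile compounds",
]

URL = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"

for query in QUERIES:
    resp = requests.get(URL, params={"query": query, "format": "json", "pageSize": 1})
    resp.raise_for_status()
    # hitCount is the total number of records matching the query
    print(f"{query}: {resp.json()['hitCount']}")
```
-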
I continued building the TPS corpus on 3/8/2021, 4/8/2021 and 5/8/2021.
For the 312 papers, I checked the availability of PMCID, plant name, compound name and TPS nomenclature.
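A sketch of how such a checklist could be tabulated automatically (the one-folder-per-PMCID layout with fulltext.xml is the pygetpapers convention used later in this log; the keyword tests are crude stand-ins for the manual inspection):

```python
# Sketch: per-paper availability checklist written as CSV; crude keyword
# checks stand in for the manual PMCID/plant/compound/TPS inspection.
import csv
from pathlib import Path

with open("tps_checklist.csv", "w", newline="") as fh:
    writer = csv.writer(fh)
    writer.writerow(["pmcid", "mentions_TPS", "mentions_volatile"])
    for xml_file in Path("TPS").glob("*/fulltext.xml"):
        text = xml_file.read_text(encoding="utf-8", errors="ignore")
        # directory names in the pygetpapers output are the PMCIDs
        writer.writerow([xml_file.parent.name, "TPS" in text, "volatile" in text])
```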
-
**Date 5/8/2021**
Please find the TPS corpus of 312 papers.
-
**Date 6/8/2021** I continued improving the scoping of the TPS corpus.
-
**Date 9/8/2021**
Please find the improved TPS corpus of 91 papers.
-
**Date 10/8/2021 and 11/8/2021**
Continued improving the scoping of the TPS corpus and worked on the INYAS presentation slides.
Out of the 312 papers, only 188 mention both TPS and volatile compounds.
-
**Date 16/8/2021**
- Arabidopsis
- Camellia sinensis
- Cinnamomum
- Citrus
- Lavandula
- Nicotiana
- Solanum
- Vitis vinifera
-
INYAS Interns:
- TPS genes for different species.
- Develop a corpus for "terpene synthase oryza".
- Extract terms from the papers.
- Create a dictionary and test it.
- Prenyltransferases from medicinal plants.
- Classify the TPSs for each subspecies.
- Check whether AtTPS1 is related to OsTPS1 or a similar gene in the oryza corpus.
-
**Date 18/8/2021**
Please find the TPS volatile corpus of 121 papers.
-
**Date 19/8/2021** Created a template for the 5 KARYA projects: https://github.com/petermr/CEVOpen/wiki/crop5
-
**Date 23/8/2021** I created the testtps dictionary.
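For reference, a minimal sketch of writing an ami-style dictionary with lxml (the terms are illustrative placeholders, and the exact schema ami3/py4ami expects may carry more attributes):

```python
# Sketch: build a minimal ami-style XML dictionary of terpene terms.
from lxml import etree

root = etree.Element("dictionary", title="testtps")
for term in ["limonene", "linalool", "beta-caryophyllene"]:  # placeholders
    etree.SubElement(root, "entry", term=term, name=term)

etree.ElementTree(root).write(
    "testtps.xml", pretty_print=True, xml_declaration=True, encoding="UTF-8"
)
```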
-
**Date 24/8/2021, 25/8/2021 and 26/8/2021** Extracting volatile compounds from the 121 papers (point 9).
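A sketch of dictionary-driven extraction over the corpus (the folder name `TPS_volatile` and the dictionary file are my assumptions; the actual run used Shweata's script):

```python
# Sketch: print sentences that mention any term from the testtps dictionary.
from pathlib import Path
from lxml import etree

# load terms from the ami-style dictionary sketched above
terms = [e.get("term") for e in etree.parse("testtps.xml").getroot().iter("entry")]

for xml_file in Path("TPS_volatile").glob("*/fulltext.xml"):
    text = " ".join(etree.parse(str(xml_file)).getroot().itertext())
    for sentence in text.split("."):  # naive "." split, as used below for TPS sentences
        if any(t in sentence for t in terms):
            print(xml_file.parent.name, sentence.strip()[:100])
```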
-
**Date 27/8/2021, 30/8/2021**
-
**Date 31/8/2021, 1/9/2021** Helping the KARYA interns with the installation of pygetpapers and ami3. 2/9/2021: meeting. 3/9/2021: Helping the NIPGR intern with the same.
-
**Date 6/9/2021** Installing software for https://github.com/petermr/crops/tree/main/metadata_analysis and setting up a virtual environment.
Create the environment: `python -m venv project_env`
Activate it: `project_env\Scripts\activate.bat`
Activation gave a warning: "This Python interpreter is in a conda environment, but the environment has not been activated. Libraries may fail to load. To activate this environment, run the command above."
Installed Anaconda, then ran `C:\Users\user\anaconda3\Scripts\activate base` and `pip install scispacy`.
Copied requirements.txt into the sagar jadhav folder.
Used conda to install and manage different versions of Python:
`conda create --name project_env python=3.6.0`
`conda activate project_env`
-
Installed Python 3.6 and PyCharm. The metadata analysis script runs, but ami3 is not installed on my Mac.
-
Installed ami3 on my Mac and set the path.
-
Finding species that are highly represented in the literature:
`pygetpapers -q "terpene synthase TPS plant" -o TPS -p -k 650`
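To quantify this, a sketch that counts in how many downloaded papers each candidate species appears (it assumes the corpus was also fetched as XML, e.g. with `-x`; the species list is the one from 16/8/2021):

```python
# Sketch: count how many papers in the TPS corpus mention each species.
from collections import Counter
from pathlib import Path

SPECIES = ["Arabidopsis", "Camellia sinensis", "Cinnamomum", "Citrus",
           "Lavandula", "Nicotiana", "Solanum", "Vitis vinifera"]

counts = Counter()
for xml_file in Path("TPS").glob("*/fulltext.xml"):
    text = xml_file.read_text(encoding="utf-8", errors="ignore")
    for species in SPECIES:
        if species in text:
            counts[species] += 1  # papers mentioning the species, not total mentions

for species, n in counts.most_common():
    print(f"{species}: {n} papers")
```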
In order to run the METADATA ANALYSIS script by Shweata, I followed this protocol: create a folder, open the folder in PyCharm, and either run the commands below or click on **Add Interpreter**, then click on Conda Environment, select Python 3.6 and select the conda path. Run the commands:
`conda create --name project_env python=3.6.0`
`conda activate project_env`
-
Ran the metadata analysis script. It first downloaded the papers (`pygetpapers -q "terpene synthase TPS plant" -o TPS -p -k 620`) and then raised an "lxml not installed" error, so I ran `pip install lxml`. I then commented out the download step to avoid downloading the papers again. Instead of Citrus, I added TPS, and I also uncommented lines 164, 165 and 166.
-
Please find the TPS metadata analysis output.
-
Extract "TPS conatining sentences": I used . (dot) in line 144 Shweata script. [words = text.split(".")] and also removed line 175.
-
Please find the TPS sentences extraction output.
-
Created the TPS pathway dictionaries (TPS pathway, TPSpathway).
-
Created a dictionary for abbreviations of binomial nomenclature (abbreviation binomial).
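For illustration, a tiny sketch of the abbreviation convention such a dictionary captures (e.g. "Arabidopsis thaliana" → "A. thaliana"); the function name is mine:

```python
# Sketch: derive the conventional abbreviation of a binomial name.
def abbreviate_binomial(name: str) -> str:
    genus, species = name.split()[:2]
    return f"{genus[0]}. {species}"

for binomial in ["Arabidopsis thaliana", "Camellia sinensis", "Vitis vinifera"]:
    print(binomial, "->", abbreviate_binomial(binomial))
```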
-
Git cloned pyami. Set the path by running `open -e .bash_profile` and then adding the following:
`export P2_HOME=/Users/sagar/pyami`
`export PATH=$PATH:$P2_HOME/py4ami`
-
Installed PyCharm and created the folder valdict in pyami. Added an interpreter (conda environment, Python 3.8, selected the conda path). Ran the commands:
`conda create --name project_env python=3.8.0`
`conda activate project_env`
Saved, closed and reopened the folder, then ran `pip install pytest`. Running test_pyamidict.py gave an lxml error, so `pip install lxml`. Running test_pyamidict.py again gave a "py4ami module not found" error, so `pip install py4ami`. It then gave the error `ImportError: cannot import name 'AMIDict' from 'py4ami.dict_lib'`.
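The ImportError suggests a mismatch between the PyPI py4ami release and the cloned sources; a quick probe of what the installed module actually exposes (a debugging sketch, not a fix):

```python
# Sketch: check which names py4ami.dict_lib actually provides.
try:
    from py4ami.dict_lib import AMIDict  # name expected by test_pyamidict.py
    print("AMIDict is available")
except ImportError:
    import py4ami.dict_lib as dict_lib
    print("Available names:", [n for n in dir(dict_lib) if not n.startswith("_")])
```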
**Date 18/11/2021** Documentation of the crops repository.
**Date 19/11/2021** Uploading corpora to the crops repository:
1. Move the file (or folder) you'd like to upload to GitHub into the local directory that was created when you cloned the repository.
2. Open Terminal.
3. Change the current working directory to your local repository: `cd crops`
4. Stage the file for commit to your local repository: `git add .`
5. Commit the file that you've staged in your local repository: `git commit -m "Add existing file"`
6. Push the changes in your local repository to GitHub.com: `git push`
7. Give your username. Generate a token by going to Settings, then Developer settings; copy the token and paste it in place of the password.