Search_lib Testing - Radhu903/Git_Learning- GitHub Wiki

Tester1: Radhu Ladani

OS: Windows 10

Date: 19th April 2021

search_lib.py

  • Download PAPERS into PROJECTS, find SECTIONS, index them with (DICTIONARIES) into a searchable KNOWLEDGEBASE, and analyse for new INSIGHTS.

    • The scheme is:
search SECTIONS in PROJECTS with (DICTIONARIES and/or PATTERNS) with (DISPLAY and/or ANALYSIS) options
  • General code syntax: python search_lib.py --dict [DICT ...] --sect [SECT ...] --proj [PROJ ...]
    • --dict selects the dictionaries to search with. The core dictionaries are ['activity', 'country', 'disease', 'compound', 'plant', 'plant_genus', 'organization', 'plant_compound', 'plant_part', 'invasive_plant']

    • --sect restricts the search to the sections of interest, chosen from ['abstract', 'acknowledge', 'affiliation', 'author', 'background', 'discussion', 'empty', 'ethics', 'fig_caption', 'front', 'introduction', 'jrnl_title', 'keyword', 'method', 'material', 'octree', 'pdfimage', 'pub_date', 'publisher', 'reference', 'results', 'results', 'sections', 'svg', 'table', 'title', 'word']

    • --proj selects the corpora / projects to search. The hardcoded ones include ['liion10', 'ffml20', 'oil26', 'oil186', 'cct', 'disease', 'diffprot', 'worc_synth', 'worc_explosion', 'activity', 'hydrodistil', 'invasive', 'plantpart']
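
The general syntax above can be assembled programmatically. The following is a minimal sketch, not part of search_lib.py: the helper name build_query is hypothetical, the name lists are copied from this page (assuming 'diffprot' and 'worc_synth' are two separate projects), and the case-insensitive check is an assumption made so that section names written as METHOD are accepted.

```python
# Names copied from the lists on this page; build_query is a hypothetical helper.
CORE_DICTIONARIES = ["activity", "country", "disease", "compound", "plant",
                     "plant_genus", "organization", "plant_compound",
                     "plant_part", "invasive_plant"]
SECTIONS = ["abstract", "acknowledge", "affiliation", "author", "background",
            "discussion", "empty", "ethics", "fig_caption", "front",
            "introduction", "jrnl_title", "keyword", "method", "material",
            "octree", "pdfimage", "pub_date", "publisher", "reference",
            "results", "sections", "svg", "table", "title", "word"]
# Assumption: 'diffprot' and 'worc_synth' are separate project names.
PROJECTS = ["liion10", "ffml20", "oil26", "oil186", "cct", "disease",
            "diffprot", "worc_synth", "worc_explosion", "activity",
            "hydrodistil", "invasive", "plantpart"]


def build_query(dicts, sects, projs):
    """Return an argv-style list for search_lib.py, rejecting names not listed above."""
    checks = (("dictionary", dicts, CORE_DICTIONARIES),
              ("section", sects, SECTIONS),
              ("project", projs, PROJECTS))
    for label, chosen, allowed in checks:
        # Case-insensitive comparison is an assumption of this sketch.
        unknown = [name for name in chosen if name.lower() not in allowed]
        if unknown:
            raise ValueError(f"unknown {label}: {unknown}")
    return ["python", "search_lib.py",
            "--dict", *dicts, "--sect", *sects, "--proj", *projs]


if __name__ == "__main__":
    print(" ".join(build_query(["activity"], ["introduction", "method"], ["activity"])))
```

The returned list can be passed to subprocess.run as is. The check only covers the names listed on this page, so it would need relaxing if search_lib.py accepts other dictionaries or projects.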

Running search_lib (pyami) from the command line

  • Open a command prompt in the directory containing search_lib.py, in my case: C:\Users\DELL\Radhu\openDiagram\physchem\python.
  • Run python search_lib.py --help
C:\Users\DELL\openDiagram\physchem\python>python search_lib.py --help
running search main
usage: search_lib.py [-h] [-d [DICT ...]] [-s [SECT ...]] [-p [PROJ ...]] [--patt PATT [PATT ...]] [--demo [DEMO ...]] [-l LOGLEVEL] [--plot] [--nosearch] [--maxbars [MAXBARS]]
                     [--languages LANGUAGES [LANGUAGES ...]] [--debug DEBUG [DEBUG ...]]

Search sections with dictionaries and patterns

optional arguments:
  -h, --help            show this help message and exit
  -d [DICT ...], --dict [DICT ...]
                        dictionaries to search with, empty gives list
  -s [SECT ...], --sect [SECT ...]
                        sections to search; empty gives all
  -p [PROJ ...], --proj [PROJ ...]
                        projects to search; empty will give list
  --patt PATT [PATT ...]
                        patterns to search with; regex may need quoting
  --demo [DEMO ...]     simple demos (NYI). empty gives list. May need downloading corpora
  -l LOGLEVEL, --loglevel LOGLEVEL
                        debug level (NYI)
  --plot                plot params (NYI)
  --nosearch            search (NYI)
  --maxbars [MAXBARS]   max bars on plot (NYI)
  --languages LANGUAGES [LANGUAGES ...]
                        languages (NYI)
  --debug DEBUG [DEBUG ...]
                        debugging commands , numbers, (not formalised)
  • Run python search_lib.py --demo
    • With no demo name given, it reports "no demo given" and lists the available demos: 'diffprot', 'ethics', 'fig_caption', 'invasive', 'luke', 'matthew', 'plant', 'worcester', 'word'
C:\Users\DELL\openDiagram\physchem\python>python search_lib.py --demo
running search main
args Namespace(dict=None, sect=None, proj=None, patt=None, demo=[], loglevel='foo', plot=True, nosearch=False, maxbars=25, languages=['en'], debug=None)
cmd sys.argv ['search_lib.py', '--demo']
interpreted from cmd arg.demo []
DEMOS
RUN DEMOS: []
no demo given, choose from  dict_keys(['diffprot', 'ethics', 'fig_caption', 'invasive', 'luke', 'matthew', 'plant', 'worcester', 'word'])
END DEMO
finished search
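
The help text and the Namespace printed by the --demo run are enough to sketch a parser with the same interface. This is a reconstruction from the output above, not the actual search_lib.py source: the function name build_parser is hypothetical, and the action used for --plot and the int type for --maxbars are guesses chosen to match the defaults shown in the log (plot=True, maxbars=25).

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the --help output and defaults logged above."""
    parser = argparse.ArgumentParser(
        prog="search_lib.py",
        description="Search sections with dictionaries and patterns")
    # nargs="*" means a bare flag yields an empty list and an absent flag yields None.
    parser.add_argument("-d", "--dict", nargs="*",
                        help="dictionaries to search with, empty gives list")
    parser.add_argument("-s", "--sect", nargs="*",
                        help="sections to search; empty gives all")
    parser.add_argument("-p", "--proj", nargs="*",
                        help="projects to search; empty will give list")
    parser.add_argument("--patt", nargs="+",
                        help="patterns to search with; regex may need quoting")
    parser.add_argument("--demo", nargs="*",
                        help="simple demos (NYI). empty gives list")
    parser.add_argument("-l", "--loglevel", default="foo", help="debug level (NYI)")
    # Guess: the log shows plot=True without --plot being passed.
    parser.add_argument("--plot", action="store_true", default=True,
                        help="plot params (NYI)")
    parser.add_argument("--nosearch", action="store_true", help="search (NYI)")
    # Guess: integer type, since the logged default is 25.
    parser.add_argument("--maxbars", nargs="?", type=int, default=25,
                        help="max bars on plot (NYI)")
    parser.add_argument("--languages", nargs="+", default=["en"],
                        help="languages (NYI)")
    parser.add_argument("--debug", nargs="+",
                        help="debugging commands, numbers, (not formalised)")
    return parser


if __name__ == "__main__":
    # Same arguments as the logged run: python search_lib.py --demo
    print(build_parser().parse_args(["--demo"]))
```

Running it prints a Namespace that can be compared field by field with the "args Namespace(...)" line in the log. In particular, demo=[] explains the "no demo given" message: the flag was present but no demo name followed it.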

Example queries

  1. Searching the activity dictionary in the activity corpus for sections introduction and method (PATH: https://github.com/petermr/CEVOpen/tree/master/minicorpora/activity)

    Query code: python search_lib.py --dict activity --sect introduction method --proj activity
    Result image: Activity

  2. Searching the activity dictionary in the oil186 corpus for section introduction (PATH: https://github.com/petermr/CEVOpen/tree/master/searches/oil186)

    Query code: python search_lib.py --dict activity --sect introduction --proj oil186
    Result image: activity(oil186)

  3. Searching the activity dictionary in both the oil186 and activity corpora for section METHOD

    Query code: python search_lib.py --dict activity --sect METHOD --proj oil186 activity
    Result images: oil186(METHOD), activity_METHOD

  4. Searching the activity dictionary in the plantpart corpus for sections introduction and METHOD (PATH: https://github.com/petermr/CEVOpen/tree/master/minicorpora/plantpart)

    Query code: python search_lib.py --dict activity --sect introduction METHOD --proj plantpart
    Result images: activity(plant_part_intro), activity(plant_part_method)

  5. Searching the plant_part dictionary in the plantpart corpus for sections introduction and METHOD (PATH: https://github.com/petermr/CEVOpen/tree/master/minicorpora/plantpart)

    Query code: python search_lib.py --dict plant_part --sect introduction METHOD --proj plantpart
    Result images: Plantpart(INTRO), Plantpart(METHOD)

  6. Searching the plant_compound dictionary in the oil186 corpus for sections introduction and METHOD (PATH: https://github.com/petermr/CEVOpen/tree/master/searches/oil186)

    Query code: python search_lib.py --dict plant_compound --sect introduction METHOD --proj oil186
    Result images: Plantcompound(INTRO), Plantcompound(METHOD)

  7. Searching the invasive_plant dictionary in the invasive corpus for section METHOD (PATH: https://github.com/petermr/CEVOpen/tree/master/minicorpora/invasive)

    Query code: python search_lib.py --dict invasive_plant --sect METHOD --proj invasive
    Result image: Invasive(METHOD)

  8. Running search_lib with --demo luke

    Query code: python search_lib.py --demo luke
    Result images: luke1, luke2, luke3, luke4
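
To repeat the whole test session in one go, the eight queries can be driven from a small script. This is a sketch under stated assumptions: the names EXAMPLE_QUERIES, build_commands and run_all are hypothetical, the argument lists are copied from the queries above with the trailing result labels left out, and the script is assumed to be started from the directory that contains search_lib.py. It defaults to a dry run that only prints the commands.

```python
import subprocess
import sys

# Argument lists copied from example queries 1 to 8 (result labels omitted).
EXAMPLE_QUERIES = [
    ["--dict", "activity", "--sect", "introduction", "method", "--proj", "activity"],
    ["--dict", "activity", "--sect", "introduction", "--proj", "oil186"],
    ["--dict", "activity", "--sect", "METHOD", "--proj", "oil186", "activity"],
    ["--dict", "activity", "--sect", "introduction", "METHOD", "--proj", "plantpart"],
    ["--dict", "plant_part", "--sect", "introduction", "METHOD", "--proj", "plantpart"],
    ["--dict", "plant_compound", "--sect", "introduction", "METHOD", "--proj", "oil186"],
    ["--dict", "invasive_plant", "--sect", "METHOD", "--proj", "invasive"],
    ["--demo", "luke"],
]


def build_commands(script="search_lib.py", python=sys.executable):
    """Return one full command (as an argument list) per example query."""
    return [[python, script, *query] for query in EXAMPLE_QUERIES]


def run_all(dry_run=True):
    """Print each command; when dry_run is False, also run it and collect exit codes."""
    results = []
    for command in build_commands():
        print(" ".join(command))
        if dry_run:
            continue
        completed = subprocess.run(command, check=False)
        results.append((command, completed.returncode))
    return results


if __name__ == "__main__":
    run_all(dry_run=True)
```

Calling run_all(dry_run=False) executes the queries one after another and returns each command with its exit code, which gives a quick pass/fail summary for a test log like this one.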