Dictionary eoActivity - petermr/CEVOpen GitHub Wiki

Owner : Radhu Ladani

Dictionary : Activity

Dictionary link

Find the dictionary Activity here

Overview:

Creating Dictionary using SPARQL query For that I collect wikidata IDs from the existing dictionary of activity, which is a dictionary of 438 essential oil or constituent compound biochemical and/or biological activities, 340 of which are resolved to wikidata IDs, and 336 with descriptions of 250 characters or less, created by Emanuel Faria. https://github.com/petermr/CEVOpen/tree/master/dictionary/eoActivity/eo_activity/emanuel

Procedure

  1. Go to https://www.wikidata.org/wiki/Wikidata:Main_Page and click on 'Query Service' at the left column. This will redirect you to Wikidata Query Service page where you can create your SPARQL query.
  2. Run SPARQL query using following command:
## Selecting the prefered label
## Selecting the prefered label
SELECT * WHERE {
  VALUES ?item {
    wd:Q1069606 wd:Q11905748 wd:Q1225289 wd:Q12529398 wd:Q131207 wd:Q131656 wd:Q131746 wd:Q133948 wd:Q1340459 wd:Q1349821 wd:Q1384342 wd:Q1423889 wd:Q1468324 wd:Q14862699 wd:Q1509074
    wd:Q1517948 wd:Q1536078 wd:Q1642182 wd:Q1660194 wd:Q166774 wd:Q167377 wd:Q16909071 wd:Q1696730 wd:Q17100700 wd:Q173235 wd:Q1734091 wd:Q178266 wd:Q181322 wd:Q18349602 wd:Q18356742
    wd:Q18388377 wd:Q18663259 wd:Q186752 wd:Q187661 wd:Q187689 wd:Q190012 wd:Q190334 wd:Q1926141 wd:Q1930829 wd:Q193237 wd:Q1941660 wd:Q194270 wd:Q1976211 wd:Q1981368 wd:Q200656 wd:Q206348
    wd:Q2088972 wd:Q209717 wd:Q21045470 wd:Q211036 wd:Q21139980 wd:Q211420 wd:Q2142251 wd:Q215118 wd:Q217972 wd:Q223417 wd:Q2255024 wd:Q246181 wd:Q249619 wd:Q2592323 wd:Q2602077
    wd:Q26697606 wd:Q274083 wd:Q2742649 wd:Q274493 wd:Q2853144 wd:Q2853342 wd:Q2853347 wd:Q288280 wd:Q3009547 wd:Q309035 wd:Q324089 wd:Q3410946 wd:Q3411675 wd:Q3427345 wd:Q3446580 
    wd:Q3482889 wd:Q352551 wd:Q3560867 wd:Q357896 wd:Q3705665 wd:Q377458 wd:Q3774852 wd:Q3774857 wd:Q378705 wd:Q3817359 wd:Q3922210 wd:Q4008956 wd:Q407752 wd:Q416014 wd:Q41602594 
    wd:Q421700 wd:Q421978 wd:Q430719 wd:Q445580 wd:Q4669896 wd:Q4713968 wd:Q4742080 wd:Q4803792 wd:Q486061 wd:Q4990531 wd:Q50377176 wd:Q50415114 wd:Q50429885 wd:Q50430113 wd:Q50430144
    wd:Q50430264 wd:Q50430265 wd:Q5119340 wd:Q513122 wd:Q5166679 wd:Q521616 wd:Q522817 wd:Q56062995 wd:Q567709 wd:Q572294 wd:Q575062 wd:Q575136 wd:Q575222 wd:Q575890 wd:Q576618 wd:Q578726
    wd:Q581102 wd:Q581996 wd:Q582559 wd:Q582687 wd:Q584209 wd:Q5958197 wd:Q608085 wd:Q62903 wd:Q62962 wd:Q66295 wd:Q676558 wd:Q68541106 wd:Q6881918 wd:Q7187720 wd:Q721432 wd:Q7250944
    wd:Q7251487 wd:Q73984 wd:Q7431537 wd:Q745130 wd:Q76797715 wd:Q7833952 wd:Q7902662 wd:Q827658 wd:Q846227 wd:Q847705 wd:Q84944531 wd:Q84951095 wd:Q84953056 wd:Q84953547 wd:Q84953576
    wd:Q84953633 wd:Q84953651 wd:Q84954230 wd:Q84954685 wd:Q84955111 wd:Q84955132 wd:Q84955175 wd:Q84956389 wd:Q84956474 wd:Q84956492 wd:Q84956495 wd:Q84956500 wd:Q84956514 wd:Q84956686
    wd:Q84956852 wd:Q84956887 wd:Q84957317 wd:Q84957398 wd:Q84957440 wd:Q84957471 wd:Q84957488 wd:Q84957489 wd:Q84957495 wd:Q84957504 wd:Q84957506 wd:Q84957510 wd:Q84957514 wd:Q84957515 wd:Q84958628
    wd:Q84958741 wd:Q84958793 wd:Q84959117 wd:Q84959304 wd:Q84959377 wd:Q84959751 wd:Q84959790 wd:Q84960246 wd:Q84960335 wd:Q84960524 wd:Q84961334 wd:Q84961500 wd:Q84961820 wd:Q84961856
    wd:Q84961940 wd:Q84962003 wd:Q84962361 wd:Q84962587 wd:Q84962840 wd:Q84962992 wd:Q84963984 wd:Q84997245 wd:Q84997315 wd:Q84997332 wd:Q84997335 wd:Q84997870 wd:Q84998040 wd:Q84998042 
    wd:Q84998043 wd:Q84998051 wd:Q84998059 wd:Q84998172 wd:Q84998248 wd:Q84998654 wd:Q84999146 wd:Q84999154 wd:Q85001503 wd:Q85001558 wd:Q85001732 wd:Q85001844 wd:Q85001852 wd:Q85001855
    wd:Q85001861 wd:Q85002068 wd:Q85002288 wd:Q85002611 wd:Q85002666 wd:Q85002964 wd:Q85003091 wd:Q85003128 wd:Q85003208 wd:Q85003209 wd:Q85003234 wd:Q85003391 wd:Q886593 wd:Q901434
    wd:Q901537 wd:Q901656 wd:Q905101 wd:Q905648 wd:Q910391 wd:Q911854 wd:Q911922 wd:Q927234 wd:Q93978 wd:Q955332         
  }
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en".
    ?item rdfs:label ?itemLabel;
      skos:altLabel ?itemAltLabel;
      schema:description ?itemDescription.
  }
   SERVICE wikibase:label {
    bd:serviceParam wikibase:language "hi".
    ?item skos:altLabel ?hindialtlabel;
      rdfs:label ?hindiLabel;
      schema:description ?hindi.
  }
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "ta".
    ?item skos:altLabel ?tamilaltlabel;
      rdfs:label ?tamilLabel;
      schema:description ?tamil.
  }
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "es".
    ?item skos:altLabel ?esaltlabel;
      rdfs:label ?esLabel;
      schema:description ?es.
  }
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "fr".
    ?item skos:altLabel ?fraltlabel;
      rdfs:label ?frLabel;
      schema:description ?fr.
  }
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "de".
    ?item skos:altLabel ?dealtlabel;
      rdfs:label ?deLabel;
      schema:description ?de.
  }
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "zh".
    ?item skos:altLabel ?zhaltlabel;
      rdfs:label ?zhLabel;
      schema:description ?zh.
  }
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "ur".
    ?item skos:altLabel ?uraltlabel;
      rdfs:label ?urLabel;
      schema:description ?ur.
  }
  OPTIONAL { ?wikipedia schema:about ?item; schema:isPartOf <https://en.wikipedia.org/> }
  
}

  1. Once you get the results, Click on 'Link' and then "SPARQL endpoint' . This will let you download your SPARQL file.

  2. Open the SPARQL results in notepad.

  3. Use amidict for SPARQL mapping. Command for SPARQL mapping as follows:

amidict -vv --dictionary Activity --directory Activity --input sparql create --informat wikisparqlxml --sparqlmap wikidataURL=item,wikipediaPage=wikipedia,name=itemLabel,term=itemLabel,Description=itemDescription,Hindi=hindiLabel,Hindi_description=hindi,Hindi_altLabel=hindialtLabel,Tamil=tamilLabel,Tamil_description=tamil,Tamil_altLabel=tamilaltLabel,Spanish=esLabel,Spanish_description=es,Spanish_altLabel=esaltLabel,French=frLabel,French_description=fr,French_altLabel=fraltLabel,Germam=deLabel,German_description=de,German_altLabel=dealtLabel,Chinese=zhLabel,Chinese_altLabel=zhaltLabel,Chinese_description=zh,Urdu=urLabel,Urdu_altLabel=uraltLabel,Urdu_description=ur --transformName wikidataID=EXTRACT(wikidataURL,./(.)) --synonyms=itemAltLabel

  1. Commit all these results in Github.

  2. Activity dictionary has 148 entries in it.

  3. The attributes in the dictionary includes WikidataID, wikidataURL, description, term, name etc.

Running ami search for the "activity dictionary"

  1. For this dictionary we are searching in the oil186 corpus.

  2. This directory was used to run ami search using activitydictionary ami -p oil186 search --dictionary activity.

  3. After a successful run, HTML Documents were created that classified the papers on the basis of the _activity _while citing the frequency as well. The html datatable was like: Screenshot (578)

Running search_lib for the "activity dictionary"

  1. For this dictionary we are searching in the activity corpus (PATH:https://github.com/petermr/CEVOpen/tree/master/minicorpora/activity)

    Query code: python search_lib.py --dict activity --sect introduction method --proj activity Activity

  2. Searching activity dictionary in the oil186 corpus (PATH:https://github.com/petermr/CEVOpen/tree/master/searches/oil186)

    Query code: python search_lib.py --dict activity --sect introduction --proj oil186 activity(oil186)

  3. Searching activity dictionary in both corpus oil186 and activity for section METHOD

    Query code: python search_lib.py --dict activity --sect METHOD --proj oil186 activity oil186(METHOD) activity_METHOD