Dictionary eoplant_part - petermr/CEVOpen GitHub Wiki

Owner: Vasant Kumar

Dictionary Link

Overview:

Motivation and goals:

  • The goal is to build a dictionary of plant parts in which each entry carries a brief description, synonyms, and names in multiple languages. The dictionary can also be used for research or compared with other dictionaries to gain additional knowledge.

What I did:

  • Downloaded 1000 papers about plant parts.
  • Gathered the Wikidata IDs for various plant parts, used them in a SPARQL query, and transformed the results into a dictionary using amidict.

Issues I faced:

  • At first I had trouble running queries because I did not possess much computational knowledge. However, with practice I was able to run them.

Aim: To create the dictionary plant_part using a SPARQL query.

Steps:

## Run this query at https://query.wikidata.org/
## Selecting the preferred label
SELECT * WHERE {
  VALUES ?item {
    wd:Q10289985 wd:Q103129 wd:Q10437539 wd:Q107216 wd:Q107412 wd:Q1113448 wd:Q11162356 wd:Q1120914 wd:Q1125215 wd:Q1138632 wd:Q1155708 wd:Q11895190 wd:Q1192354 wd:Q12057965 wd:Q122811
    wd:Q1271979 wd:Q1277215 wd:Q134267 wd:Q1347099 wd:Q1351263 wd:Q1364 wd:Q1421859 wd:Q1425870 wd:Q1427245 wd:Q145205 wd:Q14524280 wd:Q1474699 wd:Q148436 wd:Q14849087
    wd:Q148515 wd:Q148600 wd:Q1493115 wd:Q149316 wd:Q1546595 wd:Q158583 wd:Q158967 wd:Q16128920 wd:Q16535076 wd:Q171187 wd:Q1713537 wd:Q18088308 wd:Q18250160
    wd:Q183319 wd:Q1840192 wd:Q184208 wd:Q184453 wd:Q185138 wd:Q188748 wd:Q1889013 wd:Q191546 wd:Q191556 wd:Q192576 wd:Q193472 wd:Q1995772 wd:Q2004067 wd:Q201851
    wd:Q207123 wd:Q207495 wd:Q216635 wd:Q217753 wd:Q220869 wd:Q224107 wd:Q22710 wd:Q2322325 wd:Q2331384 wd:Q2365301 wd:Q241368 wd:Q259028 wd:Q2746099 wd:Q27505399
    wd:Q27506529 wd:Q279513 wd:Q287 wd:Q2923673 wd:Q2933965 wd:Q304216 wd:Q30513971 wd:Q30765614 wd:Q3087886 wd:Q3089146 wd:Q3129307 wd:Q3312287 wd:Q33971 wd:Q3791538
    wd:Q380138 wd:Q3894544 wd:Q40763 wd:Q41500 wd:Q46723512 wd:Q489628 wd:Q497512 wd:Q504930 wd:Q506 wd:Q512249 wd:Q572097 wd:Q577430 wd:Q59243260 wd:Q60777361
    wd:Q609336 wd:Q62779 wd:Q643352 wd:Q65089222 wd:Q655824 wd:Q661390 wd:Q66571835 wd:Q687699 wd:Q70062083 wd:Q7079661 wd:Q7201653 wd:Q729496 wd:Q756954 wd:Q789802
    wd:Q794374 wd:Q796482 wd:Q79932 wd:Q87484743 wd:Q87485317 wd:Q87485325 wd:Q87485505 wd:Q87485933 wd:Q87485935 wd:Q87486641 wd:Q87486726 wd:Q87486867 wd:Q87486986
    wd:Q87487349 wd:Q87487358 wd:Q87487640 wd:Q87487715 wd:Q87488025 wd:Q87498057 wd:Q87499047 wd:Q87499514 wd:Q87500059 wd:Q87500134 wd:Q87501280 wd:Q87501358 wd:Q87502457
    wd:Q87502461 wd:Q87503474 wd:Q87589402 wd:Q87590992 wd:Q87591010 wd:Q87591031 wd:Q87591194 wd:Q87591759 wd:Q87592050 wd:Q87592301 wd:Q87592700 wd:Q87592986 wd:Q87592994
    wd:Q87593008 wd:Q87594836 wd:Q87606844 wd:Q87608261 wd:Q87608372 wd:Q87608694 wd:Q87608919 wd:Q87609332 wd:Q87609728 wd:Q87609996 wd:Q87610596 wd:Q87610912 wd:Q87612280
    wd:Q87612659 wd:Q87612806 wd:Q87612861 wd:Q87612938 wd:Q87613121 wd:Q87613201 wd:Q87614021 wd:Q87622342 wd:Q87622668 wd:Q87622862 wd:Q87623367 wd:Q876445 wd:Q87648435
    wd:Q87648445 wd:Q87648476 wd:Q87648478 wd:Q87648498 wd:Q87648517 wd:Q87648548 wd:Q87648554 wd:Q87648558 wd:Q87648580 wd:Q87648628 wd:Q87648640 wd:Q87648664 wd:Q87648669
    wd:Q87648681 wd:Q87648700 wd:Q87648799 wd:Q87648863 wd:Q87649544 wd:Q882214 wd:Q88224 wd:Q887231 wd:Q913294 wd:Q927202 wd:Q987774
}
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en".
    ?item rdfs:label ?itemLabel;
      skos:altLabel ?itemAltLabel;
      schema:description ?itemDescription.
  }
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "hi".
    ?item skos:altLabel ?hindialtlabel;
      rdfs:label ?hindiLabel;
      schema:description ?hindi.
  }
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "ta".
    ?item skos:altLabel ?tamilaltlabel;
      rdfs:label ?tamilLabel;
      schema:description ?tamil.
  }
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "es".
    ?item skos:altLabel ?esaltlabel;
      rdfs:label ?esLabel;
      schema:description ?es.
  }
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "fr".
    ?item skos:altLabel ?fraltlabel;
      rdfs:label ?frLabel;
      schema:description ?fr.
  }
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "de".
    ?item skos:altLabel ?dealtlabel;
      rdfs:label ?deLabel;
      schema:description ?de.
  }
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "zh".
    ?item skos:altLabel ?zhaltlabel;
      rdfs:label ?zhLabel;
      schema:description ?zh.
  }
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "ur".
    ?item skos:altLabel ?uraltlabel;
      rdfs:label ?urLabel;
      schema:description ?ur.
  }
  OPTIONAL { ?wikipedia schema:about ?item; schema:isPartOf <https://en.wikipedia.org/> }
  OPTIONAL { ?hiwikipedia schema:about ?item; schema:isPartOf <https://hi.wikipedia.org/> }
  OPTIONAL { ?tawikipedia schema:about ?item; schema:isPartOf <https://ta.wikipedia.org/> }
  OPTIONAL { ?eswikipedia schema:about ?item; schema:isPartOf <https://es.wikipedia.org/> }
  OPTIONAL { ?frwikipedia schema:about ?item; schema:isPartOf <https://fr.wikipedia.org/> }
  OPTIONAL { ?dewikipedia schema:about ?item; schema:isPartOf <https://de.wikipedia.org/> }
  OPTIONAL { ?zhwikipedia schema:about ?item; schema:isPartOf <https://zh.wikipedia.org/> }
  OPTIONAL { ?urwikipedia schema:about ?item; schema:isPartOf <https://ur.wikipedia.org/> }
}
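
A long hand-typed VALUES list is easy to break, for example by dropping the space between two identifiers. As a minimal sketch (the helper name build_values_clause and the sample IDs are illustrative assumptions, not part of the original workflow), the list can be generated and checked in Python before it is pasted into the query:

```python
import re

# A Wikidata item ID is "Q" followed by digits (e.g. Q506).
QID_PATTERN = re.compile(r"Q[1-9][0-9]*")


def build_values_clause(qids, per_line=5):
    """Return a SPARQL VALUES block for ?item from a list of QIDs.

    Hypothetical helper: raises ValueError on anything that is not a
    single well-formed QID, such as two IDs fused together.
    """
    for qid in qids:
        if not QID_PATTERN.fullmatch(qid):
            raise ValueError(f"not a valid Wikidata ID: {qid!r}")
    # Drop duplicates while keeping the original order.
    unique = list(dict.fromkeys(qids))
    lines = []
    for start in range(0, len(unique), per_line):
        chunk = unique[start:start + per_line]
        lines.append("    " + " ".join(f"wd:{q}" for q in chunk))
    return "  VALUES ?item {\n" + "\n".join(lines) + "\n  }"


if __name__ == "__main__":
    # Three IDs taken from the list in the query.
    print(build_values_clause(["Q506", "Q33971", "Q40763"]))
```

Validating each identifier on its own means that a fused token is rejected before the query ever reaches the Wikidata Query Service.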
  • After the results appear, click 'Link' and then 'SPARQL endpoint'. The SPARQL results file will start downloading automatically.
  • amidict then maps the SPARQL results to a dictionary. The command is:
amidict -vv --dictionary plant_part --directory plant_part --input sparql create --informat wikisparqlxml --sparqlmap wikidataURL=item,wikipediaPage=wikipedia,name=itemLabel,term=itemLabel,Description=itemDescription,Hindi=hindiLabel,Hindi_description=hindi,Hindi_altLabel=hindialtlabel,Tamil=tamilLabel,Tamil_description=tamil,Tamil_altLabel=tamilaltlabel,Spanish=esLabel,Spanish_description=es,Spanish_altLabel=esaltlabel,French=frLabel,French_description=fr,French_altLabel=fraltlabel,German=deLabel,German_description=de,German_altLabel=dealtlabel,Chinese=zhLabel,Chinese_altLabel=zhaltlabel,Chinese_description=zh,Urdu=urLabel,Urdu_altLabel=uraltlabel,Urdu_description=ur --transformName wikidataID=EXTRACT(wikidataURL,.*/(.*)) --synonyms=itemAltLabel
  • At present the dictionary contains 112 entries.
  • Uploaded it to GitHub.
  • The dictionary contains WikidataID, wikidataURL, description, etc.
  • Removed the unnecessary terms that were present.
  • Edited the dictionary manually to remove these terms (old dictionary image).
  • Removed the unrelated Wikidata IDs.
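
The --transformName option in the amidict command above derives wikidataID from wikidataURL by keeping whatever follows the last slash. The sketch below mimics that step in Python, using a regex such as .*/(.*), on a tiny hand-written result in the W3C SPARQL XML results format. The sample data and the function names are assumptions for illustration; this is not how amidict itself is implemented.

```python
import re
import xml.etree.ElementTree as ET

# Namespace of the W3C SPARQL Query Results XML format.
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

# Tiny hand-written sample (assumption): one row with two bindings.
SAMPLE = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="item"/><variable name="itemLabel"/></head>
  <results>
    <result>
      <binding name="item"><uri>http://www.wikidata.org/entity/Q506</uri></binding>
      <binding name="itemLabel"><literal xml:lang="en">flower</literal></binding>
    </result>
  </results>
</sparql>"""


def rows_from_sparql_xml(xml_text):
    """Yield one dict per result row, mapping variable name to its value."""
    root = ET.fromstring(xml_text)
    for result in root.findall("sr:results/sr:result", NS):
        row = {}
        for binding in result.findall("sr:binding", NS):
            # Each binding holds a single child: <uri>, <literal> or <bnode>.
            value = binding[0]
            row[binding.get("name")] = value.text
        yield row


def extract_wikidata_id(url):
    """Keep the part after the last slash, or None if there is no slash."""
    match = re.fullmatch(r".*/(.*)", url)
    return match.group(1) if match else None


if __name__ == "__main__":
    for row in rows_from_sparql_xml(SAMPLE):
        print(extract_wikidata_id(row["item"]), row["itemLabel"])
```

Because the binding names in the XML are exactly the variable names used in the query, any name given on the right-hand side of --sparqlmap has to match a query variable character for character.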

Unnecessary terms

  • Search for images using SPARQL query

Aim: To run ami search for the dictionary plant_part

  • For this dictionary we search in the plantpart corpus.
  • To run ami search with the plant_part dictionary, this command was used: ami -p plantpart search --dictionary eoplant_part.xml
  • After running the command in cmd, HTML documents were created in the minicorpora/plantpart folder and all the papers were classified according to plant part (HTML file).
  • To run ami search on the plantpart corpus with a different dictionary, use: ami -p plantpart search --dictionary activity.xml (Common data table2)

Aim: To run ami section for the dictionary plant_part

  • To run ami section, use the command ami -p plantpart section on the command line (section final).

Running search_lib for the "Dictionary plant_part"