Annotation - linkedannotation/blah2015 GitHub Wiki

Annotation

Targeted set of drugs: Adderall, Ritalin, Modafinil, Adrafinil, Armodafinil, Citalopram, Escitalopram, Paroxetine, Fluoxetine, Fluvoxamine, Sertraline

Targeted articles: Europe PMC (http://europepmc.org/)

Pipeline:

    1. Drug-based article filtering (done)
    2. Patient descriptor identification (in progress)
    3. Phenotype identification (in progress)
    4. Article sentence annotation (in progress)

Pipeline technical details:

    1. Drug-based filtering: NCBO Annotator run over Europe PMC abstracts (code: https://github.com/jmbanda/BLAH2015)
    2. Patient descriptor identification: Simple Rule Language tool (in progress)
    3. Phenotype identification: Python dictionary-based tool (available)
    4. Article sentence annotation: BRAT rapid annotation tool (in progress)
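
As a rough illustration of the drug-based filtering step, the sketch below keeps only abstracts that mention one of the targeted drugs. It is a local dictionary stand-in written for this page, not the NCBO Annotator workflow in the linked repository; the function names and the `abstracts` mapping (article ID to abstract text) are assumptions.

```python
import re

# Drug names copied from the targeted set listed on this page.
TARGET_DRUGS = [
    "Adderall", "Ritalin", "Modafinil", "Adrafinil", "Armodafinil",
    "Citalopram", "Escitalopram", "Paroxetine", "Fluoxetine",
    "Fluvoxamine", "Sertraline",
]

# One case-insensitive pattern with word boundaries, so that "Modafinil"
# does not fire inside "Armodafinil".
DRUG_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(d) for d in TARGET_DRUGS) + r")\b",
    re.IGNORECASE,
)


def drugs_mentioned(abstract):
    """Return the set of target drugs (lower-cased) found in an abstract."""
    return {m.group(1).lower() for m in DRUG_PATTERN.finditer(abstract)}


def filter_articles(abstracts):
    """Keep only articles whose abstract mentions at least one target drug.

    `abstracts` is a hypothetical dict mapping an article ID to its text.
    """
    hits = {}
    for article_id, text in abstracts.items():
        found = drugs_mentioned(text)
        if found:
            hits[article_id] = found
    return hits
```

A plain name match like this misses synonyms and generic names (for example, it would not link Ritalin to methylphenidate), which is one reason an ontology-backed annotator is a better fit for the real pipeline.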

Potential issues:

  • Identify which article ID is needed to retrieve the article (EPMC provides many IDs for the same article)
  • Define the list of patient descriptors
  • PhenoMiner's list of phenotypes is quite specific: how well can we match them in our set of articles?
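
For the article ID issue, a first step could be telling the common identifier types apart before deciding which one to use for retrieval. The sketch below classifies a raw string as a PMCID, PMID, or DOI by its shape; the function names are hypothetical, the patterns are deliberately loose, and it does not say which type Europe PMC requires for a given request.

```python
import re

# Shape-based patterns, checked in order. These are loose heuristics:
# a PMCID is "PMC" plus digits, a PMID is digits only, a DOI starts "10.".
ID_PATTERNS = [
    ("pmcid", re.compile(r"^PMC\d+$", re.IGNORECASE)),
    ("pmid", re.compile(r"^\d+$")),
    ("doi", re.compile(r"^10\.\d{4,9}/\S+$")),
]


def classify_article_id(raw):
    """Return 'pmcid', 'pmid', 'doi', or 'unknown' for a raw identifier."""
    value = raw.strip()
    for kind, pattern in ID_PATTERNS:
        if pattern.match(value):
            return kind
    return "unknown"


def group_ids(raw_ids):
    """Group a list of raw identifiers by their detected type."""
    grouped = {}
    for raw in raw_ids:
        grouped.setdefault(classify_article_id(raw), []).append(raw.strip())
    return grouped
```

Grouping the identifiers this way makes it easier to see, per article, which ID types are actually available before picking one for retrieval.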

Annotation of CRAFT corpus with PhenoMiner terms

Task list

    1. Download CRAFT and the PhenoMiner term list (GitHub: https://github.com/nhcollier/PhenoMiner) -- done
    2. Automated matching on CRAFT texts to suggest candidate terms from the PhenoMiner list -- in progress
    3. Manual annotation of CRAFT in BRAT, starting from the candidate term annotations suggested in step 2
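
The automated matching step and the hand-off to BRAT could look roughly like the sketch below: exact, case-insensitive term matching that emits text-bound annotation lines in BRAT standoff format (`ID<TAB>Type start end<TAB>text`). The `Phenotype` label and the function name are assumptions, and the terms passed in would come from the PhenoMiner list rather than being hard-coded.

```python
import re


def suggest_annotations(text, terms, label="Phenotype"):
    """Return BRAT text-bound annotation lines for exact term matches.

    Offsets are character positions in `text` (start inclusive, end
    exclusive). `label` is a hypothetical entity type name.
    """
    spans = []
    for term in terms:
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for m in pattern.finditer(text):
            spans.append((m.start(), m.end(), m.group(0)))
    spans.sort()  # order suggestions by position in the document

    lines = []
    for i, (start, end, surface) in enumerate(spans, start=1):
        lines.append(f"T{i}\t{label} {start} {end}\t{surface}")
    return lines
```

Writing the returned lines to a `.ann` file next to the matching `.txt` file would let annotators accept, correct, or delete the suggestions in BRAT instead of starting from a blank document.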

Potential issues:

    1. Exact term matching needs to be more flexible: it may miss disjoint (discontinuous) terms
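
One possible way to relax exact matching is to match term words in order while tolerating a few intervening tokens. The sketch below is a minimal, greedy version of that idea; `find_gapped` and `max_gap` are names invented here, and the example term is illustrative rather than taken from the PhenoMiner list.

```python
import re


def tokenize(text):
    """Lower-cased word tokens with their character offsets."""
    return [(m.group(0).lower(), m.start(), m.end())
            for m in re.finditer(r"\w+", text)]


def find_gapped(term, text, max_gap=3):
    """Find `term` in `text` with its words in order, allowing up to
    `max_gap` other tokens between consecutive term words.

    Returns a list of matches; each match is a list of (start, end)
    spans, one per term word.
    """
    term_tokens = [t for t, _, _ in tokenize(term)]
    tokens = tokenize(text)
    matches = []
    if not term_tokens:
        return matches
    for i, (tok, start, end) in enumerate(tokens):
        if tok != term_tokens[0]:
            continue
        spans = [(start, end)]
        pos = i
        for wanted in term_tokens[1:]:
            # Look ahead within the allowed gap for the next term word,
            # taking the nearest occurrence (greedy).
            window = range(pos + 1, min(pos + 2 + max_gap, len(tokens)))
            nxt = next((j for j in window if tokens[j][0] == wanted), None)
            if nxt is None:
                break
            spans.append(tokens[nxt][1:])
            pos = nxt
        else:
            # Every term word was found within the gap limit.
            matches.append(spans)
    return matches


# Example: a term split by a coordination.
hits = find_gapped("abnormal kidney morphology",
                   "abnormal liver and kidney morphology")
```

A larger `max_gap` recovers more split terms but also produces more false candidates, so the value would need tuning against manually checked examples.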