Test Phenotypes - PheMA/project-planning GitHub Wiki

Below is a list of eMERGE phase I & II phenotypes that Harvard & Columbia ran against i2b2 & OMOP, respectively. We will use these as test cases. They are listed in the order we plan to test them, from simpler to more complex:

No NLP:

  1. BPH – done!!!
  2. Atopic Dermatitis – CHOP’s algorithm w/o NLP (not NU’s w/ NLP & ML that is nearing completion)
  3. GERD
  4. ADHD – NU didn’t implement due to low # of cases
  5. Statins and MACE – just a keyword search of the problem list, so not really “NLP”

Need NLP:

  1. Heart Failure – minimal NLP to get ejection fraction from echocardiogram reports, which is not too hard
  2. Diverticulosis and Diverticulitis – NLP of colonoscopy reports, which are less structured
  3. Colon Polyps – NLP of colonoscopy pathology reports, which tend to be very structured
  4. CAAD – NLP of carotid artery ultrasound reports
  5. Appendicitis – NLP of appendectomy pathology reports
  6. C. Diff – NLP of progress notes for positive mentions of infection w/ this bacterium, & also to find nursing home status
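
To give a feel for what "minimal NLP" could mean for the Heart Failure item, here is a rough Python sketch that pulls ejection fraction values out of free-text echocardiogram reports with a regular expression. It is not taken from the eMERGE algorithm: the pattern, the function name `extract_ejection_fractions`, and the sample sentence are all assumptions for illustration. A real implementation would also need to handle negation, other sections of the report, and site-specific wording.

```python
import re

# Hypothetical pattern (an assumption, not the eMERGE definition): matches
# "EF", "LVEF", or "ejection fraction" followed within a few words by a
# percentage or a percentage range, e.g. "EF 35%" or "LVEF: 55-60 %".
EF_PATTERN = re.compile(
    r"\b(?:LV\s*)?(?:EF|ejection\s+fraction)\b"   # the term itself
    r"[^0-9%\n]{0,30}?"                           # short filler, e.g. " is estimated at "
    r"(\d{1,2})(?:\s*(?:-|to)\s*(\d{1,2}))?"      # single value or low-high range
    r"\s*%",
    re.IGNORECASE,
)


def extract_ejection_fractions(report_text):
    """Return a list of (low, high) EF percentages found in the text.

    A single value such as "35%" is returned as (35, 35).
    """
    results = []
    for match in EF_PATTERN.finditer(report_text):
        low = int(match.group(1))
        high = int(match.group(2)) if match.group(2) else low
        results.append((low, high))
    return results


sample = "The LVEF is estimated at 55-60 %. Prior study: EF 35%."
print(extract_ejection_fractions(sample))
```

Returning a (low, high) pair keeps ranges intact, so a downstream case definition can decide whether to compare its threshold against the low end, the high end, or the midpoint.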