Error Analysis - 11791-04/project-team04 GitHub Wiki

Concepts

  • Gold standard data is wrong (year 2012)
  • Query Normalization
    • Punctuation, Spacing, Casing, Lemmatization
    • Difficult to evaluate when you can’t get anything back from the WebAPI
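
A minimal sketch of the punctuation, spacing, and casing steps listed above, using only the Python standard library. The function name `normalize_query` and the sample question are hypothetical and are not taken from the project code. Lemmatization is left out because it needs an external NLP library, and only ASCII punctuation is handled.

```python
import string

# Translation table mapping each ASCII punctuation character to a space.
# Typographic characters such as curly quotes are NOT covered here.
_PUNCT_TO_SPACE = str.maketrans({ch: " " for ch in string.punctuation})


def normalize_query(query: str) -> str:
    """Lowercase the query, replace punctuation with spaces, collapse whitespace."""
    lowered = query.lower()
    no_punct = lowered.translate(_PUNCT_TO_SPACE)
    # split() with no arguments drops leading/trailing and repeated whitespace
    return " ".join(no_punct.split())


# Hypothetical example question, for illustration only
print(normalize_query("  Is Rheumatoid-Arthritis   more common in MEN, or women? "))
```

Replacing punctuation with a space rather than deleting it keeps hyphenated terms as two tokens; whether that helps or hurts retrieval would have to be checked against the service being queried.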

Discussion

  • In principle, it would be good to try some form of PRF to improve results in other areas
  • An experiment on improving the Concepts baseline (without UniProt / Jochem results) shows:
    • PRF can destroy results
    • Returning more results increases recall
    • Narrowing the results increases precision
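
The recall/precision trade-off noted above can be illustrated with a small sketch that scores a ranked result list at different cutoffs. The function name `precision_recall_at_k`, the concept IDs, and the ranking are all hypothetical; they are not the project's data or evaluation code.

```python
def precision_recall_at_k(ranked, relevant, k):
    """Return (precision, recall) for the top-k items of a ranked list."""
    top_k = ranked[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ranked output and gold-standard set, for illustration only
ranked = ["c1", "c2", "c7", "c9", "c3"]
relevant = {"c1", "c2", "c3"}

for k in (2, 5):
    p, r = precision_recall_at_k(ranked, relevant, k)
    print(f"k={k}: precision={p:.2f} recall={r:.2f}")
```

In this toy ranking, cutting off at 2 results gives perfect precision but misses one relevant concept, while returning all 5 recovers every relevant concept at lower precision.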