Analysis Engine: LIST Questions - 11791-04/project-team04 GitHub Wiki

Algorithm:

  1. Identify pivot terms in the query using unigram heuristics.
  2. Locate the pivot terms in the abstract and extract the text within a fixed-size window around each one.
  3. Run BioNER on the extracted text.
  4. Keep only unigram entities for the moment.
  5. Filter out spurious recognized entities using unigram heuristics.
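
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the stopword list stands in for the unspecified "unigram heuristics", `toy_ner` is a hypothetical lexicon lookup standing in for the real BioNER tagger, and every function name, parameter, and the example text are assumptions made for the sketch.

```python
import re

# Hypothetical stopword list standing in for the "unigram heuristics".
STOPWORDS = {"which", "what", "are", "is", "the", "of", "in", "with",
             "list", "a", "an", "for", "to", "and", "by"}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def pivot_terms(question):
    """Step 1: keep the question unigrams that are not stopwords."""
    return {t for t in tokenize(question) if t not in STOPWORDS}

def windows(tokens, pivots, size):
    """Step 2: yield `size` tokens on each side of every pivot occurrence."""
    for i, tok in enumerate(tokens):
        if tok in pivots:
            yield tokens[max(0, i - size): i + size + 1]

def toy_ner(tokens, lexicon):
    """Step 3 stand-in: a real system would call a BioNER tagger here."""
    return [t for t in tokens if t in lexicon]

def answer_list(question, abstract, lexicon, size=5, max_items=10):
    pivots = pivot_terms(question)
    tokens = tokenize(abstract)
    found = []
    for span in windows(tokens, pivots, size):
        for ent in toy_ner(span, lexicon):  # Step 4: unigram entities only
            # Step 5: drop entities that are pivots, stopwords, or repeats.
            if ent not in pivots and ent not in STOPWORDS and ent not in found:
                found.append(ent)
    return found[:max_items]

# Made-up example input, for illustration only.
question = "Which genes are associated with cystic fibrosis?"
abstract = ("Mutations in CFTR cause cystic fibrosis. Unrelated work studied "
            "BRCA1 in a very different cohort of breast cancer patients.")
lexicon = {"cftr", "brca1"}

print(answer_list(question, abstract, lexicon, size=2))  # ['cftr']
print(answer_list(question, abstract, lexicon, size=5))  # ['cftr', 'brca1']
```

In this toy run, widening the window from 2 to 5 tokens pulls in an entity from the neighbouring sentence, which is the window-size effect on precision and recall noted under the advantages.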

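Advantages:

  • The method is robust, and its underlying assumption, that answer entities occur close to the pivot terms, holds.
  • Precision and recall are tunable: a smaller window size gives higher precision (P) but lower recall (R).
  • Likewise, a shorter answer list gives higher P but lower R.
  • Robust to overfitting.
  • Efficient.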
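
To make the list-length trade-off concrete, here is a small sketch of how P and R could be computed for a returned list against a gold answer set. The function name and the data are hypothetical toy values, not results from the project.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) of a predicted list against a gold set."""
    pred, gold = set(predicted), set(gold)
    true_positives = len(pred & gold)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

# Toy ranked candidate list and gold answers (made up for illustration).
ranked = ["cftr", "brca1", "scnn1a", "tp53"]
gold = {"cftr", "scnn1a", "tgfb1"}

print(precision_recall(ranked[:1], gold))  # short list: P = 1.0, R = 1/3
print(precision_recall(ranked, gold))      # long list:  P = 0.5, R = 2/3
```

Truncating the list to its top item keeps only a correct answer (high P) but misses the others (low R); returning all four candidates recovers more gold answers at the cost of precision.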

To be improved:

  • Add a bigram language model (LM) to handle bigram LIST entities.
  • An NLU module is needed to understand what the question is asking for.
  • An ontology corpus is needed for concept matching.