PubAnnotation system - linkedannotation/blah2015 GitHub Wiki

related links

system improvements

  • Include the vernacular titles in PubAnnotation documents. (suggestion from Pierre) [Done]
  • Use text alignment to automatically translate annotation coordinates between different parsed versions of the same document. (suggestion from Lars)
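
As a rough illustration of the text-alignment idea, the sketch below aligns two versions of a text with Python's standard `difflib.SequenceMatcher` and carries a span from one version to the other. The function names `build_offset_map` and `translate_span` are hypothetical, and this is not how PubAnnotation itself performs alignment; it only shows the principle, assuming spans use an exclusive `end` offset.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional, Tuple


def build_offset_map(source: str, target: str) -> Dict[int, int]:
    """Map each aligned character offset in `source` to its offset in `target`."""
    matcher = SequenceMatcher(None, source, target, autojunk=False)
    offset_map: Dict[int, int] = {}
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            offset_map[block.a + k] = block.b + k
    return offset_map


def translate_span(begin: int, end: int,
                   offset_map: Dict[int, int]) -> Optional[Tuple[int, int]]:
    """Translate a [begin, end) span; return None if a boundary is unaligned."""
    if begin >= end:
        return None
    new_begin = offset_map.get(begin)
    # `end` is exclusive, so map the last covered character and add one.
    new_last = offset_map.get(end - 1)
    if new_begin is None or new_last is None:
        return None
    return new_begin, new_last + 1


# Two versions of the same sentence that differ only in whitespace.
source = "Le cancer de vessie est rare."
target = "Le  cancer de vessie est rare."
offsets = build_offset_map(source, target)
new_span = translate_span(3, 19, offsets)  # span of "cancer de vessie" in source
```

A span whose boundary falls inside a region that changed between the two versions comes back as `None`, so such annotations would need manual review rather than silent relocation.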

tools

documentation

  • discontinuous spans (suggestion from Pierre)
  • formal specification of PubAnnotation JSON format (suggestion from Chen)
  • The section id is divid, not div_id (wrongly documented here: http://www.pubannotation.org/docs/submit-annotation/) [Done]
  • Specify clearly how to write an entity/span that is normalized to two or more objects:
    • After discussion at the hackathon, the consensus seems to be to repeat the span object, once for each object.
  • Likewise, specify how to write relations between entities when at least one of them has two or more normalizations:
    • Following the solution above, would the relation objects be repeated too? If both entities are normalized to several objects (n and m of them, respectively), this would yield n*m relation objects.
  • REST API for deleting documents / annotations?
  • REST API for creating / deleting projects?
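
To make the n*m concern about repeated spans and relations concrete, here is a minimal sketch of the repeat-the-object convention. The helper names `expand_denotations` and `expand_relations`, the `_1`, `_2` id suffixes, the `relatedTo` predicate, and the `NS:...` object values are all placeholders invented for illustration; the documentation does not yet prescribe any of them.

```python
from itertools import product
from typing import Dict, List


def expand_denotations(entity_id: str, begin: int, end: int,
                       objs: List[str]) -> List[Dict]:
    """Repeat one span once per normalization object (hypothetical id scheme)."""
    return [
        {"id": f"{entity_id}_{i}", "span": {"begin": begin, "end": end}, "obj": obj}
        for i, obj in enumerate(objs, start=1)
    ]


def expand_relations(rel_id: str, subj_denotations: List[Dict], pred: str,
                     obj_denotations: List[Dict]) -> List[Dict]:
    """Repeat a relation for every pair of repeated subject/object denotations."""
    pairs = product(subj_denotations, obj_denotations)
    return [
        {"id": f"{rel_id}_{i}", "subj": s["id"], "pred": pred, "obj": o["id"]}
        for i, (s, o) in enumerate(pairs, start=1)
    ]


# One entity with 2 normalizations, another with 3 (placeholder values).
first = expand_denotations("T1", 0, 5, ["NS:a", "NS:b"])
second = expand_denotations("T2", 10, 18, ["NS:x", "NS:y", "NS:z"])
relations = expand_relations("R1", first, "relatedTo", second)
# len(relations) is 2 * 3 = 6: a single logical relation becomes six objects.
```

The multiplication is the cost of this convention: every additional normalization on either side multiplies the number of relation objects that a consumer must recognize as the same logical relation.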

howto

  • represent what BRAT calls a normalization annotation
    • a possible solution in PubAnnotation would be to add the normalization value as a second denotation, and to use a relation annotation to link the two denotations:
  • BRAT:
T4      DISO 66 82      cancer de vessie
N4      Reference T4 UMLS:C0005684      Malignant neoplasm of urinary bladder
  • PubAnnotation:

    "denotations": [
        {"id": "T4", "span": {"begin": 66, "end": 82}, "obj": "DISO"},
        {"id": "T4c", "span": {"begin": 66, "end": 82}, "obj": "UMLS:C0005684"}
    ],
    "relations": [
        {"id": "R1", "subj": "T4", "pred": "Normalization", "obj": "T4c"}
    ]
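
The mapping shown in this example could be automated along the following lines. This is only a sketch of the proposed solution, not an official converter: the function name `brat_to_pubannotation` is hypothetical, the `c` id suffix and the `Normalization` predicate are taken from the example above, and the sequential `R1`, `R2`, ... relation ids are an assumption. It expects tab-separated BRAT standoff lines, skips discontinuous text-bound annotations, and would produce colliding ids if one entity carried several normalizations.

```python
from typing import Dict, Iterable, List


def brat_to_pubannotation(brat_lines: Iterable[str]) -> Dict[str, List[Dict]]:
    """Convert BRAT T (text-bound) and N (normalization) lines to PubAnnotation."""
    lines = [line.rstrip("\n").split("\t") for line in brat_lines]
    denotations: List[Dict] = []
    relations: List[Dict] = []
    spans: Dict[str, Dict[str, int]] = {}

    # First pass: text-bound annotations, e.g. "T4<TAB>DISO 66 82<TAB>text".
    for fields in lines:
        if len(fields) < 2 or not fields[0].startswith("T"):
            continue
        parts = fields[1].split(" ")
        if len(parts) != 3:
            continue  # discontinuous spans ("66 72;80 82") are not handled here
        label, begin, end = parts
        spans[fields[0]] = {"begin": int(begin), "end": int(end)}
        denotations.append({"id": fields[0], "span": spans[fields[0]], "obj": label})

    # Second pass: normalizations, e.g. "N4<TAB>Reference T4 UMLS:C0005684<TAB>name".
    for fields in lines:
        if len(fields) < 2 or not fields[0].startswith("N"):
            continue
        parts = fields[1].split(" ", 2)
        if len(parts) != 3 or parts[1] not in spans:
            continue
        _, target_id, reference = parts
        norm_id = f"{target_id}c"  # suffix convention borrowed from the example
        denotations.append(
            {"id": norm_id, "span": dict(spans[target_id]), "obj": reference}
        )
        relations.append({
            "id": f"R{len(relations) + 1}",  # assumed sequential relation ids
            "subj": target_id,
            "pred": "Normalization",
            "obj": norm_id,
        })

    return {"denotations": denotations, "relations": relations}


brat = [
    "T4\tDISO 66 82\tcancer de vessie",
    "N4\tReference T4 UMLS:C0005684\tMalignant neoplasm of urinary bladder",
]
result = brat_to_pubannotation(brat)
```

Running it on the two BRAT lines above yields the same two denotations and one relation as the PubAnnotation fragment in the example.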

ToDo

  • discontinuous span annotation
  • entity description at various levels