BIOADI: A Machine Learning Approach to Identify Abbreviations and Definitions in Biological Literature

Citation: Kuo, CJ, Ling, MHT, Lin, KT, Hsu, CN. 2009. BIOADI: A Machine Learning Approach to Identify Abbreviations and Definitions in Biological Literature. BMC Bioinformatics 10(Suppl 15):S7.

Link to [Full Text] and [PDF]

Here is a permanent link to this [PDF] in my own archive.

This manuscript addresses a limitation identified in my doctoral thesis: the thesis used a dictionary approach, whereas this work identifies gene/protein names and their abbreviations in text in real time. We identified about 1.7 million unique long form/abbreviation pairs in the entire PubMed with 95.86% precision and 89.9% recall, at an average computational speed of 10.2 seconds per thousand abstracts. BIOADI is also a standalone tool that can be incorporated into an analysis pipeline. This study also contributed an annotated corpus to the community for tool evaluation purposes.