CASI - sporedata/researchdesigneR GitHub Wiki
General description
The Clinical Abbreviation Sense Inventory (CASI) is a dataset for medical term disambiguation. It comprises 440 of the most frequently used abbreviations and acronyms, selected from 352,267 dictated clinical notes.
The senses of each abbreviation and acronym, 949 in total, were identified from 500 randomly selected instances within the clinical notes. These senses were then lexically aligned with three resources: the Unified Medical Language System (UMLS), Another Database of Abbreviations in Medline (ADAM), and Stedman's Medical Abbreviations, Acronyms & Symbols (4th edition).
A sense inventory (SI) is a collection of abbreviations and acronyms (short forms) paired with their potential meanings (long forms) and other pertinent information about these terms.
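The short form / long form structure described above can be sketched as a small lookup table. This is a minimal illustration in Python, not the format of the CASI files themselves: the `SenseEntry` class, the `build_inventory` helper, and the example pairs are all hypothetical and chosen only to show how one short form maps to several long forms.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass
class SenseEntry:
    """One short form and its candidate long forms (hypothetical structure)."""
    short_form: str
    long_forms: List[str] = field(default_factory=list)

    def is_ambiguous(self) -> bool:
        # A short form needs disambiguation when it has more than one sense.
        return len(self.long_forms) > 1


def build_inventory(pairs: Iterable[Tuple[str, str]]) -> Dict[str, SenseEntry]:
    """Group (short form, long form) pairs into a sense inventory."""
    inventory: Dict[str, SenseEntry] = {}
    for short_form, long_form in pairs:
        entry = inventory.setdefault(short_form, SenseEntry(short_form))
        if long_form not in entry.long_forms:  # skip duplicate senses
            entry.long_forms.append(long_form)
    return inventory


# Illustrative pairs only; these are not copied from the CASI dataset.
example_pairs = [
    ("RA", "rheumatoid arthritis"),
    ("RA", "right atrium"),
    ("RA", "room air"),
    ("RA", "right atrium"),  # duplicate, ignored
    ("BMI", "body mass index"),
]

inventory = build_inventory(example_pairs)
ambiguous = sorted(sf for sf, entry in inventory.items() if entry.is_ambiguous())
print(ambiguous)
```

In this sketch, a disambiguation step would only need to run for entries where `is_ambiguous()` returns true; any additional information about a term (for example, sense frequency) could be added as extra fields on `SenseEntry`.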