ctakes lvg - apache/ctakes GitHub Wiki

Annotation Engines


LVG Annotator

Adds canonical form of words.

Source class: LvgAnnotator
Source package: org.apache.ctakes.lvg.ae
Parent class: org.apache.uima.fit.component.JCasAnnotator_ImplBase
Dependencies: Section, Base Token

| Parameter | Description | Class | Required | Default |
| --- | --- | --- | --- | --- |
| CmdCacheFileLocation | File with stored cache of canonical forms | String | No | org/apache/ctakes/lvg/2005_norm.voc |
| CmdCacheFrequencyCutoff | Minimum frequency required for loading from cache | int | No | 20 |
| ExclusionSet | Words to exclude when doing LVG normalization | String[] | No | |
| LemmaCacheFileLocation | Path to lemma cache file -- if useLemmaCache and postLemmas are true | String | No | org/apache/ctakes/lvg/2005_lemma.voc |
| LemmaCacheFrequencyCutoff | Threshold for the frequency of a lemma to be loaded into the cache | int | No | 20 |
| PostLemmas | Whether to extract the lexical variants and write to cas (creates large files) | boolean | No | false |
| SegmentsToSkip | Segment IDs to skip during processing | String[] | No | |
| UseCmdCache | Use cache to track canonical forms | boolean | No | false |
| UseLemmaCache | Whether to use a cache for lemmas | boolean | No | false |
| UseSegments | Whether to use segments found in upstream cTAKES components | boolean | No | false |
| XeroxTreebankMap | Mapping from Xerox parts of speech to Treebank equivalents | String[] | No | |
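To illustrate how a frequency cutoff and an exclusion set might interact with a canonical-form cache, here is a minimal, self-contained sketch. It is not the cTAKES implementation: the class `CanonicalFormCacheSketch`, its methods, the `word|canonicalForm|frequency` line format, and the sample entries are all hypothetical, and treating the cutoff as inclusive (`>=`) is an assumption based only on the word "minimum" in the parameter description.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; names and file format are assumptions, not cTAKES code. */
final class CanonicalFormCacheSketch {

    /**
     * Builds a word -> canonical form map from lines in a hypothetical
     * "word|canonicalForm|frequency" format, keeping only entries whose
     * frequency is at least frequencyCutoff (compare CmdCacheFrequencyCutoff).
     */
    static Map<String, String> load(List<String> lines, int frequencyCutoff) {
        Map<String, String> cache = new HashMap<>();
        for (String line : lines) {
            String[] fields = line.split("\\|");
            if (fields.length != 3) {
                continue; // skip malformed lines
            }
            int frequency;
            try {
                frequency = Integer.parseInt(fields[2].trim());
            } catch (NumberFormatException e) {
                continue; // skip lines without a numeric frequency
            }
            if (frequency >= frequencyCutoff) {
                cache.put(fields[0].trim().toLowerCase(Locale.ROOT), fields[1].trim());
            }
        }
        return cache;
    }

    /**
     * Returns the cached canonical form, or null when the word is in the
     * exclusion set (compare ExclusionSet) or is not present in the cache.
     */
    static String lookup(String word, Map<String, String> cache, Set<String> exclusionSet) {
        String key = word.toLowerCase(Locale.ROOT);
        if (exclusionSet.contains(key)) {
            return null;
        }
        return cache.get(key);
    }

    public static void main(String[] args) {
        // Made-up sample entries; real cache files may look different.
        List<String> lines = List.of("running|run|57", "tumors|tumor|3");
        Map<String, String> cache = load(lines, 20);
        System.out.println(lookup("Running", cache, Set.of()));
        System.out.println(lookup("tumors", cache, Set.of()));
    }
}
```

In this sketch a `null` result means "no cached canonical form", so a caller would fall back to full normalization; the actual fallback behavior in cTAKES is not described on this page.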

LVG Basetoken Annotator

Adds canonical form of Base Tokens.

Source class: LvgBaseTokenAnnotator
Source package: org.apache.ctakes.lvg.ae
Parent class: org.apache.uima.analysis_component.JCasAnnotator_ImplBase
Dependencies: Section, Base Token

No available configuration parameters.

Thread-Safe LVG

Annotates lexical variants for terms; the implementation attempts to be thread-safe.

Source class: ThreadSafeLvg
Source package: org.apache.ctakes.lvg.ae
Parent class: org.apache.uima.fit.component.JCasAnnotator_ImplBase
Dependencies: Base Token

No available configuration parameters.
