ctakes examples - apache/ctakes GitHub Wiki
Collection Readers
Annotation Engines
Output Writers
Piper Files
Build Patient document text from columnar Letter text.
Source class: LetterColumnReader
Source package: org.apache.ctakes.examples.cr
Parent class: org.apache.ctakes.core.cr.AbstractFileTreeReader
Parameter | Description | Class | Required | Default |
---|---|---|---|---|
InputDirectory | Directory for all input files. | String | Yes | |
CRtoSpace | Change windows-format CR + LF character sequences to LF + space. | boolean | No | |
Encoding | The character encoding used by the input files. | String | No | |
Extensions | The extensions of the files that the collection reader will read. | String[] | No | * |
KeepCR | Keep windows-format carriage return characters at line endings. This will only keep existing characters, it will not add them. | boolean | No | |
PatientLevel | The level in the directory hierarchy at which patient identifiers exist. Default value is 1; directly under the root input directory. | int | No | |
StripQuotes | Replace document-enclosing quote characters with space characters. | boolean | No | |
WriteBanner | Write a large banner at each major step of the pipeline. | String | No | no |
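
A minimal piper sketch of how this reader might be selected and configured; the directory path and parameter values below are placeholders, and in practice InputDirectory is usually supplied with -i on the command line:

```
// Hypothetical usage sketch for LetterColumnReader (values are placeholders).
// The package line is only needed if the examples package is not already on the search path.
package org.apache.ctakes.examples.cr
reader LetterColumnReader InputDirectory=/path/to/letter_columns PatientLevel=2
```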
Assigns Body Side to Anatomic Sites.
Source class: BodySideFinder
Source package: org.apache.ctakes.examples.ae
Parent class: org.apache.uima.fit.component.JCasAnnotator_ImplBase
No available configuration parameters.
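
Since the engine has no parameters, using it is a single add line. A minimal sketch, assuming it is placed after dictionary lookup so that anatomic site annotations already exist in the CAS (that ordering is an assumption, not stated above):

```
// Hypothetical placement sketch: anatomic sites are assumed to come from an earlier dictionary lookup.
package org.apache.ctakes.examples.ae
load DictionarySubPipe
add BodySideFinder
```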
Detect Blood Pressure values in Vital Signs Section
Source class: RegexBpFinder
Source package: org.apache.ctakes.examples.ae
Parent class: org.apache.uima.fit.component.JCasAnnotator_ImplBase
No available configuration parameters.
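
A minimal sketch, assuming sections are annotated first so that a Vital Signs section exists for the regex search (the choice of sectionizer is an assumption):

```
// Hypothetical placement sketch: sectionize before searching for blood pressure values.
package org.apache.ctakes.examples.ae
add BsvRegexSectionizer
add RegexBpFinder
```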
Writes a table of Procedure information to file, sorted by character index.
Source class: RTTableFileWriter
Source package: org.apache.ctakes.examples.cc
Parent class: org.apache.ctakes.core.cc.AbstractTableFileWriter
Dependencies: Document Id, Identified Annotation
Usables: Document Id Prefix
Parameter | Description | Class | Required | Default |
---|---|---|---|---|
OutputDirectory | Directory for all output files. | File | Yes | |
SubDirectory | SubDirectory for files. | String | No | |
TableType | Type of Table to write to File. Possible values are: BSV, CSV, HTML, TAB. | String | No | |
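
A minimal piper sketch for this writer; the subdirectory name and table type are placeholders, and OutputDirectory is normally supplied with -o on the command line:

```
// Hypothetical usage sketch: write the procedure table in HTML form under a subdirectory of the output directory.
package org.apache.ctakes.examples.cc
add RTTableFileWriter SubDirectory=rt_table TableType=HTML
```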
Writes files listing base tokens and their spans in a directory tree.
Source class: TokenSpanWriter
Source package: org.apache.ctakes.examples.cc
Parent class: org.apache.ctakes.core.cc.AbstractJCasFileWriter
Usables: Document Id Prefix, Base Token
Parameter | Description | Class | Required | Default |
---|---|---|---|---|
OutputDirectory | Directory for all output files. | File | Yes | |
SubDirectory | SubDirectory for files. | String | No | |
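
A minimal piper sketch; the subdirectory name is a placeholder and OutputDirectory normally comes from -o:

```
// Hypothetical usage sketch: list base tokens and their spans under a "token_spans" subdirectory.
package org.apache.ctakes.examples.cc
add TokenSpanWriter SubDirectory=token_spans
```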
Writes XMI files with full representation of input text and all extracted information per View.
Source class: FileTreeViewXmiWriter
Source package: org.apache.ctakes.examples.cc
Parent class: org.apache.ctakes.core.cc.AbstractJCasFileWriter
Dependencies: Document Id
Usables: Document Id Prefix
Parameter | Description | Class | Required | Default |
---|---|---|---|---|
OutputDirectory | Directory for all output files. | File | Yes | |
SubDirectory | SubDirectory for files. | String | No | |
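
A minimal piper sketch; the subdirectory name is a placeholder and OutputDirectory normally comes from -o:

```
// Hypothetical usage sketch: write one XMI file per document, covering every View, under an "xmi_views" subdirectory.
package org.apache.ctakes.examples.cc
add FileTreeViewXmiWriter SubDirectory=xmi_views
```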
Pipeline that has a lot of annotation engines and writes files in FHIR format.
```
// Pipeline that has a lot of annotation engines and writes files in FHIR format.
load SectionedRelationTemporalPipeline
package org.apache.ctakes.fhir.cc
set WriteNlpFhir=true
add FhirJsonFileWriter
```
Demo pipeline for case-sensitive dictionary lookup with custom reader.
```
// Demo pipeline for case-sensitive dictionary lookup with custom reader.
// Run using -p ApacheConDemo -i org/apache/ctakes/examples/notes/apache_con -o (your output directory)
// Use our simple extension of AbstractFileTreeReader.
reader ApacheConDemoReader
// Load a simple token processing pipeline from another pipeline file.
load DefaultTokenizerPipeline
// Add non-core annotators.
add ContextDependentTokenizerAnnotator
addDescription POSTagger
// Add Chunkers.
load ChunkerSubPipe
//load DictionarySubPipe
// Use the new case sensitive dictionary lookup.
load cased_2020aa
add CasedAnnotationFinder
// Add Cleartk Entity Attribute annotators.
load AttributeCleartkSubPipe
// Html output
add html.HtmlTextWriter SubDirectory=HTML
// Use our simple extension of AbstractJCasFileWriter.
//add ApacheConDemoWriter
// Log run time stats and completion.
add util.log.FinishedLogger
```
Demo pipeline with a simple custom reader, custom annotation engine and custom writer.
```
// Demo pipeline with a simple custom reader, custom annotation engine and custom writer.
// Run using command line parameters
// -p ApacheConDemoBasic
// -i org/apache/ctakes/examples/notes/apache_con/Patient123
// -o (your output directory)
// Use our simple extension of AbstractFileTreeReader.
reader ApacheConDemoReader
// Add our simple regex engine to the pipeline.
// By default finds "biopsy".
add ApacheConDemoEngine
// Add our simple regex engine to the pipeline.
// Find Imaging mentions.
add ApacheConDemoEngine REGEX_CUI=AC456 REGEX="diagnostic imaging|MRI"
// Use our simple extension of AbstractJCasFileWriter.
add ApacheConDemoWriter
add FileTreeXmiWriter SubDirectory=XMI
```
Demo pipeline with coreference resolution and a custom writer.
```
// Demo pipeline with coreference resolution and a custom writer.
// Use the standard tokenizer pipeline:
load DefaultTokenizerPipeline
// Always need these ...
add ContextDependentTokenizerAnnotator
add POSTagger
// Chunkers
load ChunkerSubPipe
// Default fast dictionary lookup
//set minimumSpan=2
load DictionarySubPipe
// Cleartk Entity Attributes (negation, uncertainty, etc.)
load AttributeCleartkSubPipe
// Location.
//add LocationOfRelationExtractorAnnotator classifierJarPath=/org/apache/ctakes/relationextractor/models/location_of/model.jar
// Temporal (event, time, dtr, tlink)
//load TemporalSubPipe
// Coreferences (e.g. patient = he)
load CorefSubPipe
add ApacheConCorefWriter
```
Demo pipeline with a custom writer.
```
// Demo pipeline with a custom writer.
// Annotate sections by known regex
//add BsvRegexSectionizer
add SimpleSegmentAnnotator
//add ParagraphAnnotator
//add SentenceDetector
add SentenceDetectorAnnotatorBIO classifierJarPath=/org/apache/ctakes/core/models/sentdetect/model.jar
// Save simple information
add ApacheConSentenceWriter
```
Piper file that sets parameters for a pipeline but does not add components.
```
// Piper file that sets parameters for a pipeline but does not add components.
// Parameters for AssertionAnalysisEngine and ConceptConverterAnalysisEngine
set assertionModelResource=file:org/apache/ctakes/assertion/models/i2b2.model
set scopeModelResource=file:org/apache/ctakes/assertion/models/scope.model
set cueModelResource=file:org/apache/ctakes/assertion/models/cue.model
set enabledFeaturesResource=file:org/apache/ctakes/assertion/models/featureFile11b
set posModelResource=file:org/apache/ctakes/assertion/models/pos.model
```
An example pipeline that does a lot of stuff.
```
// An example pipeline that does a lot of stuff.
// Write big "Welcome", "Starting", "Finished" Banners in log.
set WriteBanner=yes
// Advanced Tokenization: Regex sectionization, BIO Sentence Detector (lumper), Paragraphs, Lists.
load FullTokenizerPipeline
// OR use the standard tokenizer pipeline:
//load DefaultTokenizerPipeline
// Refined tokens, Parts of Speech.
add ContextDependentTokenizerAnnotator
add POSTagger
// Chunkers
load ChunkerSubPipe
// Default fast dictionary lookup.
set minimumSpan=2
load DictionarySubPipe
// Cleartk Entity Attributes (negation, uncertainty, etc.).
load AttributeCleartkSubPipe
// Entity Relations (degree/severity, anatomical location).
load RelationSubPipe
// Temporal (event, time, dtr, tlink).
load TemporalSubPipe
// Coreferences (e.g. patient = he).
load CorefSubPipe
// Token covered text and token span offsets. Write bsv (default) and html styles.
add TokenTableFileWriter SubDirectory=bsv_tokens
add TokenTableFileWriter SubDirectory=html_tokens TableType=HTML
// Html output, write to subdirectory.
add pretty.html.HtmlTextWriter SubDirectory=html
// Text output, write to subdirectory.
add pretty.plaintext.PrettyTextWriterFit SubDirectory=text
// Primitive FHIR output, write to subdirectory.
add org.apache.ctakes.fhir.cc.FhirJsonFileWriter WriteNlpFhir=true SubDirectory=fhir
// Table output, write to subdirectory. Write bsv (default), csv and html styles.
add SemanticTableFileWriter SubDirectory=bsv_table
add SemanticTableFileWriter SubDirectory=csv_table TableType=CSV
add SemanticTableFileWriter SubDirectory=html_table TableType=HTML
// XMI output, write to subdirectory. Warning: these can be very large.
add FileTreeXmiWriter SubDirectory=xmi
// Temporal Events and Times in Anafora format, write to subdirectory.
add EventTimeAnaforaWriter SubDirectory=anafora
// Write some information about the run.
addLast org.apache.ctakes.core.util.log.FinishedLogger
```
Demo pipeline for case-sensitive dictionary lookup.
```
// Demo pipeline for case-sensitive dictionary lookup.
// Load a simple token processing pipeline from another pipeline file
load DefaultTokenizerPipeline
// Add non-core annotators
addDescription POSTagger
// New case-sensitive dictionary lookup
load cased_2020aa_2
add CasedAnnotationFinder
// Simple writer for Demo
add ApacheConAnnotationWriter
```
Pipeline that simply reads a FHIR json file and writes a marked-up html file.
```
// Pipeline that simply reads a FHIR json file and writes a marked-up html file.
package org.apache.ctakes.fhir.cr
reader FhirJsonFileReader
add pretty.html.HtmlTextWriter
```
Commands and parameters to run the ctakes-examples "Hello World" pipeline.
```
// Commands and parameters to run the ctakes-examples "Hello World" pipeline.
// Load a simple token processing pipeline from another pipeline file
load DefaultTokenizerPipeline.piper
// Add non-core annotators
add ContextDependentTokenizerAnnotator
// The POSTagger has a -complex- startup, but it can create its own description to handle it
addDescription POSTagger
// Add the simple Hello World Annotator
add org.apache.ctakes.examples.ae.ExampleHelloWorldAnnotator
```
Commands and parameters to run the "Hello World" pipeline with Entity Property output.
```
// Commands and parameters to run the "Hello World" pipeline with Entity Property output.
// Load a simple token processing pipeline from another file
load org/apache/ctakes/examples/pipeline/HelloWorld.piper
// Assertion engines require dependencies
addDescription ClearNLPDependencyParserAE
// Add the Semantic Role Labeler parser for use by assertion
add ClearNLPSemanticRoleLabelerAE
// Use the assertion mini pipeline
// Load parameters used by the following engines
load org/apache/ctakes/examples/pipeline/AssertionDefaults.piper
// The engines ...
add medfacts.AssertionAnalysisEngine
add medfacts.ConceptConverterAnalysisEngine
add attributes.SubjectAttributeAnalysisEngine
add attributes.GenericAttributeAnalysisEngine
// Collect discovered Entity information for post-run access
collectEntities
```
Commands and parameters to run the "Hello World" pipeline with UMLS Concept Unique Identifiers (CUI) output.
```
// Commands and parameters to run the "Hello World" pipeline with UMLS Concept Unique Identifiers (CUI) output.
// Load a simple token processing pipeline from another pipeline file
load org/apache/ctakes/core/pipeline/DefaultTokenizerPipeline.piper
// Add non-core annotators
add ContextDependentTokenizerAnnotator
// The POSTagger has a -complex- startup, but it can create its own description to handle it
addDescription POSTagger
// Change the umls and password parameters below
set ctakes.umlsuser=CHANGE_ME ctakes.umlspw=CHANGE_ME
// Default fast dictionary lookup
load DictionarySubPipe.piper
// Collect discovered UMLS Concept Unique Identifiers (CUI) for post-run information
collectCuis
```
Commands and parameters to run the "Hello World" pipeline with Entity Property output.
```
// Commands and parameters to run the "Hello World" pipeline with Entity Property output.
// Load a simple token processing pipeline from another file
load org/apache/ctakes/examples/pipeline/HelloWorld.piper
// Add Named Entity Context Entity Attribute annotators
load NeContextsSubPipe.piper
// Collect discovered Entity information for post-run access
collectEntities
```
Commands and parameters to run the "Hello World" pipeline with Entity Property output.
```
// Commands and parameters to run the "Hello World" pipeline with Entity Property output.
// Load a simple token processing pipeline from another file
load org/apache/ctakes/examples/pipeline/HelloWorld.piper
// Add Cleartk Entity Attribute annotators
load AttributeCleartkSubPipe.piper
// Collect discovered Entity information for post-run access
collectEntities
```
An example piper file that will spin up a complete pbj pipeline.
```
// An example piper file that will spin up a complete PBJ pipeline.
//
// This piper will start the Apache Artemis broker pointed to by the -a parameter on the command line.
// It will pause for 5 seconds to allow Artemis to fully launch.
//
// This piper will then launch another instance of Apache cTAKES.
// That instance of cTAKES will run the third and final bit of the entire PBJ pipeline.
//
// This piper will then launch a python PBJ bit of the entire pipeline.
//
set SetJavaHome=no
//
// To run this pipeline from the command line, use the parameters:
// -p SentencePrinter
// -v {python environment Directory}
// -a {Artemis Broker Directory}
// -i {Input Document Directory}
// -o {Output Directory}
//
// A standard command-line option specifies whether or not to pip install the ctakes-pbj package.
// By default ctakes-pbj is pip installed at the beginning of a run. You can turn this off with:
// --pipPbj no
//
// Set up required parameters, start your Artemis Broker, pip install the PBJ project.
load PbjStarter
//
// Start another instance of cTAKES, running the pipeline in StartAllExample_end.piper
// $OutputDirectory will be replaced with this cTAKES pipeline's value for OutputDirectory.
// $ArtemisBroker will be replaced with this cTAKES pipeline's value for ArtemisBroker.
//
add CtakesRunner Pipeline="-p PbjThirdStep -o $OutputDirectory -a $ArtemisBroker"
//
// Start the python bit of the full pipeline.
//
// Declare the python pipeline defining the second step in the total pipeline.
set PbjSecondStep=ctakes_pbj.examples.sentence_printer_pipeline
// The receive and send queue names must be specified.
// --receive_queue and -rq are equivalent, as are --send_queue and -sq
add PythonRunner Command="-m $PbjSecondStep --receive_queue JavaToPy --send_queue PyToJava"
//
// The pipeline run by this instance of cTAKES.
//
// Load a simple token processing pipeline from another pipeline file
load DefaultTokenizerPipeline
// Send CAS to Artemis at the specified queue. Send stop signal when processing has finished.
add PbjJmsSender SendQueue=JavaToPy SendStop=yes
//add PbjStompSender SendQueue=JavaToPy SendStop=yes
```
This piper file just listens to a queue and saves CAS information to output files.
```
// This piper file just listens to a queue and saves CAS information to output files.
// Get the CAS from Artemis.
reader PbjReceiver ReceiveQueue=PyToJava
// Save a nice table.
add SemanticTableFileWriter SubDirectory=table
// Save HTML.
add pretty.html.HtmlTextWriter SubDirectory=html
// Save marked text.
add pretty.plaintext.PrettyTextWriterFit SubDirectory=text
// Perform steps to stop the PBJ pipeline.
load PbjStopper
```
This is an example piper file that will spin up a complete pbj pipeline.
```
// This is an example piper file that will spin up a complete PBJ pipeline.
//
// This piper will start the Apache Artemis broker pointed to by the -a parameter on the command line.
// It will pause for 5 seconds to allow Artemis to fully launch.
//
// This piper will then launch another instance of Apache cTAKES.
// That instance of cTAKES will run the third and final bit of the entire PBJ pipeline.
//
// This piper will then launch a python PBJ bit of the entire pipeline.
//
set SetJavaHome=no
//
// To run this pipeline from the command line, use the parameters:
// -p WordFinder
// -v {python environment Directory}
// -a {Artemis Broker Directory}
// -i {Input Document Directory}
// -o {Output Directory}
//
// A standard command-line option specifies whether or not to pip install the ctakes-pbj package.
// By default ctakes-pbj is pip installed at the beginning of a run. You can turn this off with:
// --pipPbj no
//
// Set up required parameters, start your Artemis Broker, pip install the PBJ project.
load PbjStarter
//
// Start another instance of cTAKES, running the pipeline in StartAllExample_end.piper
// $OutputDirectory will be replaced with this cTAKES pipeline's value for OutputDirectory.
// $ArtemisBroker will be replaced with this cTAKES pipeline's value for ArtemisBroker.
//
add CtakesRunner Pipeline="-p PbjThirdStep -o $OutputDirectory -a $ArtemisBroker"
//
// Start the python bit of the full pipeline.
//
// Declare the python pipeline defining the second step in the total pipeline.
set PbjSecondStep=ctakes_pbj.examples.word_finder_pipeline
// The receive and send queue names must be specified.
// --receive_queue and -rq are equivalent, as are --send_queue and -sq
add PythonRunner Command="-m $PbjSecondStep --receive_queue JavaToPy --send_queue PyToJava"
//
// The pipeline run by this instance of cTAKES.
//
// Load a simple token processing pipeline from another pipeline file
load DefaultTokenizerPipeline
// Send CAS to Artemis at the specified queue. Send stop signal when processing has finished.
add PbjJmsSender SendQueue=JavaToPy SendStop=yes
```
This is an example piper file that will spin up a complete pbj pipeline.
```
// This is an example piper file that will spin up a complete PBJ pipeline.
//
// This piper will start the Apache Artemis broker pointed to by the -a parameter on the command line.
// It will pause for 5 seconds to allow Artemis to fully launch.
//
// This piper will then launch a python PBJ bit of the entire pipeline.
//
set SetJavaHome=no
//
// To run this pipeline from the command line, use the parameters:
// -p WordFinder
// -v {python environment Directory}
// -a {Artemis Broker Directory}
// -i {Input Document Directory}
// -o {Output Directory}
//
// A standard command-line option specifies whether or not to pip install the ctakes-pbj package.
// By default ctakes-pbj is pip installed at the beginning of a run. You can turn this off with:
// --pipPbj no
//
// Set up required parameters, start your Artemis Broker, pip install the PBJ project.
load PbjStarter
//
// Start the python bit of the full pipeline.
//
// Declare the python pipeline defining the second step in the total pipeline.
set PbjSecondStep=ctakes_pbj.examples.word_finder_pipeline
// The receive and send queue names must be specified.
// --receive_queue and -rq are equivalent, as are --send_queue and -sq
add PythonRunner Command="-m $PbjSecondStep --receive_queue JavaToPy --send_queue PyToJava"
//
// The pipeline run by this instance of cTAKES. It includes a PBJ sender and receiver.
//
// Load a simple token processing pipeline from another pipeline file
load DefaultTokenizerPipeline
// Send CAS to Artemis at the specified queue. Send stop signal when processing has finished.
add PbjJmsSender SendQueue=JavaToPy SendStop=yes
// At this point the python process should handle the CAS before sending it "back".
// Receive CAS from Artemis at the specified queue.
add PbjReceiverAE ReceiveQueue=PyToJava
// Save a nice table.
add SemanticTableFileWriter SubDirectory=table
// Save HTML.
add pretty.html.HtmlTextWriter SubDirectory=html
// Save marked text.
add pretty.plaintext.PrettyTextWriterFit SubDirectory=text
// Stop the Artemis Broker.
add ArtemisStopper
```
Commands and parameters to run the "Hello World" pipeline with Entity Property output.
```
// Commands and parameters to run the "Hello World" pipeline with Entity Property output.
readFiles org/apache/ctakes/examples/notes
// Load a simple token processing pipeline from another pipeline file
load DefaultTokenizerPipeline.piper
// Add non-core annotators
add ContextDependentTokenizerAnnotator
// The POSTagger has a -complex- startup, but it can create its own description to handle it
addDescription POSTagger
//addDescription LvgAnnotator
addDescription ThreadSafeLvg
// Default fast dictionary lookup
load DictionarySubPipe.piper
// Add Named Entity Context Entity Attribute annotators
load NeContextsSubPipe.piper
// Collect discovered Entity information for post-run access
collectEntities
```
Pipeline for: sections, paragraphs, sentences, lists, entities and attributes, relations, temporal info, coreferences.
```
// Pipeline for: sections, paragraphs, sentences, lists, entities and attributes, relations, temporal info, coreferences.
// Set the thread count.
threads 3
// Advanced Tokenization: Regex sectionization, BIO Sentence Detector (lumper), Paragraphs, Lists
load TsFullTokenizerPipeline
// Always need these ...
add ContextDependentTokenizerAnnotator
add concurrent.ThreadSafePosTagger
// Chunkers
load TsChunkerSubPipe
// Default fast dictionary lookup
set minimumSpan=2
load TsDictionarySubPipe
// Cleartk Entity Attributes (negation, uncertainty, etc.)
load TsAttributeCleartkSubPipe
// Entity Relations (degree/severity, anatomical location)
load TsRelationSubPipe
// Temporal (event, time, dtr, tlink)
load TsTemporalSubPipe
// Coreferences (e.g. patient = he)
load TsCorefSubPipe
// Html output
add pretty.html.HtmlTextWriter
```
Pipeline with an XMI reader component to feed ctakes XMI files as input instead of plain text files.
```
// Pipeline with an XMI reader component to feed ctakes XMI files as input instead of plain text files.
// Read XMI files, using -i from the command line to specify the input location.
reader XmiTreeReader
// Write html
add html.HtmlTextWriter SubDirectory=HTML
// Write -marked- plaintext
add pretty.plaintext.PrettyTextWriterFit SubDirectory=TEXT
// Write property list
//add property.plaintext.PropertyTextWriterFit
// Writes a list of Semantic information about discovered annotations to files.
add SemanticTableFileWriter SubDirectory=TUI
// Announce completion
addLast util.log.FinishedLogger
```