ctakes pbj - apache/ctakes GitHub Wiki
A Python Bridge to Java (PBJ).
Problem Statement
Solutions start with identifying the problem. Ours is the lack of a standardized path for moving information from cTAKES to a Python program and back again. That ability matters because most modern machine learning is done in Python.
Solution
The information that we want to move is stored in an object called a CAS (Common Analysis System). All objects within the CAS are of a Type defined in an extensible Type System. For instance, a discovered instance of "cancer" is stored in the CAS as an object of Type "DiseaseDisorderMention".
The next step was to choose a delivery method for that information. We wanted something that could handle multiple sub-pipelines, run sub-pipelines in parallel, and be fast, reusable, and easy to use.
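
To make the CAS and Type System description above concrete, here is a minimal sketch of typed annotations over a document text. It is a toy stand-in written for illustration only: `ToyCas` and `Annotation` are hypothetical names, not the UIMA or dkpro-cassis API, and the sample sentence is invented.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """A typed span over the document text (toy stand-in for a CAS object)."""
    type_name: str
    begin: int
    end: int


@dataclass
class ToyCas:
    """Document text plus typed annotations. Hypothetical, not a real CAS."""
    text: str
    annotations: list = field(default_factory=list)

    def add(self, type_name: str, covered: str) -> Annotation:
        # Locate the first occurrence of the covered text and record its span.
        begin = self.text.index(covered)
        annotation = Annotation(type_name, begin, begin + len(covered))
        self.annotations.append(annotation)
        return annotation

    def select(self, type_name: str) -> list:
        # Return every annotation of the requested Type.
        return [a for a in self.annotations if a.type_name == type_name]

    def covered_text(self, annotation: Annotation) -> str:
        # Slice the document text with the annotation's offsets.
        return self.text[annotation.begin:annotation.end]


cas = ToyCas("Patient has a history of cancer.")
cas.add("DiseaseDisorderMention", "cancer")
mentions = cas.select("DiseaseDisorderMention")
print([cas.covered_text(m) for m in mentions])
```

The point of the sketch is that a consumer asks for objects by Type and recovers the text from offsets, which is the kind of information a Python program needs to receive from cTAKES.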
The Apache ActiveMQ message broker combined with dkpro-cassis emerged as the best fit for our problem, meeting the requirements above and more.
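
The broker-based design can be pictured as one pipeline placing serialized documents on a named queue and another pipeline reading them until it is told to stop. The sketch below is only an in-process illustration under stated assumptions: Python's `queue.Queue` stands in for the broker, the `STOP` marker is hypothetical (PBJ's actual stop message is not described here), and the document strings are placeholders.

```python
import queue
import threading

# Hypothetical stop marker; the real PBJ stop message is not shown in this page.
STOP = "<<STOP>>"


def sender(out_queue, documents, send_stop=True):
    # Put each serialized document on the queue, then optionally signal a stop.
    for doc in documents:
        out_queue.put(doc)
    if send_stop:
        out_queue.put(STOP)


def receiver(in_queue, handle, accept_stop=True):
    # Process messages until a stop marker arrives and this receiver accepts it.
    while True:
        message = in_queue.get()
        if message == STOP:
            if accept_stop:
                break
            continue
        handle(message)


java_to_py = queue.Queue()  # Stand-in for a broker queue.
received = []
worker = threading.Thread(target=receiver, args=(java_to_py, received.append))
worker.start()
sender(java_to_py, ["note-1 as XMI", "note-2 as XMI"])
worker.join()
print(received)
```

Because the two sides share only the queue, either side can be replaced or run in parallel with others, which is the property we were looking for.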
How it Works
Other Configurations
Annotation Engines
Utilities
Piper Files
Annotation Engine wrapper for the PbjReceiver.
Source class: PbjReceiverAE
Source package: org.apache.ctakes.pbj.ae
Parent class: org.apache.uima.fit.component.JCasAnnotator_ImplBase
Parameter | Description | Class | Required | Default |
---|---|---|---|---|
ReceiveQueue | The Artemis Queue from which this pipeline receives information. | String | Yes | |
AcceptStop | Yes to shut down when this pipeline receives a stop signal. | String | No | yes |
ReceiveHost | The Artemis Host from which this pipeline receives information. | String | No | localhost |
ReceiveName | Your Artemis Username. | String | No | guest |
ReceivePass | Your Artemis Password. | String | No | guest |
ReceivePort | The Artemis Port from which this pipeline receives information. | int | No | |
Starts an Apache Artemis broker.
Source class: ArtemisStarter
Source package: org.apache.ctakes.pbj.ae
Parent class: org.apache.ctakes.pbj.util.ArtemisController
Parameter | Description | Class | Required | Default |
---|---|---|---|---|
ArtemisBroker | Your Artemis broker's root directory. | String | Yes | |
OutputDirectory | Directory for all output files. | File | Yes | |
LogFile | File to which cTAKES output should be sent. | String | No | |
Pause | Pause for some seconds. Default is 0 | int | No | |
Wait | Wait for the process to finish. Default is no. | String | No | no |
Stops an Apache Artemis broker.
Source class: ArtemisStopper
Source package: org.apache.ctakes.pbj.ae
Parent class: org.apache.ctakes.pbj.util.ArtemisController
Parameter | Description | Class | Required | Default |
---|---|---|---|---|
ArtemisBroker | Your Artemis broker's root directory. | String | Yes | |
OutputDirectory | Directory for all output files. | File | Yes | |
LogFile | File to which cTAKES output should be sent. | String | No | |
Pause | Pause for some seconds. Default is 0 | int | No | |
Wait | Wait for the process to finish. Default is no. | String | No | no |
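
ArtemisStarter and ArtemisStopper both take the broker's root directory as `ArtemisBroker`. How they invoke the broker internally is not stated here; the sketch below assumes the standard `bin/artemis run` and `bin/artemis stop` launcher commands of an Artemis broker instance, and only builds the command lists. The function name and the directory are hypothetical.

```python
from pathlib import PurePosixPath


def artemis_command(broker_dir: str, action: str) -> list:
    """Build the command list for a broker instance's launcher script."""
    if action not in ("run", "stop"):
        raise ValueError(f"unsupported action: {action}")
    # A broker instance keeps its launcher script under <broker>/bin/artemis.
    launcher = PurePosixPath(broker_dir) / "bin" / "artemis"
    return [str(launcher), action]


# "/opt/brokers/pbj" is a hypothetical broker instance directory.
print(artemis_command("/opt/brokers/pbj", "run"))
print(artemis_command("/opt/brokers/pbj", "stop"))
```

A caller could hand the resulting list to `subprocess.Popen` and then sleep for the `Pause` interval before continuing, which mirrors the Pause and Wait parameters in the tables above.
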
Sends the JCas to an Artemis Queue using JMS.
Source class: PbjJmsSender
Source package: org.apache.ctakes.pbj.ae
Parent class: org.apache.ctakes.pbj.ae.PbjSender
Parameter | Description | Class | Required | Default |
---|---|---|---|---|
SendQueue | The Artemis Queue to which this pipeline sends information. | String | Yes | |
QueueSize | The size of the message queue. Default is 5 messages. | int | No | |
SendHost | The Artemis Host to which this pipeline sends information. | String | No | localhost |
SendName | Your Artemis Username. | String | No | guest |
SendPass | Your Artemis Password. | String | No | guest |
SendPort | The Artemis Port to which this pipeline sends information. | int | No | |
SendStop | Yes to send a stop signal to receiving pipelines. | String | No | yes |
Runs pip to install the PBJ Python code when the user requests it.
Source class: PbjPipper
Source package: org.apache.ctakes.pbj.ae
Parent class: org.apache.ctakes.core.ae.PythonRunner
Parameter | Description | Class | Required | Default |
---|---|---|---|---|
OutputDirectory | Directory for all output files. | File | Yes | |
Command | A full command line to be executed. Make sure to quote. | String | No | |
CommandDir | The Command Executable's directory. | String | No | |
Log | A name for the streaming logger. Default is the Command. | String | No | |
LogFile | File to which cTAKES output should be sent. | String | No | |
Pause | Pause for some seconds. Default is 0 | int | No | |
PerDoc | yes to run the command once per document. Default is no. | String | No | no |
PipPbj | pip or do not pip PBJ python code. Default is yes. | String | No | yes |
VirtualEnv | Path to Python virtual environment. | String | No | |
Wait | Wait for the process to finish. Default is no. | String | No | no |
WorkingDir | The working directory. | String | No | |
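
The `VirtualEnv` and `PipPbj` parameters suggest a pip install that runs inside a chosen Python environment unless it is switched off. The sketch below shows one way such a command could be assembled; it is not PbjPipper's implementation. The function names, the environment path, and the `package` argument are hypothetical, and the only fixed fact used is the standard virtual environment layout (`bin/python` on POSIX, `Scripts\python.exe` on Windows).

```python
from pathlib import PurePosixPath, PureWindowsPath


def venv_python(virtual_env: str, windows: bool = False) -> str:
    """Return the interpreter path inside a virtual environment."""
    if windows:
        return str(PureWindowsPath(virtual_env) / "Scripts" / "python.exe")
    return str(PurePosixPath(virtual_env) / "bin" / "python")


def pip_command(virtual_env: str, package: str, pip_pbj: str = "yes"):
    """Build a 'python -m pip install' command, or None when pip is disabled."""
    if pip_pbj.lower() != "yes":
        return None
    return [venv_python(virtual_env), "-m", "pip", "install", package]


# Hypothetical environment path and package name, for illustration only.
print(pip_command("/home/me/pbj_env", "ctakes-pbj"))
print(pip_command("/home/me/pbj_env", "ctakes-pbj", pip_pbj="no"))
```

Calling the environment's own interpreter with `-m pip` avoids depending on whichever `pip` happens to be first on the `PATH`.
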
Populates JCas based upon XMI content read from an Artemis Queue.
Source class: PbjReceiver
Source package: org.apache.ctakes.pbj.cr
Parent class: org.apache.uima.fit.component.JCasCollectionReader_ImplBase
Parameter | Description | Class | Required | Default |
---|---|---|---|---|
ReceiveQueue | The Artemis Queue from which this pipeline receives information. | String | Yes | |
AcceptStop | Yes to shut down when this pipeline receives a stop signal. | String | No | yes |
ReceiveHost | The Artemis Host from which this pipeline receives information. | String | No | localhost |
ReceiveName | Your Artemis Username. | String | No | guest |
ReceivePass | Your Artemis Password. | String | No | guest |
ReceivePort | The Artemis Port from which this pipeline receives information. | int | No | |
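
The table above can be read as one required parameter plus a set of defaults. As a small illustration, assuming nothing beyond the values listed in the table, the sketch below merges user-supplied parameters over those defaults and rejects a configuration without `ReceiveQueue`. The helper name and the queue and host values are hypothetical; `ReceivePort` is left out of the defaults because the table lists none.

```python
# Defaults copied from the parameter table; ReceivePort has no listed default.
RECEIVER_DEFAULTS = {
    "AcceptStop": "yes",
    "ReceiveHost": "localhost",
    "ReceiveName": "guest",
    "ReceivePass": "guest",
}


def receiver_settings(**params) -> dict:
    """Merge user parameters over the defaults; ReceiveQueue is required."""
    if "ReceiveQueue" not in params:
        raise ValueError("ReceiveQueue is required")
    return {**RECEIVER_DEFAULTS, **params}


# "JavaToPy" and "example.org" are hypothetical values.
settings = receiver_settings(ReceiveQueue="JavaToPy", ReceiveHost="example.org")
print(settings["ReceiveQueue"], settings["ReceiveHost"], settings["ReceiveName"])
```
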
Sends the JCas to an Artemis Queue using STOMP.
Source class: PbjStompSender
Source package: org.apache.ctakes.pbj.ae
Parent class: org.apache.ctakes.pbj.ae.PbjSender
Parameter | Description | Class | Required | Default |
---|---|---|---|---|
SendQueue | The Artemis Queue to which this pipeline sends information. | String | Yes | |
QueueSize | The size of the message queue. Default is 5 messages. | int | No | |
SendHost | The Artemis Host to which this pipeline sends information. | String | No | localhost |
SendName | Your Artemis Username. | String | No | guest |
SendPass | Your Artemis Password. | String | No | guest |
SendPort | The Artemis Port to which this pipeline sends information. | int | No | |
SendStop | Yes to send a stop signal to receiving pipelines. | String | No | yes |
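
For readers unfamiliar with STOMP, the sketch below encodes a single SEND frame as defined by the STOMP specification: a command line, `key:value` headers, a blank line, the body, and a terminating NUL byte. It illustrates the wire format only and is not how PbjStompSender is implemented. The queue name `JavaToPy`, the body, and the `content-type` value are hypothetical, and header-value escaping is omitted for brevity.

```python
def stomp_send_frame(destination: str, body: str) -> bytes:
    """Encode a STOMP SEND frame: command, headers, blank line, body, NUL."""
    payload = body.encode("utf-8")
    headers = [
        f"destination:{destination}",
        "content-type:text/plain",
        # content-length counts the bytes of the body, not its characters.
        f"content-length:{len(payload)}",
    ]
    head = "SEND\n" + "\n".join(headers) + "\n\n"
    return head.encode("utf-8") + payload + b"\x00"


# Hypothetical queue name and a placeholder body.
frame = stomp_send_frame("JavaToPy", "<xmi:XMI/>")
print(frame)
```

STOMP is text-based, which is what makes it easy to speak from Python without a JMS client.
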
This is a piper file that will perform initial steps required for running a ctakes-pbj pipeline.
$\textcolor{gray}{\textsf{// This is a piper file that will perform initial steps required for running a ctakes-pbj pipeline. }}$
$\textcolor{gray}{\textsf{// }}$
$\textcolor{gray}{\textsf{// Add "load PbjStarter" to the beginning of your piper file. }}$
$\textcolor{gray}{\textsf{// }}$
$\textcolor{gray}{\textsf{// This piper will start the Apache Artemis broker pointed to by the -a parameter on the command line. }}$
$\textcolor{gray}{\textsf{// It will pause for 5 seconds to allow artemis to fully launch. }}$
$\textcolor{gray}{\textsf{// }}$
$\textcolor{gray}{\textsf{// This piper will then pip the python package requirements for ctakes-pbj }}$
$\textcolor{gray}{\textsf{// in an environment pointed to by the -v parameter on the command line. }}$
$\textcolor{gray}{\textsf{// }}$
$\textcolor{gray}{\textsf{// To skip the step of running a pip of ctakes-pbj, set --pipPbj to "no" }}$
$\textcolor{gray}{\textsf{// }}$
$\textcolor{gray}{\textsf{// Set the command line parameter -a to accept the directory of the Artemis broker. }}$
$\textcolor{brown}{\textbf{cli}}$ $\textcolor{purple}{\textbf{ArtemisBroker}}$ =$\textcolor{violet}{\textsf{a}}$
$\textcolor{gray}{\textsf{// Set the command line parameter -v to accept the directory of the Python environment. }}$
$\textcolor{brown}{\textbf{cli}}$ $\textcolor{purple}{\textbf{VirtualEnv}}$ =$\textcolor{violet}{\textsf{v}}$
$\textcolor{gray}{\textsf{// Set the command line parameter --pipPbj to 'no' to avoid a pip of pbj at the beginning of the run. }}$
$\textcolor{brown}{\textbf{cli}}$ $\textcolor{purple}{\textbf{PipPbj}}$ =$\textcolor{violet}{\textsf{pipPbj}}$
$\textcolor{gray}{\textsf{// Write nice big banners when ctakes starts and finishes. }}$
$\textcolor{olive}{\textbf{set}}$ $\textcolor{purple}{\textbf{WriteBanner}}$ =$\textcolor{violet}{\textsf{yes}}$
$\textcolor{gray}{\textsf{// }}$
$\textcolor{gray}{\textsf{// Start the Artemis broker and pause 5 seconds. }}$
$\textcolor{gray}{\textsf{// }}$
$\textcolor{gray}{\textsf{// Important: You must create an Artemis Broker before running. }}$
$\textcolor{gray}{\textsf{// See "Creating a Broker Instance" at https://activemq.apache.org/components/artemis/documentation/1.0.0/running-server.html }}$
$\textcolor{gray}{\textsf{// The ArtemisBroker must point to the directory of the broker that you create. }}$
$\textcolor{gray}{\textsf{// }}$
$\textcolor{green}{\textbf{add}}$ ArtemisStarter $\textcolor{purple}{\textbf{Pause}}$ =$\textcolor{violet}{\textsf{5}}$
$\textcolor{gray}{\textsf{// }}$
$\textcolor{gray}{\textsf{// pip the dependency packages in case your environment doesn't have them or needs an update. }}$
$\textcolor{gray}{\textsf{// }}$
$\textcolor{green}{\textbf{add}}$ PbjPipper
$\textcolor{gray}{\textsf{// Add the Finished Logger for some run statistics. }}$
$\textcolor{green}{\textbf{addLast}}$ $\textcolor{blue}{\textsf{util.log.FinishedLogger}}$
$\textcolor{gray}{\textsf{// Force a stop, just in case some external process is trying to stay connected. }}$
$\textcolor{gray}{\textsf{// To disable the forced exit, use "set ForceExit=no" }}$
$\textcolor{green}{\textbf{addLast}}$ ExitForcer $\textcolor{purple}{\textbf{Pause}}$ =$\textcolor{violet}{\textsf{3}}$
This is a piper file that will perform final steps required for stopping a ctakes-pbj pipeline.
$\textcolor{gray}{\textsf{// This is a piper file that will perform final steps required for stopping a ctakes-pbj pipeline. }}$
$\textcolor{gray}{\textsf{// }}$
$\textcolor{gray}{\textsf{// Add "load PbjStopper" to the end of your piper file. }}$
$\textcolor{gray}{\textsf{// }}$
$\textcolor{gray}{\textsf{// This piper will stop the Apache Artemis broker pointed to by the -a parameter on the command line. }}$
$\textcolor{gray}{\textsf{// }}$
$\textcolor{gray}{\textsf{// Set the command line parameter -a to accept the directory of the Artemis installation. }}$
$\textcolor{brown}{\textbf{cli}}$ $\textcolor{purple}{\textbf{ArtemisBroker}}$ =$\textcolor{violet}{\textsf{a}}$
$\textcolor{gray}{\textsf{// Stop the Artemis Broker }}$
$\textcolor{green}{\textbf{add}}$ ArtemisStopper
$\textcolor{gray}{\textsf{// Add the Finished Logger for some run statistics. }}$
$\textcolor{green}{\textbf{add}}$ $\textcolor{blue}{\textsf{util.log.FinishedLogger}}$
$\textcolor{gray}{\textsf{// Force a stop, just in case some external process is trying to stay connected. }}$
$\textcolor{gray}{\textsf{// To disable the forced exit, use "set ForceExit=no" }}$
$\textcolor{green}{\textbf{addLast}}$ ExitForcer $\textcolor{purple}{\textbf{Pause}}$ =$\textcolor{violet}{\textsf{3}}$