Pipeline Diagrams - apache/ctakes GitHub Wiki
Apache cTAKES™ has several pre-configured pipelines, each capable of extracting different types of information from clinical documents.
The flow diagram below shows the processing steps in each included pipeline and how the optional relation, temporal, and coreference sub-pipelines follow them.
An exact list of pipelines and their corresponding piper files can be found here.
Note: None of the included pipelines writes output. To see or store results, add one or more Output Writers to your own extended pipeline.
See Piper Commands for how to add components to piper files, and the list of available output writers for the writers you can add.
Note: This diagram may not display well in some browsers or display themes.
---
config:
theme: neutral
---
flowchart TB
NoWriter1@{ shape: lean-r, label: "No Default Writer" }
NoWriter11@{ shape: lean-r, label: "No Default Writer" }
NoWriter2@{ shape: lean-r, label: "No Default Writer" }
NoWriter3@{ shape: lean-r, label: "No Default Writer" }
NoWriter4@{ shape: lean-r, label: "No Default Writer" }
classDef DashBorder fill: #FFE2E2, stroke: #FF0000, stroke-dasharray: 5 5;
class NoWriter1 DashBorder;
class NoWriter11 DashBorder;
class NoWriter2 DashBorder;
class NoWriter3 DashBorder;
class NoWriter4 DashBorder;
classDef NonDefN fill: #DDF7FF, stroke: #003AFF, stroke-dasharray: 9 1;
Documents@{ shape: docs, label: "Document Files" }
%% DEFAULT CLINICAL PIPELINE
subgraph DEFAULT["`**Default Clinical Pipeline**`"]
Reader(Read Patient Dir Files)
Reader --> SentDetecter[Split Sentences]
SentDetecter --> Tokenizers[Split Tokens]
Tokenizers --> POSTagger[Assign Parts of Speech]
POSTagger --> Chunker[Group Context Chunks]
Chunker --> NER[Find and Normalize Entities]
NER --> PolarityTK[Tag Negated Entities]
PolarityTK --> UncertaintyTK[Tag Uncertainty Entities]
UncertaintyTK --> HistoryTK[Assign Entity History]
HistoryTK --> ConditionTK[Tag Conditional Entities]
ConditionTK --> GenericTK[Tag Generic Entities]
GenericTK --> SubjectTK[Assign Entity Subject]
end
SubjectTK -.-> NoWriter1
%% SECTIONIZING CLINICAL PIPELINE
subgraph SECTIONED["`**Sectionizing Clinical Pipeline**`"]
Reader2(Read Patient Dir Files)
Reader2 --> Sectionize[Split and Normalize Sections]
Sectionize --> Paragraph[Split Paragraphs]
Paragraph --> SentDetecter2[Split Sentences, alt]
SentDetecter2 --> Lister[Split Lists]
Lister --> Tokenizers2[Split Tokens]
Tokenizers2 --> POSTagger2[Assign Parts of Speech]
POSTagger2 --> Chunker2[Group Context Chunks]
Chunker2 --> NER2[Find and Normalize Entities]
NER2 --> PolarityTK2[Tag Negated Entities]
PolarityTK2 --> UncertaintyTK2[Tag Uncertainty Entities]
UncertaintyTK2 --> HistoryTK2[Assign Entity History]
HistoryTK2 --> ConditionTK2[Tag Conditional Entities]
ConditionTK2 --> GenericTK2[Tag Generic Entities]
GenericTK2 --> SubjectTK2[Assign Entity Subject]
end
SubjectTK2 -.-> NoWriter11
%% RELATIONS
subgraph RELATIONS ["`**Relation sub-Pipeline**`"]
DegreeOf[Relate Entity Severities]
DegreeOf --> LocationOf[Relate Entity Anatomic Locations]
end
LocationOf -.-> NoWriter2
%% TEMPORAL
subgraph TEMPORAL ["`**Temporal sub-Pipeline**`"]
Events[Find Temporal Events]
Events --> Timex[Find Times]
Timex --> DTR["Relate Events to Doc Time"]
DTR --> ETLinks["Relate Events to Times"]
ETLinks --> EELinks["Relate Events to Events"]
end
EELinks -.-> NoWriter3
%% COREF
subgraph COREF ["`**Coreferences sub-Pipeline**`"]
Corefs[Identify Coreferent Entities]
end
Corefs -.-> NoWriter4
%% TIE IT ALL TOGETHER
class Sectionize NonDefN;
class Paragraph NonDefN;
class SentDetecter NonDefN;
class SentDetecter2 NonDefN;
class Lister NonDefN;
Documents -.-> DEFAULT
Documents -.-> SECTIONED
RelationsQ{Find Relations?}
TemporalQ{Find Temporal?}
CorefQ{Find Coreferences?}
Stop@{ shape: dbl-circ, label: "Stop" }
SubjectTK ---> RelationsQ
SubjectTK2 ---> RelationsQ
LocationOf --> TemporalQ
EELinks --> CorefQ
RelationsQ -- Yes --> RELATIONS
RelationsQ -- No --> TemporalQ
TemporalQ -- Yes --> TEMPORAL
TemporalQ -- No --> CorefQ
CorefQ -- Yes --> COREF
CorefQ -- No --> Stop
style DEFAULT fill:#CCFFB5
style SECTIONED fill:#B5FFBF
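
The decision flow in the diagram can also be read as a small function. The sketch below is illustrative Python, not cTAKES code: the step labels are copied from the diagram, while `plan_steps` and the list names are hypothetical and exist only for this example. It returns the ordered steps for a chosen base pipeline plus whichever optional sub-pipelines are switched on. As in the diagram, no writer step is included.

```python
# Illustrative model of the flow diagram. Step labels come from the diagram;
# plan_steps and the list names are hypothetical, not part of cTAKES.

# Steps shared by both clinical pipelines, from tokenization onward.
SHARED_TAIL = [
    "Split Tokens",
    "Assign Parts of Speech",
    "Group Context Chunks",
    "Find and Normalize Entities",
    "Tag Negated Entities",
    "Tag Uncertainty Entities",
    "Assign Entity History",
    "Tag Conditional Entities",
    "Tag Generic Entities",
    "Assign Entity Subject",
]

DEFAULT = ["Read Patient Dir Files", "Split Sentences"] + SHARED_TAIL

SECTIONED = [
    "Read Patient Dir Files",
    "Split and Normalize Sections",
    "Split Paragraphs",
    "Split Sentences, alt",
    "Split Lists",
] + SHARED_TAIL

RELATIONS = ["Relate Entity Severities", "Relate Entity Anatomic Locations"]

TEMPORAL = [
    "Find Temporal Events",
    "Find Times",
    "Relate Events to Doc Time",
    "Relate Events to Times",
    "Relate Events to Events",
]

COREF = ["Identify Coreferent Entities"]


def plan_steps(sectioned=False, relations=False, temporal=False, coref=False):
    """Return the ordered step labels for one path through the diagram."""
    steps = list(SECTIONED if sectioned else DEFAULT)
    # The three questions are asked in a fixed order: relations, temporal, coref.
    if relations:
        steps += RELATIONS
    if temporal:
        steps += TEMPORAL
    if coref:
        steps += COREF
    return steps


if __name__ == "__main__":
    for step in plan_steps(sectioned=True, temporal=True):
        print(step)
```

Calling `plan_steps()` with no arguments corresponds to the Default Clinical Pipeline answering "No" to all three questions. Whatever path is taken, the result still ends without a writer, so an output writer has to be added separately.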