Understanding DeepPhe Output - DeepPhe/DeepPhe-Release GitHub Wiki

Output writers can be added to the DeepPhe pipeline through the .piper configuration file. The files they produce can be used to answer questions such as:

  1. What are the relevant biomarkers and mutations identified in ONE pathology report for a specific patient?

  2. What are the relevant biomarkers and mutations identified in ALL pathology reports for a specific patient?

  3. What treatments were identified in the reports?

etc. (more later)

Writers

DpheRelTableWriter

DpheTableWriter

EvalWriter

PatientSummaryXnJsonFileWriter

PatientSummaryXnTableWriter

DpheRelTableWriter

Configuration Options

SubDirectory=REL TableType=HTML

The files contain human-readable information that lists relations between two concepts for each patient.

| filename | section | description |
| --- | --- | --- |
| reportName_rel_table.html | source | The source concept. |
| | relation | The relation between the source and the target concept. |
| | target | The target concept. |
| | confidence | The confidence the system has in the relation. |
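
As a rough illustration of how the relation table could be read programmatically, the sketch below pulls rows out of an HTML table using only the Python standard library. It assumes the file holds a plain `<table>` whose first row carries the column names listed above; the actual markup DeepPhe writes may differ. `TableRowParser`, `read_relations`, and `SAMPLE_HTML` are hypothetical names, and the sample row (including its confidence value) is made up for the example.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, None outside <tr>
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def read_relations(html_text):
    """Return one dict per data row, keyed by the header row's cell text."""
    parser = TableRowParser()
    parser.feed(html_text)
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Made-up sample; the layout of a real reportName_rel_table.html is an assumption.
SAMPLE_HTML = """
<table>
  <tr><th>source</th><th>relation</th><th>target</th><th>confidence</th></tr>
  <tr><td>UnifocalLesion</td><td>hasAssociatedSite</td><td>RenalVein</td><td>0.9</td></tr>
</table>
"""

relations = read_relations(SAMPLE_HTML)
print(relations[0]["relation"])
```

For a real file, the text would be read from the `REL` subdirectory first and passed to `read_relations` unchanged.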

DpheTableWriter

Configuration Options

SubDirectory=TABLE TableType=HTML

The table output file contains a human-readable list of concepts for each report.

| filename | column | description |
| --- | --- | --- |
| reportName_.table.html | DpheGroup | The group of the mention (e.g. Gene, Neoplasm, Tissue, Body Part). |
| | Section | The section in which the concept was found. |
| | Span | The start and end character offsets in the report of the text used to identify the concept (e.g. 40,48). |
| | Negated | Whether the term is negated or not. |
| | Uncertain | Whether the term has modifiers that express uncertainty in the concept. |
| | Generic | ? |
| | URI | The ClassURI of the concept (e.g. VIMGene). |
| | Confidence | The confidence that the system has in correctly identifying the concept. |
| | Document Text | The document text used to identify the concept (e.g. VIMENTIN). |
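
The Span column can be used to locate a concept in the original report. The sketch below is one way to do that, under the assumption that offsets are 0-based and that the end offset is exclusive; the source does not state this, so check it against a real report before relying on it. `parse_span` and `span_text` are hypothetical helper names.

```python
def parse_span(span):
    """Turn a Span cell such as "40,48" into a (start, end) pair of ints."""
    start_text, end_text = span.split(",")
    start, end = int(start_text), int(end_text)
    if start < 0 or end < start:
        raise ValueError(f"invalid span: {span!r}")
    return start, end


def span_text(report_text, span):
    """Slice the report text covered by a span.

    Assumption: offsets are 0-based and the end offset is exclusive.
    """
    start, end = parse_span(span)
    return report_text[start:end]


# Made-up report text: 40 filler characters followed by the concept text.
report = "x" * 40 + "VIMENTIN positive"
print(span_text(report, "40,48"))  # prints "VIMENTIN"
```

Comparing the sliced text with the Document Text column is a quick way to confirm the offset convention for a given DeepPhe version.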

EvalWriter

The eval writer writes single values for attributes that are used to evaluate the pipeline.

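| filename | section | description |
| --- | --- | --- |
| patientId_cancer.csv | Patient_ID | |
| | -record_id | |
| | -topography_major | |
| | laterality | |
| | grade | |
| | stage | |
| | t | |
| | n | |
| | m | |
| | -historic | |
| | location | |

| filename | section | description |
| --- | --- | --- |
| patientId_tumor.csv | Patient_ID | |
| | -record_id | |
| | location | |
| | -topography_major | |
| | -topography_minor | |
| | clockface | |
| | quadrant | |
| | laterality | |
| | -laterality_code | |
| | diagnosis | |
| | histology | |
| | -histologic_type | |
| | cancer_type | |
| | extent | |
| | -behavior | |
| | tumor_type | |
| | -tumor_size | |
| | -tumor_size_procedure | |
| | -calcifications | |
| | ER_ | |
| | -ER_amount | |
| | -ER_procedure | |
| | PR_ | |
| | -PR_amount | |
| | -PR_procedure | |
| | HER2 | |
| | -HER2_amount | |
| | -HER2_procedure | |
| | KI67 | |
| | BRCA1 | |
| | BRCA2 | |
| | ALK | |
| | EGFR | |
| | BRAF | |
| | ROS1 | |
| | PDL1 | |
| | MSI | |
| | KRAS | |
| | PSA | |
| | PSA_EL | |
| | -treatment | |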
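
To pull the biomarker values out of a tumor file, something like the following sketch may help. It assumes the CSV starts with a header row that uses the section names listed above and that fields are comma-separated; neither is confirmed by the source, so the delimiter is left as a parameter. `biomarkers_by_tumor`, `BIOMARKER_COLUMNS`, and the sample row are hypothetical.

```python
import csv
import io

# Biomarker columns taken from the patientId_tumor.csv list above.
BIOMARKER_COLUMNS = ["ER_", "PR_", "HER2", "KI67", "BRCA1", "BRCA2", "ALK",
                     "EGFR", "BRAF", "ROS1", "PDL1", "MSI", "KRAS", "PSA",
                     "PSA_EL"]


def biomarkers_by_tumor(csv_file, delimiter=","):
    """Return (Patient_ID, {biomarker: value}) for each row, skipping empty cells."""
    reader = csv.DictReader(csv_file, delimiter=delimiter)
    results = []
    for row in reader:
        found = {name: row[name].strip()
                 for name in BIOMARKER_COLUMNS
                 if (row.get(name) or "").strip()}
        results.append((row.get("Patient_ID"), found))
    return results


# Made-up sample with only a few of the columns; real files are assumed
# to carry the full header.
SAMPLE_CSV = (
    "Patient_ID,location,ER_,PR_,HER2\n"
    "fake_patient1,Breast,Positive,,Negative\n"
)

tumor_biomarkers = biomarkers_by_tumor(io.StringIO(SAMPLE_CSV))
print(tumor_biomarkers)
```

With a real file, open it with `open(path, newline="")` and pass the file object in place of the `io.StringIO` wrapper.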

PatientSummaryXnTableWriter

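| filename | section | description |
| --- | --- | --- |
| ?_cancer.csv | Patient_ID | |
| | CancerId | |
| | Cancer Class & Confidence | |
| | Topography, major | |
| | Topography, minor | |
| | Laterality | |
| | Lymph Involvement | |
| | Metastatic Site | |
| | Histology | |
| | Grade | |
| | Stage | |
| | T Stage | |
| | N Stage | |
| | M Stage | |
| | Course | |
| | Test Results | |
| | Treatments | |
| | Procedures | |
| | Genes | |
| | Comorbidities | |

| filename | section | description |
| --- | --- | --- |
| ?_tumor.csv | Patient_ID | |
| | CancerId | |
| | Cancer Class & Confidence | |
| | Topography, major | |
| | Topography, minor | |
| | Laterality | |
| | Clockface | |
| | Quadrant | |
| | Grade | |
| | Tissue | |
| | Behavior | |
| | Receptor Status | |
| | Test Results | |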

PatientSummaryXnJsonFileWriter

Configuration Options

SubDirectory=JSON

| filename | section | description |
| --- | --- | --- |
| PatientID.json | id | The ID number of the patient. |
| | name | The name of the patient. |
| PatientID_Cancers.json | tumors | A list of tumors (each identified by a conceptId) and each tumor's attributes (Location, Topography, Laterality, Behavior, etc.). |
| | attributes | A list of the cancer attributes (Grade, Stage, Treatments, Procedures, etc.). |
| | conceptIds | A list of the concepts for the cancer (e.g. the type of cancer). |
| PatientID_Concepts.json | concepts | A list of concepts (e.g. Malignant, Lymph, Invasive) for the patient. For each concept found, the mentionID of the concept, dpheGroup (e.g. Behavior, Body Part, Clinical Test Result), preferredText, CUI, confidence, and other relevant information are recorded. |
| | conceptRelations | A list of how two concepts relate to each other, with an associated relation type. For example: UnifocalLesion (concept 1) hasAssociatedSite (type) RenalVein (concept 2). |
| PatientID_NoteID_Doc.json | id, name, type, date, episode, text | Summary information for an individual report: the id of the report, the name of the report (e.g. pathology), the type (e.g. clinical note), the date, the episode (e.g. Diagnostic), and the full text of the report. |
| | mentions | A list of all mentions in the report. |
| | mentionRelations | A list of all mention relations in the report. |
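
The concepts file lends itself to simple grouping once loaded. The sketch below groups concept names by dpheGroup and drops low-confidence entries. It assumes a top-level `concepts` list whose items use the keys `dpheGroup`, `preferredText`, and `confidence` exactly as spelled in the table above; the real key names, nesting, and confidence scale are not confirmed by the source. `concepts_by_group` and the sample values are hypothetical.

```python
import json
from collections import defaultdict


def concepts_by_group(concepts_doc, min_confidence=0.0):
    """Group preferredText values by dpheGroup, keeping confident concepts only."""
    groups = defaultdict(list)
    for concept in concepts_doc.get("concepts", []):
        # Assumption: confidence is numeric (or a numeric string).
        if float(concept.get("confidence", 0)) >= min_confidence:
            group = concept.get("dpheGroup", "unknown")
            groups[group].append(concept.get("preferredText"))
    return dict(groups)


# Made-up sample; the structure of a real PatientID_Concepts.json is an assumption.
SAMPLE_JSON = """
{
  "concepts": [
    {"dpheGroup": "Behavior", "preferredText": "Malignant", "confidence": 90},
    {"dpheGroup": "Body Part", "preferredText": "Lymph", "confidence": 40},
    {"dpheGroup": "Behavior", "preferredText": "Invasive", "confidence": 75}
  ]
}
"""

grouped = concepts_by_group(json.loads(SAMPLE_JSON), min_confidence=50)
print(grouped)  # prints {'Behavior': ['Malignant', 'Invasive']}
```

For a real patient, load the file from the `JSON` subdirectory with `json.load` and pass the resulting dictionary to `concepts_by_group`.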