Pre and Post coordination Modeling - OHDSI/Vocabulary-v5.0 GitHub Wiki
Introduction
Data standards and sources express the semantics of their entities in different ways. The two approaches most commonly used by ontologies are pre-coordination and post-coordination.
Pre-coordination
When complex clinical expressions that combine two or more concepts, or a concept and its modifiers, occur commonly in clinical settings, it becomes more efficient to create one code that represents the whole expression. This approach codes a complex multiaxial term as a single explicit concept. Naturally, not every real-world event can be represented this way: the challenge of pre-coordination is “terminology explosion”, because each combination of a core concept with modifier codes needs its own code, which produces long lists of concepts.
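
The growth behind the explosion can be illustrated with a small count. The sketch below uses three invented modifier axes (they are not taken from any vocabulary) and compares the number of codes needed when every combination gets its own code with the number needed when each axis value is coded once.

```python
from itertools import product

# Hypothetical modifier axes, chosen only for illustration.
conditions = ["deep vein thrombosis", "fracture", "ulcer"]
lateralities = ["left", "right", "bilateral"]
courses = ["acute", "chronic"]

# Pre-coordination: one code per combination of axis values.
precoordinated = [
    f"{course} {side} {condition}"
    for condition, side, course in product(conditions, lateralities, courses)
]

# Post-coordination: one code per axis value, combined at recording time.
postcoordinated = conditions + lateralities + courses

print(len(precoordinated))   # product of the axis sizes
print(len(postcoordinated))  # sum of the axis sizes
```

The pre-coordinated count is the product of the axis sizes, while the post-coordinated count is their sum, so every added axis widens the gap.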
Post-coordination
Problem statement: A lot of source concepts are complex and heavily pre-coordinated, often because they were originally meant for billing rather than for describing a clinical fact. Common examples are codes from the ICD10 family of vocabularies and procedure codes. If such a source concept combines its individual attributes with an AND combination, one solution is to split it up into multiple concepts and map the source concept to all of them as targets.
This approach codes a complex term as a combination of several facts/concepts. Codes that were “post-coordinated” into a single clinical expression carry considerably more information than the base concept alone. The approach may take several forms:
- a) the creation of pairs of originally independent facts, which occur at the same timepoint
- b) the creation of dependent pairs such as questions and answers (for laboratory test results, questionnaires).
However, this can also create problems for analysis. If all targets are put in one concept set, a query returns not only the records of the actual source concept but also every record that shares any one of the targets, because a concept set is by default treated as an OR combination. Isolating the original source concept requires building a cohort with an AND combination and grouping the targets by timestamp.
Therefore, knowing about these pitfalls is essential for accurate results.
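
The difference between the OR and AND readings can be sketched in a few lines of Python. The concept IDs, persons, and dates below are invented for illustration, and the two helper functions are hypothetical, not part of any OHDSI tool.

```python
from collections import defaultdict

# Hypothetical target concept IDs for one source code split into two facts.
TARGET_A, TARGET_B = 1001, 1002

# (person_id, event_date, concept_id) rows, invented for illustration.
records = [
    (1, "2020-01-05", TARGET_A),
    (1, "2020-01-05", TARGET_B),  # both targets on the same date
    (2, "2020-02-10", TARGET_A),  # only one of the targets
    (3, "2020-03-01", TARGET_A),
    (3, "2021-07-15", TARGET_B),  # both targets, but on different dates
]


def match_any(rows, targets):
    """OR semantics: a person qualifies with any one of the targets."""
    return {person for person, _, concept in rows if concept in targets}


def match_all_same_date(rows, targets):
    """AND semantics: all targets must share a person and a date."""
    seen = defaultdict(set)
    for person, date, concept in rows:
        if concept in targets:
            seen[(person, date)].add(concept)
    return {person for (person, _), found in seen.items() if found == set(targets)}


targets = {TARGET_A, TARGET_B}
or_hits = match_any(records, targets)
and_hits = match_all_same_date(records, targets)
print(sorted(or_hits), sorted(and_hits))
```

With these rows the OR reading returns all three persons, while the AND reading grouped by date returns only person 1, the one whose record corresponds to the original combined source code.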
Source codes with attributes combined as OR statements almost always end up mapped to less specific codes, become standard themselves (e.g. CPT), or are mapped to pre-coordinated concepts.
OMOP CDM Perspective
Both modeling methods are widely used in the OMOP CDM and let the end user define phenotypes of interest in the most relevant way. Pre-coordination is achieved through vocabulary enrichment, while post-coordination is made possible by the CDM structure and the application of ETL rules. The classical CDM tables that can store dependent facts are Measurement and Observation. The systemic (CDM-based) ways to post-coordinate are: populating the FACT_RELATIONSHIP table, or populating modifier_of_event_id (measurement_event_id and observation_event_id) together with the mandatory modifier_of_field_concept_id (meas_event_field_concept_id, obs_event_field_concept_id).
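
The two linking mechanisms can be sketched with plain Python records. The field names follow the CDM columns named above and the FACT_RELATIONSHIP columns; the classes, helper functions, and every 9000xxx concept ID are hypothetical placeholders, since the real field, domain, and relationship concepts have to be looked up in the vocabulary tables.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder IDs (assumptions): real values come from the OMOP vocabularies.
CONDITION_ID_FIELD_CONCEPT = 9000001  # concept for the linked key field
MEASUREMENT_DOMAIN_CONCEPT = 9000002
CONDITION_DOMAIN_CONCEPT = 9000003
RELATIONSHIP_CONCEPT = 9000004


@dataclass
class Measurement:
    measurement_id: int
    measurement_concept_id: int
    measurement_event_id: Optional[int] = None
    meas_event_field_concept_id: Optional[int] = None


@dataclass
class FactRelationship:
    domain_concept_id_1: int
    fact_id_1: int
    domain_concept_id_2: int
    fact_id_2: int
    relationship_concept_id: int


def link_via_event_id(m: Measurement, condition_occurrence_id: int) -> Measurement:
    """Point the measurement at its parent event; both fields are set together."""
    m.measurement_event_id = condition_occurrence_id
    m.meas_event_field_concept_id = CONDITION_ID_FIELD_CONCEPT
    return m


def link_via_fact_relationship(m: Measurement, condition_occurrence_id: int) -> FactRelationship:
    """Express the same link as a separate FACT_RELATIONSHIP row."""
    return FactRelationship(
        domain_concept_id_1=MEASUREMENT_DOMAIN_CONCEPT,
        fact_id_1=m.measurement_id,
        domain_concept_id_2=CONDITION_DOMAIN_CONCEPT,
        fact_id_2=condition_occurrence_id,
        relationship_concept_id=RELATIONSHIP_CONCEPT,
    )


crp = Measurement(measurement_id=1, measurement_concept_id=46234770)
linked = link_via_event_id(crp, condition_occurrence_id=77)
relation = link_via_fact_relationship(crp, condition_occurrence_id=77)
print(linked)
print(relation)
```

In the first variant the link lives on the dependent row itself, so the event ID is only interpretable together with the field concept that says which table and column it refers to. In the second variant the link is a row of its own and neither fact is modified.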
Examples of the Modeling approach
The most frequent cases can be stored in the different ways described in the tables below. They include the representation of laboratory findings, the coverage of historical facts, and the representation of independent simultaneous facts.
Example 1 - Laboratory Finding
Laboratory tests are usually specified as a combination of the sample, analyte, and its status. The status in different ontologies may be reflected as either part of a pre-coordinated entity or a dependent answer for the test.
Table 1 - Laboratory Finding
Source | concept_id | concept_code | concept_name | concept_class_id | standard_concept | invalid_reason | domain_id | vocabulary_id | modeling type | entity type | dependency type |
---|---|---|---|---|---|---|---|---|---|---|---|
CRP elevated | 46234770 | 76485-2 | C reactive protein [Moles/volume] in Serum or Plasma | Lab Test | Standard | Valid | Measurement | LOINC | post-coordination | fact#1 | Q |
CRP elevated | 1620380 | LA32146-5 | Elevated | Answer | Standard | Valid | Meas Value | LOINC | post-coordination | fact#1.1 | A |
CRP elevated | 37108742 | 119971000119104 | Elevated C-reactive protein | Clinical Finding | Standard | Valid | Condition | SNOMED | pre-coordination |
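
As a rough sketch of how the two modeling types from Table 1 might land in CDM tables: the concept IDs come from the table, while the person, the dates, the dictionary layout, and the `is_elevated_crp` helper are assumptions made for illustration.

```python
# Concept IDs are taken from Table 1; all other values are invented.
CRP_TEST = 46234770      # LOINC 76485-2, Measurement domain (question)
ELEVATED = 1620380       # LOINC LA32146-5, Meas Value domain (answer)
ELEVATED_CRP = 37108742  # SNOMED 119971000119104, Condition domain

# Post-coordination: question and answer share one measurement row.
post_coordinated = {
    "table": "measurement",
    "person_id": 1,
    "measurement_concept_id": CRP_TEST,
    "value_as_concept_id": ELEVATED,
    "measurement_date": "2020-01-05",
}

# Pre-coordination: a single concept carries the whole meaning.
pre_coordinated = {
    "table": "condition_occurrence",
    "person_id": 1,
    "condition_concept_id": ELEVATED_CRP,
    "condition_start_date": "2020-01-05",
}


def is_elevated_crp(row):
    """Recognize the finding in either representation (hypothetical helper)."""
    if row["table"] == "measurement":
        return (row["measurement_concept_id"] == CRP_TEST
                and row["value_as_concept_id"] == ELEVATED)
    if row["table"] == "condition_occurrence":
        return row["condition_concept_id"] == ELEVATED_CRP
    return False


print(is_elevated_crp(post_coordinated), is_elevated_crp(pre_coordinated))
```

A phenotype that should capture elevated CRP regardless of how the source was mapped has to check both tables, as the helper does.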
Example 2 - Historical Context
Historical findings are elements that carry details about previously diagnosed conditions or facts about family history, and they are usually dated only by the visit date. If no pre-coordinated entity exists in the standard OMOP vocabularies, a possible solution is to post-coordinate the finding as a question (Q) and an answer (A).
Table 2 - Historical Context
Source | concept_id | concept_code | concept_name | concept_class_id | standard_concept | invalid_reason | domain_id | vocabulary_id | modeling type | entity type | dependency type |
---|---|---|---|---|---|---|---|---|---|---|---|
h/o of MI | 4214956 | 417662000 | History of clinical finding in subject | Context-dependent | Standard | Valid | Observation | SNOMED | post-coordination | fact#1 | Q |
h/o of MI | 4329847 | 22298006 | Myocardial infarction | Clinical Finding | Standard | Valid | Condition | SNOMED | post-coordination | fact#1.1 | A |
h/o of MI | 4163874 | 399211009 | History of myocardial infarction | Context-dependent | Standard | Valid | Observation | SNOMED | pre-coordination |
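
The fallback rule described for historical findings can be sketched as a small mapping function. The concept IDs are those of Table 2; the lookup dictionary, the `map_history` function, the second condition ID, and the use of `None` for an empty `value_as_concept_id` are assumptions made for illustration.

```python
HISTORY_OF_FINDING = 4214956  # SNOMED 417662000, the question concept in Table 2

# Hypothetical lookup: condition concept -> pre-coordinated "history of" concept.
# The single entry (myocardial infarction -> history of MI) comes from Table 2.
PRECOORDINATED_HISTORY = {4329847: 4163874}


def map_history(condition_concept_id, visit_date, lookup=PRECOORDINATED_HISTORY):
    """Return one Observation-style row for a historical finding."""
    pre = lookup.get(condition_concept_id)
    if pre is not None:
        # A pre-coordinated concept exists: it carries the whole meaning.
        return {
            "observation_concept_id": pre,
            "value_as_concept_id": None,
            "observation_date": visit_date,
        }
    # No pre-coordinated concept: store the question with the condition as answer.
    return {
        "observation_concept_id": HISTORY_OF_FINDING,
        "value_as_concept_id": condition_concept_id,
        "observation_date": visit_date,
    }


with_pre = map_history(4329847, "2020-01-05")
# 1234567 is an invented condition concept with no pre-coordinated counterpart.
without_pre = map_history(1234567, "2020-01-05")
print(with_pre)
print(without_pre)
```

Both rows are dated by the visit date, in line with the text above, because the original date of the historical event is usually unknown.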
Example 3 - Complex Facts
Composing several independent facts is a common way to post-coordinate source expressions that are multiaxial to begin with.
Table 3 - Multiaxial concepts representation
Source | concept_id | concept_code | concept_name | concept_class_id | standard_concept | invalid_reason | domain_id | vocabulary_id | modeling type | entity type | dependency type |
---|---|---|---|---|---|---|---|---|---|---|---|
Chronic r. iliac DVT | 762256003 | 16026431000119101 | Thrombosis of iliac vein | Clinical Finding | Standard | Valid | Condition | SNOMED | post-coordination | fact#1 | |
Chronic r. iliac DVT | 762422 | 350341000119102 | Chronic deep venous thrombosis of right lower extremity | Clinical Finding | Standard | Valid | Condition | SNOMED | post-coordination | fact#2 | |
Chronic r. iliac DVT | 35616025 | 293451000119102 | Chronic deep vein thrombosis of right iliac vein | Clinical Finding | Standard | Valid | Condition | SNOMED | pre-coordination |