Pre and Post coordination Modeling - OHDSI/Vocabulary-v5.0 GitHub Wiki

Introduction

The available data standards and sources express the semantics of their entities in different ways. The two most common ways in which ontologies represent semantics are pre-coordination and post-coordination.

Pre-coordination

When complex clinical expressions representing two or more concepts, or a concept and its modifiers, occur commonly in clinical settings, it becomes more efficient to create one code that represents them together. This approach allows a complex multiaxial term to be coded as a single explicit concept. Naturally, not every real-world event can be represented this way: the challenge of pre-coordination is a “terminology explosion”, in which additional codes are created from combinations of core concept and modifier codes, producing long lists of concepts.
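The growth behind the “terminology explosion” can be illustrated with a little arithmetic. The sketch below is only an illustration: the axis values are hypothetical and not taken from any vocabulary. It compares the number of pre-coordinated codes needed to cover every combination of axes with the number of atomic codes that post-coordination would need.

```python
from itertools import product

# Hypothetical axes, chosen only for illustration (not real vocabulary content).
core_concepts = ["thrombosis", "stenosis", "aneurysm"]
chronicity = ["acute", "chronic"]
laterality = ["left", "right", "bilateral"]
site = ["iliac vein", "femoral vein", "popliteal vein"]


def precoordinated_terms(*axes):
    """Return one combined term for every combination of one value per axis."""
    return [" | ".join(combo) for combo in product(*axes)]


terms = precoordinated_terms(core_concepts, chronicity, laterality, site)

# Post-coordination only needs the atomic values themselves.
atomic_count = len(core_concepts) + len(chronicity) + len(laterality) + len(site)

print(len(terms))    # 3 * 2 * 3 * 3 = 54 pre-coordinated codes
print(atomic_count)  # 3 + 2 + 3 + 3 = 11 atomic codes
```

The pre-coordinated count is a product of the axis sizes, while the atomic count is their sum, so every added axis widens the gap.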

Post-coordination

Problem statement: Many source concepts are complex and heavily pre-coordinated, often because they were originally meant for billing purposes rather than for describing a clinical fact. Common examples are codes from the ICD10 family of vocabularies or procedure codes. If such a source concept combines its individual attributes with an AND combination, one solution is to split it up into multiple concepts and map the source concept to all of them as targets.

The approach allows coding the complex terms as a combination of several facts/concepts. The codes that were “post-coordinated” to create a single clinical expression demonstrate significantly greater value than when only the base concept is used. The approach may have several representations:

  • a) the creation of pairs of originally independent facts, which occur at the same timepoint
  • b) the creation of dependent pairs such as questions and answers (for laboratory test results, questionnaires).

However, this can also create problems for analysis. If all targets are put in one concept set, a query returns not only the records of the actual source concept but also every record that shares any one of the targets (because a concept set is by default treated as an OR combination). Finding only the original source records requires building a cohort with an AND combination and grouping the events by timestamp.

Therefore, knowing about these pitfalls is very important for achieving accurate results.
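The difference between the OR and the AND reading can be sketched on toy data. Everything here is assumed for illustration: the concept ids 101 and 102, the event tuples, and the helper names are hypothetical, and the events are grouped by date as a stand-in for the timestamp.

```python
from collections import defaultdict
from datetime import date

# Hypothetical target concepts that one source code was split into.
TARGET_A, TARGET_B = 101, 102
TARGETS = {TARGET_A, TARGET_B}

# Toy event records: (person_id, concept_id, event_date).
events = [
    (1, TARGET_A, date(2020, 1, 5)),  # person 1: both targets on the same day
    (1, TARGET_B, date(2020, 1, 5)),
    (2, TARGET_A, date(2020, 3, 1)),  # person 2: only one of the targets
    (3, TARGET_A, date(2020, 2, 1)),  # person 3: both targets, different days
    (3, TARGET_B, date(2021, 7, 9)),
]


def persons_or(events, targets):
    """Concept-set behaviour: any single target is enough to match."""
    return {person for person, concept, _ in events if concept in targets}


def persons_and_same_date(events, targets):
    """AND behaviour: all targets must occur for the same person on the same date."""
    seen = defaultdict(set)
    for person, concept, day in events:
        if concept in targets:
            seen[(person, day)].add(concept)
    return {person for (person, _), found in seen.items() if found == set(targets)}


print(sorted(persons_or(events, TARGETS)))             # [1, 2, 3]
print(sorted(persons_and_same_date(events, TARGETS)))  # [1]
```

Only person 1 carries the original source concept; the OR reading also returns persons 2 and 3.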

Source codes whose attributes are combined as OR statements almost always end up being mapped to less specific codes, becoming standard themselves (e.g. CPT), or being mapped to pre-coordinated concepts.

OMOP CDM Perspective

Both modeling methods are widely used in the OMOP CDM and let the end user define phenotypes of interest in the most relevant way. Pre-coordination is achieved through vocabulary enrichment, while post-coordination is made possible by the CDM structure and the application of ETL rules. The classical CDM tables that can store dependent facts are Measurement and Observation. The systemic (CDM-based) ways to post-coordinate are populating the FACT_RELATIONSHIP table, or populating modifier_of_event_id (measurement_event_id and observation_event_id) together with the mandatory modifier_of_field_concept_id (meas_event_field_concept_id, obs_event_field_concept_id).
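One way to see why the field concept has to accompany the event id: the id alone does not say which table it points to. The sketch below models rows as plain dictionaries and resolves a dependent measurement back to its parent record. The field concept value, the toy concept ids, the registry, and the helper name are hypothetical placeholders; the real field concept_id should be looked up in the vocabulary release in use.

```python
# Hypothetical placeholder for the concept that identifies the field
# "condition_occurrence.condition_occurrence_id"; NOT a real concept_id.
CONDITION_OCCURRENCE_ID_FIELD = -1

# Toy parent record (concept ids are made up for illustration).
condition_occurrence = [
    {"condition_occurrence_id": 500, "person_id": 1, "condition_concept_id": 111},
]

# Toy dependent record that modifies the parent above.
measurement = [
    {
        "measurement_id": 900,
        "person_id": 1,
        "measurement_concept_id": 222,
        # These two fields are populated together.
        "measurement_event_id": 500,
        "meas_event_field_concept_id": CONDITION_OCCURRENCE_ID_FIELD,
    },
]

# Hypothetical registry: field concept -> (table rows, primary key column).
FIELD_REGISTRY = {
    CONDITION_OCCURRENCE_ID_FIELD: (condition_occurrence, "condition_occurrence_id"),
}


def resolve_parent(row):
    """Follow measurement_event_id to the parent row named by the field concept."""
    rows, key = FIELD_REGISTRY[row["meas_event_field_concept_id"]]
    return next(r for r in rows if r[key] == row["measurement_event_id"])


parent = resolve_parent(measurement[0])
print(parent["condition_occurrence_id"], parent["condition_concept_id"])
```

The same lookup pattern would apply to observation_event_id with obs_event_field_concept_id.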

Examples of the Modeling approach

The most frequent cases can be stored in the different ways shown in the tables below. They include the representation of laboratory findings, the coverage of historical facts, and the representation of independent simultaneous facts.

Example 1 - Laboratory Finding

Laboratory tests are usually specified as a combination of the sample, analyte, and its status. The status in different ontologies may be reflected as either part of a pre-coordinated entity or a dependent answer for the test.

Table 1 - Laboratory Finding

| Source | concept_id | concept_code | concept_name | concept_class_id | standard_concept | invalid_reason | domain_id | vocabulary_id | modeling type | entity type | dependency type |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CRP elevated | 46234770 | 76485-2 | C reactive protein [Moles/volume] in Serum or Plasma | Lab Test | Standard | Valid | Measurement | LOINC | post-coordination | fact#1 | Q |
| CRP elevated | 1620380 | LA32146-5 | Elevated | Answer | Standard | Valid | Meas Value | LOINC | post-coordination | fact#1.1 | A |
| CRP elevated | 37108742 | 119971000119104 | Elevated C-reactive protein | Clinical Finding | Standard | Valid | Condition | SNOMED | pre-coordination | | |
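Because the same finding can arrive in either form, a phenotype definition may need to check both. The sketch below uses the concept_ids from Table 1 and assumes, for illustration only, that the answer (A) is stored in value_as_concept_id of the question (Q) row; the toy rows and the helper name are hypothetical.

```python
CRP_TEST = 46234770      # LOINC 76485-2, the question (Q)
ELEVATED = 1620380       # LOINC LA32146-5, the answer (A)
ELEVATED_CRP = 37108742  # SNOMED 119971000119104, pre-coordinated

# Toy post-coordinated rows: test concept plus an answer concept.
measurement = [
    {"person_id": 1, "measurement_concept_id": CRP_TEST, "value_as_concept_id": ELEVATED},
    {"person_id": 2, "measurement_concept_id": CRP_TEST, "value_as_concept_id": None},
]

# Toy pre-coordinated row: one concept carries the whole meaning.
condition_occurrence = [
    {"person_id": 3, "condition_concept_id": ELEVATED_CRP},
]


def persons_with_elevated_crp(measurement, condition_occurrence):
    """Collect persons from both the post- and the pre-coordinated form."""
    post = {
        m["person_id"]
        for m in measurement
        if m["measurement_concept_id"] == CRP_TEST
        and m["value_as_concept_id"] == ELEVATED
    }
    pre = {
        c["person_id"]
        for c in condition_occurrence
        if c["condition_concept_id"] == ELEVATED_CRP
    }
    return post | pre


print(sorted(persons_with_elevated_crp(measurement, condition_occurrence)))  # [1, 3]
```

Person 2 has a CRP test without the "Elevated" answer and is correctly left out.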

Example 2 - Historical Context

Historical findings are elements that describe previously diagnosed conditions or facts about family history, and they are usually dated only by the visit date. When the standard OMOP vocabularies lack a pre-coordinated entity, a possible solution is to post-coordinate as a question (Q) and an answer (A).

Table 2 - Historical Context

| Source | concept_id | concept_code | concept_name | concept_class_id | standard_concept | invalid_reason | domain_id | vocabulary_id | modeling type | entity type | dependency type |
|---|---|---|---|---|---|---|---|---|---|---|---|
| h/o of MI | 4214956 | 417662000 | History of clinical finding in subject | Context-dependent | Standard | Valid | Observation | SNOMED | post-coordination | fact#1 | Q |
| h/o of MI | 4329847 | 22298006 | Myocardial infarction | Clinical Finding | Standard | Valid | Condition | SNOMED | post-coordination | fact#1.1 | A |
| h/o of MI | 4163874 | 399211009 | History of myocardial infarction | Context-dependent | Standard | Valid | Observation | SNOMED | pre-coordination | | |

Example 3 - Complex Facts

A common post-coordination scenario is to represent a multiaxial source expression as a composition of several independent facts.

Table 3 - Multiaxial concepts representation

| Source | concept_id | concept_code | concept_name | concept_class_id | standard_concept | invalid_reason | domain_id | vocabulary_id | modeling type | entity type | dependency type |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Chronic r. iliac DVT | 762256003 | 16026431000119101 | Thrombosis of iliac vein | Clinical Finding | Standard | Valid | Condition | SNOMED | post-coordination | fact#1 | |
| Chronic r. iliac DVT | 762422 | 350341000119102 | Chronic deep venous thrombosis of right lower extremity | Clinical Finding | Standard | Valid | Condition | SNOMED | post-coordination | fact#2 | |
| Chronic r. iliac DVT | 35616025 | 293451000119102 | Chronic deep vein thrombosis of right iliac vein | Clinical Finding | Standard | Valid | Condition | SNOMED | pre-coordination | | |
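On the ETL side, post-coordinating a multiaxial source term amounts to fanning one source record out into several rows that share the person and the date. The sketch below is a minimal illustration, not a prescribed ETL: the mapping dictionary and the helper name are hypothetical, and the target concept_ids are copied as listed in Table 3.

```python
from datetime import date

# Source term -> post-coordinated targets (fact#1 and fact#2 in Table 3).
SOURCE_TO_TARGETS = {
    "Chronic r. iliac DVT": [762256003, 762422],
}


def expand_source_record(person_id, source_value, start_date, mapping=SOURCE_TO_TARGETS):
    """Create one condition row per target, all sharing person and start date."""
    return [
        {
            "person_id": person_id,
            "condition_concept_id": target,
            "condition_start_date": start_date,
            "condition_source_value": source_value,
        }
        for target in mapping[source_value]
    ]


rows = expand_source_record(1, "Chronic r. iliac DVT", date(2021, 4, 2))
for row in rows:
    print(row["condition_concept_id"], row["condition_start_date"])
```

Keeping the same date and source value on every row is what later allows the facts to be recombined with an AND condition.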