PH terminology and its mapping to Standard vocabulary - OHDSI/ETL--PulmonaryHypertensionRegistries GitHub Wiki

Principles of clinical classification

The course of pulmonary hypertension (PH) leads physicians to underestimate its true prevalence, which results in inadequate disease management, poor control, and inferior outcomes [link]. Disease-specific data collections, such as PH registries, may help caregivers adjust disease management, decrease time to diagnosis, and support treatment decisions in real-world clinical practice.

The most common and explicit terminology used by clinicians in the guidance and management of patients with PH is the World Symposium on Pulmonary Hypertension Clinical Classification (WSCC) [link]. It classifies PH patients into 5 groups according to pathogenesis and practical relevance (Table 1).

Table 1 - Clinical classification of pulmonary hypertension (6th World Symposium on Pulmonary Hypertension, February 2018)

1 PAH
1.1 Idiopathic PAH
1.2 Heritable PAH
1.3 Drug- and toxin-induced PAH
1.4 PAH associated with:
1.4.1 Connective tissue disease
1.4.2 HIV infection
1.4.3 Portal hypertension
1.4.4 Congenital heart disease
1.4.5 Schistosomiasis
1.5 PAH long-term responders to calcium channel blockers
1.6 PAH with overt features of venous/capillaries (PVOD/PCH) involvement
1.7 Persistent PH of the newborn syndrome
2 PH due to left heart disease
2.1 PH due to heart failure with preserved LVEF
2.2 PH due to heart failure with reduced LVEF
2.3 Valvular heart disease
2.4 Congenital/acquired cardiovascular conditions leading to post-capillary PH
3 PH due to lung disease and/or hypoxia
3.1 Obstructive lung disease
3.2 Restrictive lung disease
3.3 Other lung disease with mixed restrictive/obstructive pattern
3.4 Hypoxia without lung disease
3.5 Developmental lung disorders
4 PH due to pulmonary artery obstructions
4.1 Chronic thromboembolic PH
4.2 Other pulmonary artery obstructions
5 PH with unclear and/or multifactorial mechanisms
5.1 Hematologic disorders
5.2 Systemic and metabolic disorders
5.3 Others
5.4 Complex congenital heart disease

Another clinically relevant classification of PH into 5 groups is the World Health Organization (WHO) Classification, which essentially overlaps with the WSCC but is less detailed:

Group 1 - Pulmonary arterial hypertension (PAH)

Group 2 - Pulmonary hypertension due to left-sided heart disease

Group 3 - Pulmonary hypertension due to lung diseases and/or hypoxia

Group 4 - Chronic thromboembolic pulmonary hypertension (CTEPH)

Group 5 - Pulmonary hypertension with unclear or multifactorial etiologies, including hematologic disorders (eg, myeloproliferative disorders), systemic disorders (eg, sarcoidosis, pulmonary Langerhans cell histiocytosis, lymphangioleiomyomatosis, neurofibromatosis, vasculitis), metabolic disorders (eg, glycogen storage disease, Gaucher disease, thyroid disorders), and miscellaneous conditions (eg, tumor obstruction, mediastinal fibrosis, chronic renal failure on dialysis).

WSCC terms cannot be fully mapped to the SNOMED terminology in a 1-to-1 fashion because the two represent diagnoses differently (Appendix 1). However, this does not impose real limitations on studies and may be solved by using post-coordinated expressions in phenotyping or by creating new pre-coordinated Condition concepts.

Recording and differentiating a diagnosis of PH can be complex across data sources. Clinicians often record the diagnosis in a form that is not suitable for subsequent analysis (e.g. free text or other unstructured data), so mapping and harmonization steps are crucial.

Data representation in PH registries

Real-world data (RWD) assets may define the PH diagnosis in several ways:

  • Development of custom, registry-defined dictionaries.
  • Utilization of existing terminologies according to registry protocols.

1. Custom solutions

Due to database design principles, defining the source concept may not be trivial. Sometimes source_code and source_code_description can only be defined as a concatenation of several dependent variables or fields (Table 2).

Table 2 - PH source terms representation examples

| field1 | field2 | field3 | field4 |
| --- | --- | --- | --- |
| PULMONARY ARTERIAL HYPERTENSION | PAH ASSOCIATED WITH... | CONNECTIVE TISSUE DISEASE | SYSTEMIC SCLEROSIS/SCLERODERMA |
| PULMONARY ARTERIAL HYPERTENSION | HERITABLE PAH | BMPR2 | |
| PH | UNCLEAR MULTIFACTORIAL MECHANISMS | HEMATOLOGIC DISORDERS | |
| PULMONARY HYPERTENSION DUE TO LUNG DISEASES AND/OR HYPOXIA | INTERSTITIAL LUNG DISEASE | | |
| PULMONARY HYPERTENSION DUE TO LEFT HEART DISEASE | LEFT VENTRICULAR SYSTOLIC DYSFUNCTION | | |
| PULMONARY ARTERIAL HYPERTENSION | PAH ASSOCIATED WITH.. | PORTAL HYPERTENSION | |
| CHRONIC THROMBOEMBOLIC PULMONARY HYPERTENSION | | | |
| PULMONARY ARTERIAL HYPERTENSION | PULMONARY VENO-OCCLUSIVE DISEASE AND/OR PULMONARY CAPILLARY HEMANGIOMATOSIS | | |
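
As a rough illustration of how such dependent fields might be combined into a single source term, the sketch below joins the field values with a pipe, which is the delimiter visible in the source code descriptions of Table 3. The function name `build_source_description` and the decision to drop only trailing empty fields are assumptions made for this example, not a description of the actual registry ETL.

```python
from typing import Optional, Sequence


def build_source_description(fields: Sequence[Optional[str]], sep: str = "|") -> str:
    """Concatenate dependent registry fields into one source term (hypothetical helper)."""
    # Normalize each value: None becomes an empty string, surrounding whitespace is trimmed.
    cleaned = [(value or "").strip() for value in fields]
    # Drop trailing empty fields only, so gaps between populated fields stay visible.
    while cleaned and cleaned[-1] == "":
        cleaned.pop()
    return sep.join(cleaned)


row = ["PULMONARY ARTERIAL HYPERTENSION", "HERITABLE PAH", "BMPR2", None]
print(build_source_description(row))
# PULMONARY ARTERIAL HYPERTENSION|HERITABLE PAH|BMPR2
```

Keeping empty segments between populated fields preserves the position of each value, so the same concatenated string always points back to the same combination of fields.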

2. Utilization of existing terminologies

The Medical Dictionary for Regulatory Activities (MedDRA), a hierarchically structured and licensed vocabulary, is also used in many sources. Because of the structure of the vocabulary, only a limited number of its codes are actually used in real-world databases (RWD), such as registries.

We analyzed the MedDRA terminology, its utilization, and its cross-vocabulary relationships based on the SNOMED-MedDRA initiative mappings and the mappings built during the current project. The mappings and the usage of MedDRA PH terms in the RWD are shown in Appendix 2.

In our experience, only part of the MedDRA term collection is used in the RWD: about 28% (15/54) of the codes. The outstanding codes are mapped to SNOMED terms, mostly within the Pulmonary Hypertension axis; however, 15% (8/54) of the MedDRA terms fall outside it. Only 11 SNOMED codes were needed to cover 31 MedDRA terms.

These peculiarities are worth noting because they reflect internal discrepancies between classification systems. Such differences make it necessary to build concept sets from specific terms rather than rely on the conflicting hierarchies of different classification systems.
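
One way to read this recommendation is to enumerate the target concepts explicitly instead of deriving them from a hierarchy. The sketch below builds a small concept set from SNOMED codes listed in Appendix 2 and filters records against it. The selection of codes, the record layout, and the names `PH_CONCEPT_SET` and `in_concept_set` are assumptions for illustration only.

```python
# SNOMED codes taken from Appendix 2; the selection itself is illustrative.
PH_CONCEPT_SET = frozenset({
    "70995007",   # Pulmonary hypertension
    "11399002",   # Pulmonary arterial hypertension
    "88223008",   # Secondary pulmonary hypertension
    "233947005",  # Chronic thromboembolic pulmonary hypertension
    "697898008",  # Idiopathic pulmonary arterial hypertension
})


def in_concept_set(records, concept_set=PH_CONCEPT_SET):
    """Keep only records whose SNOMED code is explicitly listed in the concept set."""
    return [r for r in records if r["snomed_code"] in concept_set]


records = [
    {"person_id": 1, "snomed_code": "697898008"},  # Idiopathic PAH: listed
    {"person_id": 2, "snomed_code": "83291003"},   # Cor pulmonale: not listed
]
print(in_concept_set(records))
```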

Data representation in Electronic Health Records (EHR)

The EHR is an alternative data source used in observational research. It mostly uses the International Classification of Diseases (ICD) and its versions.

Currently, a variety of ICD versions, both chronological and regional, are ingested into the OMOP CDM. The most common coding systems for EHR data are ICD10 (ex-U.S. data) and ICD10CM (U.S. data). Examples of codes, their descriptions, and the diagnostic entities related to each code, with the appropriate SNOMED targets, are shown in Appendices 3 and 4.

The ICD terminology is less detailed than SNOMED or the WSCC, but it is still widely used in PH-related data sources.

SNOMED terminology representation

The Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) is the OMOP CDM standard for tracking condition dynamics. The SNOMED terminology has a very explicit set of pre-coordinated concepts covering the PH condition and its subtypes.

At first glance, SNOMED and other PH terminologies, such as ICD, have an almost identical set of concepts and attributes for the diagnosis subtypes. In fact, however, the classification principles and, therefore, the hierarchical relationships between concepts make full matching impossible. Distinct SNOMED codes may be the ancestors of codes that come from distinct ancestry groups in other terminologies. Such discrepancies do not make any of the terminologies improper, but they reflect differences in origin and therefore require special methods of utilization and analysis.

An example of the ancestry of “Pulmonary hypertension due to interstitial lung disease” is shown in Figure 1.

Figure 1 - “Pulmonary hypertension due to interstitial lung disease” concept and its hierarchical representation in SNOMED

Mapping from PH registries to OMOP Standard vocabulary

We found that in some cases PH registries provide detailed information on the PH diagnosis that is not captured in the Standard vocabulary. The SNOMED terminology, being the OMOP Standard, is still an acceptable option for representing the semantics of these source codes. Even if pre-coordinated terms are missing, the information can still be preserved in a post-coordinated way as a 1-to-many mapping (Table 3, row 1).

Only a small number of source codes require new concepts to reflect their full granularity. They share some features, such as the “and/or” discriminator in their names, a well-known naming issue in data standards. We recommend creating the following codes as OMOP Extension terms:

  • Drugs and toxins induced pulmonary arterial hypertension;
  • Pulmonary hypertension with unclear and/or multifactorial mechanisms;
  • Heritable PAH: ALK-1/ENG/SMAD9/CAV1/KCNK3.

SNOMED is a constantly evolving terminology with active collaborators who are ready to create new concepts based on user demand. Thus, we expect that new revisions of the WSCC will be covered by SNOMED soon.

Table 3 - Source PH terms to SNOMED mapping

Source code description SNOMED code SNOMED description
PULMONARY ARTERIAL HYPERTENSION|PAH ASSOCIATED WITH...||CONNECTIVE TISSUE DISEASE|SYSTEMIC SCLEROSIS/SCLERODERMA 697903007 Pulmonary arterial hypertension associated with connective tissue disease
89155008 Systemic sclerosis
PULMONARY ARTERIAL HYPERTENSION|HERITABLE PAH|BMPR2 697899000 Heritable pulmonary arterial hypertension due to BMPR2 mutation
Group I: Pulmonary Arterial Hypertension|Idiopathic PAH (IPAH) 697898008 Idiopathic pulmonary arterial hypertension
PH|UNCLEAR MULTIFACTORIAL MECHANISMS|HEMATOLOGIC DISORDERS 697917003 Pulmonary hypertension due to hematological disorder
Group IV: Chronic Thromboembolic Pulmonary Hypertension (CTEPH) 233947005 Chronic thromboembolic pulmonary hypertension
PULMONARY HYPERTENSION DUE TO LUNG DISEASES AND/OR HYPOXIA|INTERSTITIAL LUNG DISEASE 697912009 Pulmonary hypertension due to interstitial lung disease
PULMONARY HYPERTENSION DUE TO LEFT HEART DISEASE|LEFT VENTRICULAR SYSTOLIC DYSFUNCTION 697925001 Pulmonary hypertension due to systolic systemic ventricular dysfunction
PULMONARY ARTERIAL HYPERTENSION|PAH ASSOCIATED WITH...||PORTAL HYPERTENSION 445237003 Pulmonary arterial hypertension associated with portal hypertension
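
The 1-to-many case in the first row of Table 3 can be sketched as a lookup that returns several standard targets for one source term. The example below is a minimal illustration under stated assumptions: the dictionary holds only two rows copied from Table 3, the output record layout is hypothetical, and both targets are treated as condition rows, which the source does not spell out.

```python
# Source term -> list of (SNOMED code, SNOMED description), copied from Table 3.
SOURCE_TO_SNOMED = {
    "PULMONARY ARTERIAL HYPERTENSION|PAH ASSOCIATED WITH...||CONNECTIVE TISSUE DISEASE|SYSTEMIC SCLEROSIS/SCLERODERMA": [
        ("697903007", "Pulmonary arterial hypertension associated with connective tissue disease"),
        ("89155008", "Systemic sclerosis"),
    ],
    "PULMONARY ARTERIAL HYPERTENSION|HERITABLE PAH|BMPR2": [
        ("697899000", "Heritable pulmonary arterial hypertension due to BMPR2 mutation"),
    ],
}


def expand_to_standard(person_id, source_term, start_date):
    """Return one condition-like row per standard target of the source term."""
    targets = SOURCE_TO_SNOMED.get(source_term, [])
    return [
        {
            "person_id": person_id,
            "snomed_code": code,
            "condition_start_date": start_date,
            # The original source term is kept on every row for traceability.
            "condition_source_value": source_term,
        }
        for code, _description in targets
    ]


term = "PULMONARY ARTERIAL HYPERTENSION|PAH ASSOCIATED WITH...||CONNECTIVE TISSUE DISEASE|SYSTEMIC SCLEROSIS/SCLERODERMA"
for row in expand_to_standard(1, term, "2020-01-15"):
    print(row["snomed_code"], row["condition_start_date"])
```

Because both rows share the same person and start date, they can later be recombined in a cohort definition.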

Summary

The coverage of the PH terms used across the variety of sources by the OMOP Standard vocabulary is around 80% (1-to-1 match). The remaining terms can be mapped in a 1-to-many fashion, which does not substantially affect cohort building in the OMOP CDM. The coverage may be further improved by creating new concepts and/or using mapping strategies such as:

  • Post-coordination of several standard terms, which are picked up in cohorts using the “on the same day” criterion.
  • Introduction of relationships between database entities (fact_relationship, or event modifiers such as the meas_event_field_concept_id and obs_event_field_concept_id fields).
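
The “same day” criterion from the first strategy can be sketched as follows. This is a simplified in-memory illustration, not a cohort definition for any OHDSI tool: the record layout and the function name `persons_with_same_day_pair` are assumptions, and the two SNOMED codes are the targets from the first row of Table 3.

```python
from collections import defaultdict
from datetime import date


def persons_with_same_day_pair(rows, first_code, second_code):
    """Return ids of persons having both codes recorded on the same date."""
    codes_by_person_day = defaultdict(set)
    for row in rows:
        key = (row["person_id"], row["condition_start_date"])
        codes_by_person_day[key].add(row["snomed_code"])
    return sorted({
        person_id
        for (person_id, _day), codes in codes_by_person_day.items()
        if first_code in codes and second_code in codes
    })


rows = [
    {"person_id": 1, "snomed_code": "697903007", "condition_start_date": date(2020, 1, 15)},
    {"person_id": 1, "snomed_code": "89155008", "condition_start_date": date(2020, 1, 15)},
    {"person_id": 2, "snomed_code": "697903007", "condition_start_date": date(2020, 3, 1)},
    {"person_id": 2, "snomed_code": "89155008", "condition_start_date": date(2020, 6, 9)},
]
print(persons_with_same_day_pair(rows, "697903007", "89155008"))  # [1]
```

Person 2 has both codes but on different dates, so only person 1 qualifies as having the post-coordinated diagnosis.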

To store a grouping variable originating from the source (e.g. the WHO Classification), we recommend mapping the source terms to a measurement_concept_id and a specific value_as_concept_id in the LOINC vocabulary (Table 4). The approach has some limitations, as the LOINC terms match the WHO Classification only and do not support any other classification, but it is still a powerful way to ease the querying of patients.

Table 4 - LOINC terms to be used for mapping of PH grouping variables

| LOINC event description | LOINC event code | LOINC value description | LOINC value code |
| --- | --- | --- | --- |
| Pulmonary hypertension [Class] | 95932-0 | Group 1: Pulmonary arterial hypertension (PAH) | LA31114-4 |
| | | Group 2: Pulmonary Hypertension due to left heart disease | LA31115-1 |
| | | Group 3: Pulmonary Hypertension due to chronic lung disease and hypoxemia | LA31116-9 |
| | | Group 4: Chronic Thromboembolic Pulmonary Hypertension (CTEPH) | LA31117-7 |
| | | Group 5: Pulmonary hypertension due to unclear multifactorial mechanisms | LA31118-5 |
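
A minimal sketch of this mapping, using the LOINC codes from Table 4, might look like the following. The record layout and the function name `who_group_measurement` are hypothetical; in an actual OMOP CDM load, measurement_concept_id and value_as_concept_id hold integer concept IDs, which would have to be looked up in the vocabulary tables, so the LOINC codes here only stand in for them.

```python
PH_CLASS_LOINC = "95932-0"  # Pulmonary hypertension [Class], from Table 4

# WHO group number -> LOINC answer code, from Table 4.
WHO_GROUP_TO_LOINC_ANSWER = {
    1: "LA31114-4",
    2: "LA31115-1",
    3: "LA31116-9",
    4: "LA31117-7",
    5: "LA31118-5",
}


def who_group_measurement(person_id, who_group, measurement_date):
    """Build a measurement-like record for a WHO group (hypothetical layout)."""
    if who_group not in WHO_GROUP_TO_LOINC_ANSWER:
        raise ValueError(f"Unknown WHO group: {who_group!r}")
    return {
        "person_id": person_id,
        "measurement_date": measurement_date,
        # LOINC codes are placeholders for the OMOP concept IDs resolved at load time.
        "measurement_loinc_code": PH_CLASS_LOINC,
        "value_loinc_answer_code": WHO_GROUP_TO_LOINC_ANSWER[who_group],
    }


print(who_group_measurement(1, 4, "2021-05-01"))
```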

Appendix

Appendix 1 - WSCC Clinical PH diagnosis and its representation in SNOMED

| Source code description | SNOMED code | SNOMED description | Comment |
| --- | --- | --- | --- |
| 1 PAH | | | |
| 1.1 Idiopathic PAH | 697898008 | Idiopathic pulmonary arterial hypertension | |
| 1.2 Heritable PAH | 697897003 | Heritable pulmonary arterial hypertension | |
| 1.3 Drug- and toxin-induced PAH | 233945002 | Pulmonary arterial hypertension induced by drug | The exact causation may be specified as Condition/Drug exposure linked with fact_relationship, OR new concept: Drug or toxin induced pulmonary arterial hypertension (OMOP5160885, OMOP Extension) |
| | OR | | |
| | 697901009 | Pulmonary arterial hypertension induced by toxin | The exact causation may be specified as Condition/Drug exposure linked with fact_relationship, OR new concept: Drug or toxin induced pulmonary arterial hypertension (OMOP5160885, OMOP Extension) |
| 1.4 PAH associated with: | | | |
| 1.4.1 Connective tissue disease | 697903007 | Pulmonary arterial hypertension associated with connective tissue disease | The exact causation may be specified as Condition/Observation linked with fact_relationship |
| 1.4.2 HIV infection | 697904001 | Pulmonary arterial hypertension associated with HIV infection | |
| 1.4.3 Portal hypertension | 445237003 | Pulmonary arterial hypertension associated with portal hypertension | |
| 1.4.4 Congenital heart disease | 697905000 | Pulmonary arterial hypertension associated with congenital heart disease | The exact causation may be specified as Condition/Observation linked with fact_relationship, OR new concept to be added on demand |
| 1.4.5 Schistosomiasis | 697907008 | Pulmonary arterial hypertension associated with schistosomiasis | |
| 1.5 PAH long-term responders to calcium channel blockers | Not covered by SNOMED | | May be stored in OMOP as hierarchical parent + separate record in Condition/Observation |
| 1.6 PAH with overt features of venous/capillaries (PVOD/PCH) involvement | Not covered by SNOMED | | May be stored in OMOP as hierarchical parent + separate record in Condition/Observation |
| 1.7 Persistent PH of the newborn syndrome | 233815004 | Persistent pulmonary hypertension of the newborn | |
| 2 PH due to left heart disease | 472790001 | Pulmonary hypertension due to left heart disease | |
| 2.1 PH due to heart failure with preserved LVEF | 697926000 | Pulmonary hypertension due to diastolic systemic ventricular dysfunction | |
| 2.2 PH due to heart failure with reduced LVEF | 697925001 | Pulmonary hypertension due to systolic systemic ventricular dysfunction | |
| 2.3 Valvular heart disease | 697927009 | Pulmonary hypertension due to left-sided valvular heart disease | |
| 2.4 Congenital/acquired cardiovascular conditions leading to post-capillary PH | Not covered by SNOMED | | May be stored in OMOP as hierarchical parent + separate record in Condition/Observation |
| 3 PH due to lung disease and/or hypoxia | 697910001 | Pulmonary hypertension due to lung disease and/or hypoxia | |
| 3.1 Obstructive lung disease | OMOP5160883 | Pulmonary hypertension due to lung disease | May be stored in OMOP as hierarchical parent + separate record in Condition/Observation |
| 3.2 Restrictive lung disease | OMOP5160883 | Pulmonary hypertension due to lung disease | May be stored in OMOP as hierarchical parent + separate record in Condition/Observation |
| 3.3 Other lung disease with mixed restrictive/obstructive pattern | 697913004 | Pulmonary hypertension due to pulmonary disease with mixed restrictive and obstructive pattern | |
| 3.4 Hypoxia without lung disease | OMOP5160884 | Pulmonary hypertension due to hypoxia | To be stored as hierarchical parent |
| 3.5 Developmental lung disorders | 697916007 | Pulmonary hypertension due to developmental abnormality of the lung | |
| 4 PH due to pulmonary artery obstructions | Not covered by SNOMED | | |
| 4.1 Chronic thromboembolic PH | 233947005 | Chronic thromboembolic pulmonary hypertension | |
| 4.2 Other pulmonary artery obstructions | 233946001 | Large vessel pulmonary hypertension | |
| 5 PH with unclear and/or multifactorial mechanisms | Not covered by SNOMED | | |
| 5.1 Hematologic disorders | 697917003 | Pulmonary hypertension due to hematological disorder | |
| 5.2 Systemic and metabolic disorders | Not covered by SNOMED | | May be stored in OMOP as hierarchical parent (generic) + separate record in Condition/Observation |
| 5.3 Others | | | Stored in OMOP as hierarchical parent |
| 5.4 Complex congenital heart disease | Not covered by SNOMED | | May be stored in OMOP as hierarchical parent (generic) + separate record in Condition/Observation |

Appendix 2 - MedDRA PH codes and mapping to OMOP Standard terms

* - MedDRA codes and concept classes are not shown here due to the license restrictions (link).

MedDRA description Concept code Concept name PH branch in SNOMED Found in the RWD
Pulmonary hypertension aggravated 10964002 Progressive pulmonary hypertension + +
Hypertension pulmonary aggravated 10964002 Progressive pulmonary hypertension + +
Pulmonary arterial hypertension 11399002 Pulmonary arterial hypertension + +
Pulmonary arterial hypertension WHO functional class III 11399002 Pulmonary arterial hypertension + +
Pulmonary arterial hypertension WHO functional class IV 11399002 Pulmonary arterial hypertension + +
Pulmonary arterial hypertension WHO functional class I 11399002 Pulmonary arterial hypertension + +
Pulmonary arterial hypertension WHO functional class II 11399002 Pulmonary arterial hypertension + +
Pulmonary hypertension WHO functional class IV 70995007 Pulmonary hypertension + +
Pulmonary hypertension WHO functional class III 70995007 Pulmonary hypertension + +
Pulmonary hypertension WHO functional class II 70995007 Pulmonary hypertension + +
Pulmonary hypertension WHO functional class I 70995007 Pulmonary hypertension + +
Hypertension pulmonary 70995007 Pulmonary hypertension + +
Pulmonary hypertension 70995007 Pulmonary hypertension + +
Pulmonary hypertension NOS 70995007 Pulmonary hypertension + +
Pulmonary hypertension NOS aggravated 70995007 Pulmonary hypertension + +
Pulmonary hypertension secondary 88223008 Secondary pulmonary hypertension + +
Secondary pulmonary arterial hypertension 88223008 Secondary pulmonary hypertension + +
Newborn persistent pulmonary hypertension 233815004 Persistent pulmonary hypertension of the newborn + +
Persistent pulmonary hypertension of the newborn 233815004 Persistent pulmonary hypertension of the newborn + +
Chronic thromboembolic pulmonary hypertension 233947005 Chronic thromboembolic pulmonary hypertension + +
CTEPH 233947005 Chronic thromboembolic pulmonary hypertension + +
Portopulmonary hypertension 445237003 Pulmonary arterial hypertension associated with portal hypertension + +
Eisenmenger's syndrome 445928005 Eisenmenger's syndrome + +
Familial (FPAH) 697897003 Heritable pulmonary arterial hypertension + +
Familial pulmonary arterial hypertension 697897003 Heritable pulmonary arterial hypertension + +
Primary pulmonary hypertension 697898008 Idiopathic pulmonary arterial hypertension + +
Pulmonary hypertension primary 697898008 Idiopathic pulmonary arterial hypertension + +
Idiopathic (IPAH) 697898008 Idiopathic pulmonary arterial hypertension + +
Idiopathic pulmonary arterial hypertension 697898008 Idiopathic pulmonary arterial hypertension + +
Associated with (APAH) 697902002 Associated pulmonary arterial hypertension + +
Associated with pulmonary arterial hypertension 697902002 Associated pulmonary arterial hypertension + +
Pulmonary arterial hypertension WHO functional class I OMOP5160879 Pulmonary hypertension WHO (World Health Organization) functional class: I +
Pulmonary arterial hypertension WHO functional class II OMOP5160880 Pulmonary hypertension WHO (World Health Organization) functional class: II +
Pulmonary arterial hypertension WHO functional class III OMOP5160881 Pulmonary hypertension WHO (World Health Organization) functional class: III +
Pulmonary arterial hypertension WHO functional class IV OMOP5160882 Pulmonary hypertension WHO (World Health Organization) functional class: IV +
Pulmonary hypertension WHO functional class I OMOP5160879 Pulmonary hypertension WHO (World Health Organization) functional class: I +
Pulmonary hypertension WHO functional class II OMOP5160880 Pulmonary hypertension WHO (World Health Organization) functional class: II +
Pulmonary hypertension WHO functional class III OMOP5160881 Pulmonary hypertension WHO (World Health Organization) functional class: III +
Pulmonary hypertension WHO functional class IV OMOP5160882 Pulmonary hypertension WHO (World Health Organization) functional class: IV +
Acute cor pulmonale 49584005 Acute cor pulmonale +
Cor pulmonale acute 49584005 Acute cor pulmonale +
Chronic cor pulmonale 79955004 Chronic cor pulmonale +
Cor pulmonale chronic 79955004 Chronic cor pulmonale +
Cor pulmonale 83291003 Cor pulmonale +
Chronic pulmonary heart disease 87837008 Chronic pulmonary heart disease +
Chronic pulmonary heart disease unspecified 87837008 Chronic pulmonary heart disease +
Right-to-left cardiac shunt 441826000 Right to left cardiac shunt +
Acute pulmonary heart disease 67189007 Acute pulmonary heart disease
Chronic pulmonary heart disease, unspecified 87837008 Chronic pulmonary heart disease
Cor pulmonale NOS 83291003 Cor pulmonale
Disease heart pulmonary 274096000 Pulmonary heart disease
Heart disease pulmonary 274096000 Pulmonary heart disease
Pulmonary hypertensions 70995007 Pulmonary hypertension +
Congenital pulmonary hypertension 697897003 Heritable pulmonary arterial hypertension +
Pulmonary artery wall hypertrophy 251039005 Pulmonary artery finding
Pulmonary hypertensive crisis 11399002 Pulmonary arterial hypertension +
Right ventricular hypertension 461321006 Right ventricular hypertension
Right-to-left atrial shunt 441826000 Right to left cardiac shunt
Right-to-left ventricular shunt 441826000 Right to left cardiac shunt
Alveolar capillary dysplasia 708028001 Congenital pulmonary alveolar capillary dysplasia
Coronary sinus dilatation 253323000 Coronary sinus abnormality
Alveolar capillary dysplasia with misalignment of pulmonary veins 447275002 Alveolar capillary dysplasia with pulmonary venous misalignment

Appendix 3 - ICD10 (WHO version) representation and its mapping to SNOMED

| ICD10 code | ICD10 description | SNOMED code | SNOMED description |
| --- | --- | --- | --- |
| I27.0 | Primary pulmonary hypertension | 11399002 | Pulmonary arterial hypertension |
| I27.2 | Other secondary pulmonary hypertension | 88223008 | Secondary pulmonary hypertension |
| P29.3 | Persistent fetal circulation | 233815004 | Persistent pulmonary hypertension of the newborn |

Appendix 4 - ICD10CM representation and its mapping to SNOMED

| ICD10CM code | ICD10CM description | Diagnostic category related to the ICD10CM code | SNOMED code | SNOMED description |
| --- | --- | --- | --- | --- |
| I27.0 | Primary pulmonary hypertension | Primary pulmonary hypertension | 11399002 | Pulmonary arterial hypertension |
| I27.2 | Other secondary pulmonary hypertension | Other secondary pulmonary hypertension | 88223008 | Secondary pulmonary hypertension |
| I27.20 | Pulmonary hypertension, unspecified | Pulmonary hypertension, unspecified; Pulmonary hypertension NOS | 70995007 | PHT - Pulmonary hypertension |
| I27.21 | Secondary pulmonary arterial hypertension | Secondary pulmonary arterial hypertension (Associated) (drug-induced) (toxin-induced); pulmonary arterial hypertension NOS (Associated) (drug-induced) (toxin-induced) (secondary); group 1 pulmonary hypertension | 88223008 | Secondary pulmonary hypertension |
| I27.22 | Pulmonary hypertension due to left heart disease | Pulmonary hypertension due to left heart disease; Group 2 pulmonary hypertension | 472790001 | Pulmonary hypertension due to left heart disease |
| I27.23 | Pulmonary hypertension due to lung diseases and hypoxia | Pulmonary hypertension due to lung diseases and hypoxia; Group 3 pulmonary hypertension | 697910001 | Pulmonary hypertension due to lung disease and/or hypoxia |
| I27.24 | Chronic thromboembolic pulmonary hypertension | Chronic thromboembolic pulmonary hypertension; Group 4 pulmonary hypertension | 233947005 | Chronic thromboembolic pulmonary hypertension |
| I27.29 | Other secondary pulmonary hypertension | Other secondary pulmonary hypertension; Group 5 pulmonary hypertension | 88223008 | Secondary pulmonary hypertension |
| I27.83 | Eisenmenger's syndrome | Eisenmenger's syndrome | 445928005 | Eisenmenger's syndrome |
| P29.3 | Persistent fetal circulation | Persistent fetal circulation | 233815004 | Persistent pulmonary hypertension of the newborn |
| P29.30 | Pulmonary hypertension of newborn | Pulmonary hypertension of newborn | 233815004 | Persistent pulmonary hypertension of the newborn |
| P29.38 | Other persistent fetal circulation | Other persistent fetal circulation | 233815004 | Persistent pulmonary hypertension of the newborn |
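
Since the diagnostic category column of Appendix 4 names a PH group for several ICD10CM codes, that group can be derived from the code directly. The lookup below is a small sketch based only on those rows; the function name `who_group_from_icd10cm` is hypothetical, and codes for which the appendix names no group return None.

```python
# ICD10CM code -> PH group number, as named in the diagnostic category column of Appendix 4.
ICD10CM_TO_WHO_GROUP = {
    "I27.21": 1,
    "I27.22": 2,
    "I27.23": 3,
    "I27.24": 4,
    "I27.29": 5,
}


def who_group_from_icd10cm(code):
    """Return the PH group for an ICD10CM code, or None when the code implies no group."""
    return ICD10CM_TO_WHO_GROUP.get(code.strip().upper())


print(who_group_from_icd10cm("I27.24"))  # 4
print(who_group_from_icd10cm("I27.0"))   # None
```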