How to link facts - OHDSI/ETL--PulmonaryHypertensionRegistries GitHub Wiki

Registry and clinical trial data contain many interconnected facts, for example an adverse event and its cause, severity, relation to the study drug, and outcome. Sometimes these connections can be spotted in the converted CDM data because the events share a date (particularly if each patient has only a few events on that date). However, it is generally helpful to store such relations between events explicitly in the CDM. There are two ways to do this: create records in the fact relationship table or, as a more modern option, use the new v5.4 fields in the measurement and observation tables. Note that the current version of Atlas supports neither the fact relationship table nor the v5.4 fields, so in both cases related events can only be analyzed with custom queries.

Fact Relationship Table

The fact relationship table is the canonical way of storing relationships between facts. In some sense it is less limited than the v5.4 fields (see below), since it can link records stored in any CDM table. However, the ‘unit of action’ here is a record: you cannot link to anything smaller than a record, such as a particular field. Each relationship must have a relationship concept, though it can be 0. Links can be one-directional or bidirectional, depending on the researcher's needs.

Here is an example of using the fact_relationship table. Take 'Arterial O2 while patient on supplemental oxygen, %' as the source concept. In OMOP terms, this event consists of an oxygen saturation measurement and an oxygen therapy procedure, so two records are created in the corresponding tables. These two records are then linked via the fact relationship table, with 'during' as the relationship concept.

The source table:

| clinical event | event date | ... |
| --- | --- | --- |
| Arterial O2 while patient on supplemental oxygen, % | 2015-05-06 | ... |

This event is mapped to the following CDM events:

| procedure_occurrence_id | procedure_concept_id | procedure_date | ... |
| --- | --- | --- | --- |
| 1234 | 4239130 | 2015-05-06 | ... |

| measurement_id | measurement_concept_id | measurement_date | ... |
| --- | --- | --- | --- |
| 2345 | 4013965 | 2015-05-06 | ... |

The fact_relationship table:

| domain_concept_id_1 | fact_id_1 | domain_concept_id_2 | fact_id_2 | relationship_concept_id |
| --- | --- | --- | --- | --- |
| 1147330 | 1234 | 1147301 | 2345 | 4162730 |
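Since Atlas cannot follow these links, they have to be resolved with a custom query. The following is a minimal sketch using Python's built-in sqlite3 module and heavily simplified tables (only the columns shown in the example). The domain concept ids are copied verbatim from the example row above and should be verified against your vocabulary; the in-memory database is purely illustrative.

```python
import sqlite3

# Simplified, illustrative schema: only the columns used in the example above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE procedure_occurrence (
    procedure_occurrence_id INTEGER PRIMARY KEY,
    procedure_concept_id    INTEGER,
    procedure_date          TEXT
);
CREATE TABLE measurement (
    measurement_id         INTEGER PRIMARY KEY,
    measurement_concept_id INTEGER,
    measurement_date       TEXT
);
CREATE TABLE fact_relationship (
    domain_concept_id_1     INTEGER,
    fact_id_1               INTEGER,
    domain_concept_id_2     INTEGER,
    fact_id_2               INTEGER,
    relationship_concept_id INTEGER
);
INSERT INTO procedure_occurrence VALUES (1234, 4239130, '2015-05-06');
INSERT INTO measurement VALUES (2345, 4013965, '2015-05-06');
INSERT INTO fact_relationship VALUES (1147330, 1234, 1147301, 2345, 4162730);
""")

# Values copied from the example row; check them against your vocabulary.
DOMAIN_1 = 1147330       # domain concept recorded for fact_id_1
DOMAIN_2 = 1147301       # domain concept recorded for fact_id_2
RELATIONSHIP = 4162730   # relationship concept used in the example

# Fact ids are only unique within a table, so the domain concept ids
# must be part of the filter.
rows = conn.execute(
    """
    SELECT p.procedure_concept_id, m.measurement_concept_id,
           fr.relationship_concept_id
    FROM fact_relationship fr
    JOIN procedure_occurrence p ON p.procedure_occurrence_id = fr.fact_id_1
    JOIN measurement m          ON m.measurement_id = fr.fact_id_2
    WHERE fr.domain_concept_id_1 = ?
      AND fr.domain_concept_id_2 = ?
      AND fr.relationship_concept_id = ?
    """,
    (DOMAIN_1, DOMAIN_2, RELATIONSHIP),
).fetchall()

print(rows)
```

For a bidirectional link, a second fact_relationship row with the two sides swapped and the appropriate reverse relationship concept would be added; that concept is not given here.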

Event fields and event ids

Another way to store a connection between facts is to use the additional v5.4 fields in the measurement and observation tables: measurement_event_id and meas_event_field_concept_id, and/or observation_event_id and obs_event_field_concept_id. You do not necessarily have to convert to v5.4; a middle way is to extend v5.3 with these additional fields.

In the v5.4 approach there are two facts: the main event and its modifier (perhaps not ideal wording, since the procedure occurrence table has a field with this name). Only the modifier event contains a link to the main event. The main event can link back to the modifier only if the main event itself is in the measurement or observation table, because only these two tables have the additional fields. If the main event belongs to another domain, e.g., Condition, only a link from the modifier to the main event can be created, making the relation one-directional. It follows that you can only link facts to modifiers in the Measurement and Observation domains; a condition and a procedure, for example, cannot be linked this way, since both tables lack the additional fields to store the link.

Another peculiarity of this approach is that it may multiply modifier events. If a source event has been mapped to several target concepts (1:n), a modifier should be created for each target record of the main event. For example, the source adverse event ‘Pulmonary edema associated with PVOD’ with the ‘Mild’ attribute has been mapped to concepts 4232485 and 4078925. Two records with the ‘Mild’ concept should be created in the Observation table, one linked to each of the two target main events. This may be inconvenient for research in some circumstances. The fact relationship approach behaves similarly, except that it multiplies records in the fact relationship table rather than in the event tables.
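The 1:n multiplication of modifiers can be sketched as follows. This is an illustration only: the record ids (501, 502), the `build_modifiers` helper, the `ModifierObservation` class, and the two placeholder concept ids are all hypothetical, and which observation column actually holds the ‘Mild’ concept is an ETL design choice not prescribed here.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class ModifierObservation:
    observation_id: int
    modifier_concept_id: int         # e.g. the 'Mild' concept (column choice is up to the ETL)
    observation_event_id: int        # id of the main event record
    obs_event_field_concept_id: int  # concept identifying the field the id points to


def build_modifiers(main_event_ids, modifier_concept_id,
                    event_field_concept_id, first_observation_id=1):
    """Create one modifier observation per target record of the main event."""
    ids = count(first_observation_id)
    return [
        ModifierObservation(next(ids), modifier_concept_id,
                            event_id, event_field_concept_id)
        for event_id in main_event_ids
    ]


# Hypothetical ids of the two records created for concepts 4232485 and 4078925.
main_event_ids = [501, 502]
MILD_CONCEPT_ID = 0          # placeholder: look up the real 'Mild' concept
EVENT_FIELD_CONCEPT_ID = 0   # placeholder: concept for the main event's id field

modifiers = build_modifiers(main_event_ids, MILD_CONCEPT_ID, EVENT_FIELD_CONCEPT_ID)
for row in modifiers:
    print(row)
```

One source attribute thus yields as many observation rows as the main event has target records.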

One significant advantage of the v5.4 approach is that a modifier can apply to any field of the main event, not just to the whole record as in the fact relationship approach. This is helpful, for example, when a value_as_concept_id needs to be modified.

Here is the example above rewritten with v5.4 fields.

| clinical event | event date | ... |
| --- | --- | --- |
| Arterial O2 while patient on supplemental oxygen, % | 2015-05-06 | ... |

| procedure_occurrence_id | procedure_concept_id | procedure_date | ... |
| --- | --- | --- | --- |
| 1234 | 4239130 | 2015-05-06 | ... |

| measurement_id | measurement_concept_id | measurement_date | measurement_event_id | meas_event_field_concept_id | ... |
| --- | --- | --- | --- | --- | --- |
| 2345 | 4013965 | 2015-05-06 | 1234 | 1147082 | ... |
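A custom query for the v5.4 fields might look like the sketch below, again using Python's sqlite3 module with simplified, illustrative tables. The field concept id 1147082 is copied from the example row; it is assumed here to identify procedure_occurrence.procedure_occurrence_id, which should be verified against your vocabulary.

```python
import sqlite3

# Simplified, illustrative schema: only the columns used in the example above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE procedure_occurrence (
    procedure_occurrence_id INTEGER PRIMARY KEY,
    procedure_concept_id    INTEGER,
    procedure_date          TEXT
);
CREATE TABLE measurement (
    measurement_id              INTEGER PRIMARY KEY,
    measurement_concept_id      INTEGER,
    measurement_date            TEXT,
    measurement_event_id        INTEGER,
    meas_event_field_concept_id INTEGER
);
INSERT INTO procedure_occurrence VALUES (1234, 4239130, '2015-05-06');
INSERT INTO measurement VALUES (2345, 4013965, '2015-05-06', 1234, 1147082);
""")

# Copied from the example row; assumed to denote the procedure id field.
PROCEDURE_ID_FIELD = 1147082

# The field concept tells us which table measurement_event_id points into,
# so it is checked before joining on the id.
rows = conn.execute(
    """
    SELECT m.measurement_id, m.measurement_concept_id,
           p.procedure_occurrence_id, p.procedure_concept_id
    FROM measurement m
    JOIN procedure_occurrence p
      ON p.procedure_occurrence_id = m.measurement_event_id
    WHERE m.meas_event_field_concept_id = ?
    """,
    (PROCEDURE_ID_FIELD,),
).fetchall()

print(rows)
```

Unlike the fact relationship version, no separate link table is needed: the link lives on the measurement record itself.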

A good example of how to populate these additional v5.4 fields can also be found in the Clinical Trials WG wiki and repository.

Custom concepts

As a bonus option, instead of linking facts with the fact_relationship table or the v5.4 fields, one can create custom concepts that combine those facts. It is good practice to include such custom concepts in the existing concept hierarchy.
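As a rough illustration, a custom concept could be registered and attached to the hierarchy as sketched below. Everything specific here is an assumption: the concept id 2000000001 (chosen from the range above 2,000,000,000 that OMOP conventionally leaves for local concepts), the vocabulary name `PH_Registry`, and the choice of the example's measurement concept 4013965 as the parent. The tables are reduced to a few columns.

```python
import sqlite3

CUSTOM_CONCEPT_ID = 2_000_000_001  # hypothetical id in the local-concept range
PARENT_CONCEPT_ID = 4013965        # measurement concept from the example; parent choice is an assumption

# Simplified, illustrative vocabulary tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_id    INTEGER PRIMARY KEY,
    concept_name  TEXT,
    domain_id     TEXT,
    vocabulary_id TEXT
);
CREATE TABLE concept_relationship (
    concept_id_1    INTEGER,
    concept_id_2    INTEGER,
    relationship_id TEXT
);
""")

# The custom concept combines the measurement and the oxygen therapy context.
conn.execute(
    "INSERT INTO concept VALUES (?, ?, ?, ?)",
    (CUSTOM_CONCEPT_ID,
     "Arterial O2 while patient on supplemental oxygen, %",
     "Measurement",
     "PH_Registry"),  # hypothetical vocabulary name
)

# Attach it to the hierarchy in both directions.
conn.executemany(
    "INSERT INTO concept_relationship VALUES (?, ?, ?)",
    [
        (CUSTOM_CONCEPT_ID, PARENT_CONCEPT_ID, "Is a"),
        (PARENT_CONCEPT_ID, CUSTOM_CONCEPT_ID, "Subsumes"),
    ],
)

parents = [
    row[0]
    for row in conn.execute(
        "SELECT concept_id_2 FROM concept_relationship "
        "WHERE concept_id_1 = ? AND relationship_id = 'Is a'",
        (CUSTOM_CONCEPT_ID,),
    )
]
print(parents)
```

In a full vocabulary the concept_ancestor table would also need to reflect the new concept so that hierarchy-based queries can find it.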