DataSet Release Note 14 July 2020 - National-Clinical-Cohort-Collaborative/Data-Ingestion-and-Harmonization GitHub Wiki

The datasets released to Palantir Foundry include three OMOP data sources and four PCORnet sources.

Please note that the mapping information is built from the documentation provided by Samvit. These documents are checked in here. Please refer to the following crosswalk tables for the currently mapped value set concept IDs, which are used to transform PCORnet CDM source terms into OMOP concept IDs. The actual table definitions and the values assigned to the terms are found here. The PCORnet source term crosswalk tables are:

  • N3c_xwalk_mapping.sql
  • p2o_vital_term_xwalk.sql
  • n3c_xwalk_mapping.sql
  • p2o_admitting_source_xwalk.sql
  • p2o_death_term_xwalk.sql
  • p2o_demo_term_xwalk.sql
  • p2o_discharge_status_xwalk.sql
  • p2o_dispense_source_xwalk.sql
  • p2o_facility_type_xwalk.sql
  • p2o_medadmin_term_xwalk.sql
  • p2o_term_xwalk.sql

Updates from the last release:

  1. We have applied logic to add LOINC codes to uncoded COVID tests, using the COVID-19 TestNorm tool. The corrections are applied only to lab measures with missing LOINC codes and never where an original LOINC code is present, with the intention of helping downstream analyses without obscuring real data. Details on the corrections can be found at COVID19 LOINC CODE correction.
  2. The COVID lab test list is aligned with the scripts from the Phenotype_Data_Acquisition workstream. It will be updated as the upstream phenotyping workstream changes.
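
The rule in item 1 (fill in a LOINC code only when the original is missing) can be sketched as below. This is a minimal illustration, not the ETL code itself: the function name `fill_missing_loinc` and both parameters are hypothetical, and the suggested code is assumed to have been produced beforehand by a normalization step such as COVID-19 TestNorm.

```python
from typing import Optional


def fill_missing_loinc(original_loinc: Optional[str],
                       suggested_loinc: Optional[str]) -> Optional[str]:
    """Return the LOINC code to store for a lab measure.

    Hypothetical helper: an original code, when present, always wins.
    The suggested code is used only when the original is missing or blank.
    """
    if original_loinc is not None and original_loinc.strip():
        # Real source data is never overwritten.
        return original_loinc
    # Original is missing: fall back to the suggestion (which may also be None).
    return suggested_loinc


# Placeholder strings stand in for real LOINC codes.
print(fill_missing_loinc("ORIGINAL-CODE", "SUGGESTED-CODE"))
print(fill_missing_loinc(None, "SUGGESTED-CODE"))
```

Treating a blank string the same as a missing value is an assumption of this sketch; the source only states that corrections apply to lab measures with missing LOINC codes.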

Known Issues:

  1. Please note that there are records in the Observation table with a concept in the Measurement domain (a couple of other tables and domains are mismatched as well). This is a known issue. We are working with the SMEs on this topic.

  2. The “flavors of null” issue. This is a known issue. In the future we may change the concept ID values currently used for Unknown, No Information, and Other.

Interpretation of Qualitative Results

Qualitative results are an important aspect of the lab measures. In our scenario, they are essential for all COVID-related patients, because we expect most of their labs to be reported as qualitative results. The table below shows how we handled qualitative results during the ETL process; these concept IDs can be used to identify a specific subgroup of patients. We are working toward a gross mapping that is sufficient for our users' current level of sophistication, so that they can recognize positives and negatives. We also anticipate that researchers will want a more refined perspective in the future, and we accommodate that by preserving the original values as well.

| Qualitative result | Concept ID | Concept name |
|---|---|---|
| positive | 45884084 | Positive |
| negative | 45878583 | Negative |
| pos | 45884084 | Positive |
| neg | 45878583 | Negative |
| presumptive positive | 45884084 | Positive |
| presumptive negative | 45878583 | Negative |
| detected | 45884084 | Positive |
| not detected | 45878583 | Negative |
| inconclusive | 45877990 | Inconclusive |
| normal | 45884153 | Normal |
| abnormal | 45878745 | Abnormal |
| low | 45881666 | Low |
| high | 45876384 | High |
| borderline | 45880922 | Borderline |
| elevated | 4328749 | High |
| undetermined | 45880649 | Undetermined |
| undetectable | 45878583 | Negative |
| un | 0 | Null |
| unknown | 0 | Null |
| no information | 46237210 | No information |
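
The mapping in the table above could be applied along the following lines. This is a rough sketch rather than the actual ETL implementation. The concept IDs are copied from the table; the names `QUALITATIVE_RESULT_CONCEPTS`, `MappedResult`, and `map_qualitative_result` are hypothetical, and both the case-insensitive, whitespace-tolerant matching and the fallback to concept ID 0 for unlisted values are assumptions. The original value is returned alongside the concept ID, mirroring the statement that original values are preserved.

```python
from typing import NamedTuple

# Source term -> OMOP concept ID, taken from the table above.
QUALITATIVE_RESULT_CONCEPTS = {
    "positive": 45884084,
    "negative": 45878583,
    "pos": 45884084,
    "neg": 45878583,
    "presumptive positive": 45884084,
    "presumptive negative": 45878583,
    "detected": 45884084,
    "not detected": 45878583,
    "inconclusive": 45877990,
    "normal": 45884153,
    "abnormal": 45878745,
    "low": 45881666,
    "high": 45876384,
    "borderline": 45880922,
    "elevated": 4328749,
    "undetermined": 45880649,
    "undetectable": 45878583,
    "un": 0,
    "unknown": 0,
    "no information": 46237210,
}


class MappedResult(NamedTuple):
    original_value: str  # kept unchanged for later, more refined analysis
    concept_id: int


def map_qualitative_result(raw_value: str) -> MappedResult:
    """Map a raw qualitative result to a concept ID (hypothetical helper)."""
    # Assumption: matching ignores case and collapses extra whitespace.
    key = " ".join(raw_value.lower().split())
    # Assumption: values not listed in the table fall back to concept ID 0.
    concept_id = QUALITATIVE_RESULT_CONCEPTS.get(key, 0)
    return MappedResult(original_value=raw_value, concept_id=concept_id)


result = map_qualitative_result("Not Detected")
print(result.original_value, result.concept_id)
```

A plain dictionary keeps the lookup easy to audit against the table, and returning the untouched input next to the concept ID means a later, finer-grained mapping can be derived without re-reading the source data.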