Participating data sources - ohdsi-studies/ConceptPrevalence GitHub Wiki

| # | Database | Description | # of records | # of concepts |
|---|----------|-------------|--------------|---------------|
| 1 | IBM MarketScan Commercial Database (CCAE) | US commercial claims patients (0-65 years old) | 2.6 billion | 16.3 thousand |
| 2 | IBM MarketScan Multi-State Medicaid Database (MDCD) | Medicaid enrollees from multiple states | 1.3 billion | 16 thousand |
| 3 | IBM MarketScan Medicare Supplemental Database (MDCR) | Medicare supplemental coverage through privately insured, fee-for-service, point-of-service, or capitated health plans | 523 million | 15.6 thousand |
| 4 | Optum De-Identified Clinformatics Data-Mart-Database - Socio-Economic Status (SES) | Primarily representative of US commercial claims patients, with socio-economic status information | 3 billion | 16.5 thousand |
| 5 | Optum De-Identified Clinformatics Data-Mart-Database - Date of Death (DOD) | Primarily representative of US commercial claims patients, with full death record | 25 billion | 16.5 thousand |
| 6 | Optum De-identified Electronic Health Record Dataset (PANTHER) | Aggregated EHR repository from US health systems (Humedica’s EHR) | 956 million | 52.9 thousand |
| 7 | Premier Healthcare Database (PREMIER) | Hospital charge data from hospitals across the US | 36 million | 16.2 thousand |
| 8 | The Healthcare Cost and Utilization Project Nationwide Inpatient Sample (HCUP) | US hospital care data (inpatient stays, ambulatory surgery and services visits, and emergency department encounters) | 1.6 billion | 9.6 thousand |
| 9 | IQVIA Longitudinal Patient Data (LPD) Australia | Longitudinal electronic health record data from primary and secondary care at Australian physician practices | 7.5 million | 3.5 thousand |
| 10 | IQVIA Disease Analyzer (DA) Germany | Electronic health record data from German practices (mostly primary care practices) | 339 million | 6.7 thousand |
| 11 | IQVIA Disease Analyzer (DA) France | Electronic health record data from French practices (mostly primary care practices) | 16.6 million | 3.4 thousand |
| 12 | Japan Medical Data Center database (JMDC) | Data from 60 society-managed health insurance plans covering workers aged 18 to 65 and their dependents in Japan | 101 thousand | 5.5 thousand |
| 13 | IQVIA US LRxDx Open Claims (Open Claims) | Anonymized, pre-adjudicated claims collected from US office-based physicians and specialists | 6.2 billion | 16.6 thousand |
| 14 | IQVIA US Hospital Charge Data Master (CDM) | Anonymized hospital charge detail masters (CDM) collected from short-term, acute-care and non-federal hospitals | 5 billion | 16.2 thousand |
| 15 | IQVIA US Ambulatory EMR (AmbEMR) | EMR data from US primary care (40%) and specialty practices (60%) | 1.2 billion | 49.6 thousand |
| 16 | Stanford Medicine Research Data Repository (STaRR) | EHR data derived from outpatient and inpatient visits at Stanford Hospital and Clinics | 545 thousand | 9 thousand |
| 17 | Korea National Health Insurance Service / National Sample Cohort (NHIS/NSC Korea) | National administrative claims database covering the South Korean population (2% population sample cohort from 2002 to 2013) | 5.4 billion | 6.8 thousand |
| 18 | Medical Information Mart for Intensive Care III (MIMIC3) | Electronic health record data associated with ~60,000 intensive care unit admissions | 292 million | 0.8 thousand |
| 19 | Ajou University Database (Ajou) | Korean tertiary teaching hospital electronic health record data | 30.9 million | 6.3 thousand |
| 20 | Tufts Medical Center Database (Tufts) | Mixed EMR data including hospital data, state death records, tumor registry, and primary and specialty practices | 124.4 million | 19 thousand |
| 21 | Australian Electronic practice based research network (AU-ePBRN) | Electronic health record data from primary care practices in Australia | 766 million | 2.8 thousand |
| 22 | Columbia University Medical Center Database (CUMC) | EHR data on over 5 million patients from the New York-Presbyterian hospital and affiliated academic physician practice | 5.4 billion | 16 thousand |