MIMIC-IV - Medical Information Mart for Intensive Care

General description

  1. Database primary purpose - Retrospectively collect medical data to improve patient care through knowledge discovery and algorithm development
  2. Overall data type - Health outcomes
  3. Dataset type - Longitudinal
  4. Data source - A custom hospital-wide EHR and an ICU-specific clinical information system
  5. Data level - Patient level
  6. Geographic location of the data collection sites - Beth Israel Deaconess Medical Center (BIDMC) in Boston, MA, USA
  7. Sponsor, manager, or home institution - BIDMC
  8. Date range - 2008 - 2019
  9. Physician identifiers - Deidentified integer identifiers
  10. Longitudinal tracking - Patients can be tracked across years and providers across modules, both using deidentified integer identifiers
  11. Clinical areas of interest - All
  12. Variables that are uniquely present in this dataset - MIMIC-IV is a relational database containing real hospital stays for patients admitted to BIDMC, a teaching hospital of Harvard Medical School. The dataset is separated into “modules” that reflect the provenance of the data: hospital data (labs, microbiology, and electronic medication administration), ICU data, emergency department data, lookup tables and metadata from MIMIC-CXR, and deidentified free-text clinical notes
  13. Database caveats and limitations - (1) all patients older than 89 are grouped into a single category with an age value of 91; (2) the maximum follow-up time for each patient is exactly one year after their last hospital discharge; (3) data are collected during routine clinical practice and reflect the idiosyncrasies of that practice, so implausible values may be present in the database as an artifact of the archival process; (4) MIMIC-Note is currently not publicly available, and its structure is subject to change; (5) all patients across all datasets are in the hosp module; however, not all ICU patients have ED data, not all ICU patients have chest X-rays (CXRs), not all ED patients have hospital data, and so on; (6) within an individual module, some tables are incomplete because certain electronic systems, particularly the eMAR system, did not exist in the past.
  14. Other - Dates and times were shifted randomly into the future using an offset measured in days. A single date shift was assigned to each subject_id. As a result, the data for a single patient are internally consistent. For example, if the time between two measurements in the database was 4 hours in the raw data, then the calculated time difference in MIMIC-IV will also be 4 hours. Conversely, distinct patients are not temporally comparable. That is, two patients admitted in 2130 were not necessarily admitted in the same year.
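
The date-shifting behaviour described in item 14 can be illustrated with a small simulation. This is only a sketch of the general idea, not the actual MIMIC-IV deidentification procedure: the function names `assign_shifts` and `shift_events`, the offset range, and the example timestamps are all hypothetical and chosen purely for illustration.

```python
import random
from datetime import datetime, timedelta


def assign_shifts(subject_ids, seed=0, min_days=36500, max_days=73000):
    """Assign one random future offset, in whole days, to each subject_id.

    The offset range is an assumption made for this sketch only.
    """
    rng = random.Random(seed)
    return {sid: timedelta(days=rng.randint(min_days, max_days)) for sid in subject_ids}


def shift_events(events, shifts):
    """Apply each subject's single offset to all of that subject's timestamps."""
    return [(sid, ts + shifts[sid]) for sid, ts in events]


# Hypothetical raw events: (subject_id, timestamp)
raw_events = [
    (1, datetime(2010, 3, 1, 8, 0)),   # subject 1, first measurement
    (1, datetime(2010, 3, 1, 12, 0)),  # subject 1, second measurement, 4 hours later
    (2, datetime(2015, 7, 9, 8, 0)),   # subject 2, a different real year
]

shifts = assign_shifts([1, 2])
shifted = shift_events(raw_events, shifts)

# Within one patient, intervals are preserved because the same offset is applied.
within_patient_gap = shifted[1][1] - shifted[0][1]
print(within_patient_gap)  # 4:00:00

# Across patients, the shifted years carry no information about the real years.
print(shifted[0][1].year, shifted[2][1].year)
```

Because each subject receives a single offset, any analysis based on within-patient intervals (length of stay, time to event) is unaffected, while comparisons of calendar dates between patients are not meaningful.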

Applicable methods

  1. Exploratory analysis (1)
  2. Association methods, such as logistic regression (2, 3, 4), generalized linear model (5, 6, 7), Cox proportional hazard models (8, 9, 10)
  3. Dose-response analysis (11, 12)
  4. Sensitivity analysis (13, 14)
  5. Propensity scores (15, 16, 17)
  6. Machine learning (18, 19, 20)
  7. Time series (21, 22)

High-impact designs

  1. Evaluate racial and ethnic disparities in care (23)

  2. Construct and validate a predictive nomogram (24)

  3. Identify the affecting features of persistent acute kidney injury (pAKI) for patients in intensive care units (25)

  4. Evaluate the predictive value of a biomarker for disease prognosis and in-hospital mortality (26, 27, 28, 29, 30, 31)

  5. Measure the probability of invasive ventilation after meeting physiologic thresholds (32)

Data dictionary

To access the MIMIC-IV data dictionary, click here

Variable categories

  1. Patient demographics (e.g., age group, sex, race, marital status, insurance type)
  2. Diagnosis (ICD codes)
  3. Prescription (e.g., drug type and dose)
  4. Hospital and ICU information (e.g., admission, discharge, and death time)