MIMIC-IV - Medical Information Mart for Intensive Care
General description
- Database primary purpose - Retrospectively collect medical data to improve patient care through knowledge discovery and algorithm development
- Overall data type - Health outcomes
- Dataset type - Longitudinal
- Data source - A custom hospital-wide EHR and an ICU-specific clinical information system
- Data level - Patient level
- Geographic location of the data collection sites - Beth Israel Deaconess Medical Center (BIDMC) in Boston, MA, USA
- Sponsor, manager, or home institution - BIDMC
- Date range - 2008 - 2019
- Physician identifiers - Deidentified integer identifiers
- Longitudinal tracking - Patients can be tracked across years and providers across modules, both using deidentified integer identifiers
- Clinical areas of interest - All
- Variables that are uniquely present in this dataset - MIMIC-IV is a relational database containing real hospital stays for patients admitted to the BIDMC, a teaching hospital of Harvard Medical School. The dataset is separated into “modules” to reflect the provenance of the data: hospital (labs, microbiology, and electronic medication administration), ICU data, emergency department, lookup tables and metadata from MIMIC-CXR, and deidentified free-text clinical notes
- Database caveats and limitations - (1) all patients over 89 have been grouped together into a single group with a value of 91; (2) the maximum time of follow-up for each patient is exactly one year after their last hospital discharge; (3) data are collected during routine clinical practice and reflect the idiosyncrasies of that practice, meaning implausible values may be present in the database as an artifact of the archival process; (4) MIMIC-Note is currently not publicly available, and the structure is subject to change; (5) all patients across all datasets are in the hosp module; however, not all ICU patients have ED data, not all ICU patients have CXRs, not all ED patients have hospital data, and so on; (6) within an individual module, there are also incomplete tables as certain electronic systems did not exist in the past, particularly the eMAR system.
- Other - Dates and times were shifted randomly into the future using an offset measured in days. A single date shift was assigned to each subject_id. As a result, the data for a single patient are internally consistent. For example, if the time between two measures in the database was 4 hours in the raw data, then the calculated time difference in MIMIC-IV will also be 4 hours. Conversely, distinct patients are not temporally comparable. That is, two patients admitted in 2130 were not necessarily admitted in the same year.
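The date-shifting behavior described in the last item can be illustrated with a minimal sketch. It assumes one fixed whole-day offset per subject_id, as stated above. The events, the offset values, and the helper name `shift_events` are hypothetical and chosen only for illustration; the real offsets are not published.

```python
from datetime import datetime, timedelta

# Hypothetical raw events: (subject_id, event name, real timestamp).
raw_events = [
    (1, "measure_1", datetime(2015, 3, 1, 8, 0)),
    (1, "measure_2", datetime(2015, 3, 1, 12, 0)),
    (2, "admission", datetime(2015, 3, 1, 9, 0)),
]

# Hypothetical per-subject offsets in whole days (illustrative values only).
offsets_days = {1: 42000, 2: 45650}


def shift_events(events, offsets):
    """Apply one fixed day offset per subject_id to every timestamp."""
    return [
        (sid, name, ts + timedelta(days=offsets[sid]))
        for sid, name, ts in events
    ]


shifted = shift_events(raw_events, offsets_days)

# Within subject 1, the 4-hour gap between measures survives the shift.
gap_raw = raw_events[1][2] - raw_events[0][2]
gap_shifted = shifted[1][2] - shifted[0][2]

# Across subjects, the same real calendar day maps to different shifted dates.
same_day_raw = raw_events[0][2].date() == raw_events[2][2].date()
same_day_shifted = shifted[0][2].date() == shifted[2][2].date()

print("gap preserved:", gap_raw == gap_shifted)
print("same day before shift:", same_day_raw, "| after shift:", same_day_shifted)
```

Because each subject gets a different offset, analyses that compare calendar dates between patients (for example, grouping admissions by shifted year) are not meaningful, while within-patient intervals remain valid.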
Applicable methods
- Exploratory analysis (1)
- Association methods, such as logistic regression (2, 3, 4), generalized linear model (5, 6, 7), Cox proportional hazard models (8, 9, 10)
- Dose-response analysis (11, 12)
- Sensitivity analysis (13, 14)
- Propensity scores (15, 16, 17)
- Machine learning (18, 19, 20)
- Time series (21, 22)
High-impact designs
- Evaluate racial and ethnic disparities in care (23)
- Construct and validate a predictive nomogram (24)
- Identify the affecting features of persistent acute kidney injury (pAKI) for patients in intensive care units (25)
- Evaluate the predictive value of a biomarker on disease prognosis and in-hospital mortality (26, 27, 28, 29, 30, 31)
- Measure the probability of invasive ventilation after meeting physiologic thresholds (32)
Data dictionary
To access the MIMIC-IV data dictionary, click here
Variable categories
- Patient demographics (e.g., age group, sex, race, marital status, insurance type)
- Diagnosis (ICD codes)
- Prescription (e.g., drug type and dose)
- Hospital and ICU information (e.g., admission, discharge, and death time)
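As a rough sketch of how the admission, discharge, and death times listed above might be used, the example below derives a length of stay and an in-hospital death flag from toy records. The field names (`hadm_id`, `admittime`, `dischtime`, `deathtime`), the values, and the helper `summarize_stay` are assumptions for illustration and should be checked against the data dictionary before use.

```python
from datetime import datetime

# Toy hospital stays; field names and values are hypothetical.
stays = [
    {
        "hadm_id": 100,
        "admittime": datetime(2130, 5, 1, 10, 0),
        "dischtime": datetime(2130, 5, 4, 22, 0),
        "deathtime": None,
    },
    {
        "hadm_id": 101,
        "admittime": datetime(2131, 1, 10, 6, 30),
        "dischtime": datetime(2131, 1, 12, 6, 30),
        "deathtime": datetime(2131, 1, 12, 6, 30),
    },
]


def summarize_stay(stay):
    """Return length of stay in days and whether a death time was recorded."""
    los = stay["dischtime"] - stay["admittime"]
    return {
        "hadm_id": stay["hadm_id"],
        "los_days": los.total_seconds() / 86400,
        "died_in_hospital": stay["deathtime"] is not None,
    }


summaries = [summarize_stay(s) for s in stays]
for row in summaries:
    print(row)
```

Both timestamps in each record belong to the same patient, so the difference is unaffected by the date shift applied during deidentification.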