HCUP SID Healthcare Cost and Utilization Project, State Inpatient Database - onetomapanalytics/Meta_Data GitHub Wiki
HCUP SID - Healthcare Cost and Utilization Project, State Inpatient Database
General description
- Database primary purpose - Provide inpatient discharge records from non-federal hospitals in individual participating states; since it encompasses all patients, SID provides a unique view of inpatient care in a defined market or state over time.
- Overall data type - Health outcomes
- Dataset type - Cross-sectional
- Data source - Claims
- Data level - Patient level
- Geographic location of the data collection sites - California, Florida, Iowa, Maryland, Massachusetts, New York, and Washington
- Sponsor, manager, or home institution - Agency for Healthcare Research and Quality (AHRQ)
- Date range - California: 2003 - 2011, Florida: 2004 - 2019, Iowa: 2009 - 2014, Maryland: 2009 - 2017, Massachusetts: 2012 - 2014, New York: 2009 - 2014, North Carolina: 2010 - 2015, Washington: 2009 - 2015, and Wisconsin: 2013 - 2015
- Geolocation data - Hospital's state postal code, patient's state postal code, zip code, state/county FIPS code
- Dates - Admission and discharge hour, day, month, and year; days to event
- Hospital identifiers - HCUP-specific hospital identifier (Medicare Provider ID) and AHA hospital identifier (may vary by state and data organization)
- Physician identifiers - HCUP provides de-identified physician identifiers, which can be used to distinguish between physicians
- Longitudinal tracking - Track patients within and across hospitals (up to one year), track hospitals
- Financial variables - Contains charge information and provides supplemental files containing cost-to-charge ratios
- Clinical areas of interest - all
- Variables that are uniquely present in this dataset - Includes the universe of the inpatient discharge abstracts from participating States, translated into a standard format to permit multistate comparisons and analyses. It also includes a core set of clinical and nonclinical information on all visits, regardless of the expected payer, including but not limited to Medicare, Medicaid, private insurance, self-pay, or those billed as 'no charge'.
- Database caveats and limitations - Not all data elements are available from every state or in every year. Also, patients cannot be followed longitudinally, except with regard to readmissions.
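The readmission tracking mentioned above can be sketched roughly as follows. This is a minimal illustration, not HCUP's own method: the field names (`patient_link`, `days_to_event`, `los`, `record_id`) are hypothetical stand-ins for the de-identified patient linkage and days-to-event variables, and it assumes that days to event counts days from a patient-specific reference date to admission, so that the gap between two stays is the next admission day minus the index discharge day. Check the data dictionary for the actual variable names and definitions in each state and year.

```python
from collections import defaultdict


def flag_30day_readmissions(records, window=30):
    """Return ids of index stays followed by another admission within `window` days.

    Field names are hypothetical; days_to_event is assumed to be the
    admission day relative to a patient-specific reference date.
    """
    by_patient = defaultdict(list)
    for rec in records:
        by_patient[rec["patient_link"]].append(rec)

    flagged = set()
    for stays in by_patient.values():
        # Order each patient's stays chronologically.
        stays.sort(key=lambda r: r["days_to_event"])
        for index_stay, next_stay in zip(stays, stays[1:]):
            discharge_day = index_stay["days_to_event"] + index_stay["los"]
            gap = next_stay["days_to_event"] - discharge_day
            if 0 <= gap <= window:
                flagged.add(index_stay["record_id"])
    return flagged


# Toy records, invented purely for illustration.
records = [
    {"record_id": 1, "patient_link": "A", "days_to_event": 100, "los": 5},
    {"record_id": 2, "patient_link": "A", "days_to_event": 120, "los": 3},
    {"record_id": 3, "patient_link": "A", "days_to_event": 200, "los": 2},
    {"record_id": 4, "patient_link": "B", "days_to_event": 50, "los": 1},
]
readmitted_index_stays = flag_30day_readmissions(records)
```

In this toy data only stay 1 is followed by a readmission within 30 days. A real analysis would also need to handle transfers, same-day stays, and records with missing linkage values.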
Applicable methods
- Association methods, such as multivariable analysis (1, 2, 3), logistic regression models (4, 5, 6, 7), generalized linear mixed-effect models (8, 9, 10), Poisson regression (11)
- Descriptive statistics (12, 13)
- Exploratory analysis (14)
- Interrupted time series (15)
- Machine learning (16, 17)
- Time to event (18)
- Propensity score (19, 12, 20, 21)
High impact designs
- Enrich the SID dataset through linkage to other datasets, such as Medicare inpatient claims data (22), AHA (23, 24, 25), NY SPARCS (24), SEDD (26), CMS Hospital Compare (27)
- Evaluate quality and cost of surgery at safety-net hospitals (28, 20)
- Assess the effects of different geographic measures of socioeconomic status and deprivation on surgical outcomes (30)
- Develop a systematic approach to detect surgical access disparities (31)
- Assess between-hospital variation in interventions (32)
- Develop a scale to predict readmission rates (33)
- Assess socioeconomic disparities (37)
- Compare outcomes between states that did and did not implement Medicaid expansion (38)
- Develop a method to delineate hospital service areas (HSAs) and hospital referral regions (HRRs) (39)
- Compare methods between cohorts over time (40)
- Investigate clinical features, management strategies, and outcomes (41)
- Evaluate outcomes related to Medicare's Nonpayment Program (42)
- Assess factors associated with the length of readmission following a procedure (43)
Data dictionary
To access the HCUP SID data dictionary, click here
Variable categories
- Patient demographics [e.g., age, sex, race, ethnicity, language, residence indicator (i.e., homeless), marital status]
- Hospital discharge records (e.g., primary discharge diagnosis, dates of admission and discharge, LOS, patient discharge status)
- Charges (expected payer, total charges)
- Injury information (i.e., type and intent)
- Diagnosis codes
- Procedure codes
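Because the SID reports charges rather than costs, analyses of cost typically multiply total charges by a hospital-level cost-to-charge ratio from the supplemental files. A minimal sketch of that arithmetic, assuming hypothetical field names (`hospital_id`, `year`, `total_charges`) and a ratio lookup keyed by hospital and year:

```python
def estimate_costs(discharges, ccr_by_hospital_year):
    """Add an estimated cost (total charges x cost-to-charge ratio) to each record.

    Field names and the lookup layout are illustrative assumptions,
    not the actual HCUP variable names.
    """
    estimates = []
    for rec in discharges:
        ccr = ccr_by_hospital_year.get((rec["hospital_id"], rec["year"]))
        charges = rec["total_charges"]
        # Leave the cost missing when either input is unavailable.
        if ccr is None or charges is None:
            cost = None
        else:
            cost = round(charges * ccr, 2)
        estimates.append({**rec, "estimated_cost": cost})
    return estimates


# Toy values, invented purely for illustration.
ccr_by_hospital_year = {("H001", 2014): 0.5, ("H002", 2014): 0.25}
discharges = [
    {"record_id": 1, "hospital_id": "H001", "year": 2014, "total_charges": 10000},
    {"record_id": 2, "hospital_id": "H002", "year": 2014, "total_charges": 2000},
    {"record_id": 3, "hospital_id": "H003", "year": 2014, "total_charges": 7500},
]
costed = estimate_costs(discharges, ccr_by_hospital_year)
```

Records whose hospital has no ratio for that year keep a missing cost rather than a guessed one, which makes it easier to report how many discharges were excluded from a cost analysis.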
Linkage to other datasets
- The SID can be linked to hospital-level data from the American Hospital Association's Annual Survey of Hospitals and county-level data from the Bureau of Health Professions' Area Resource File, except in those States that do not allow the release of hospital identifiers.
- The SID can also be linked to social determinants of health data using patient ZIP codes (e.g., Distressed Communities Index Data).
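Both kinds of linkage amount to a left join that keeps every discharge record and attaches attributes where a key matches. The sketch below assumes hypothetical field names (`aha_id`, `patient_zip`, `bed_count`, `teaching`, `distress_score`) and toy lookup tables. It is only meant to show the shape of the join, including what happens when a state withholds the hospital identifier.

```python
def left_join(records, lookup, key, fields):
    """Attach `fields` from lookup[record[key]] to each record.

    Records with a missing or unmatched key are kept, with the
    attached fields set to None.
    """
    joined = []
    for rec in records:
        match = lookup.get(rec.get(key), {})
        joined.append({**rec, **{f: match.get(f) for f in fields}})
    return joined


# Toy data, invented purely for illustration. ZIP codes are kept as
# strings so that leading zeros are preserved.
discharges = [
    {"record_id": 1, "aha_id": "H001", "patient_zip": "10001"},
    {"record_id": 2, "aha_id": None, "patient_zip": "98101"},  # identifier withheld
]
aha_survey = {"H001": {"bed_count": 250, "teaching": True}}
zip_sdoh = {"10001": {"distress_score": 12.5}, "98101": {"distress_score": 40.0}}

linked = left_join(discharges, aha_survey, "aha_id", ["bed_count", "teaching"])
linked = left_join(linked, zip_sdoh, "patient_zip", ["distress_score"])
```

The second record keeps its ZIP-level attributes even though no hospital-level match is possible, mirroring the case where hospital identifiers are not released.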