Texas Hospital Discharge Data - onetomapanalytics/Meta_Data GitHub Wiki

Texas Hospital Discharge Data

General description

  1. Database primary purpose - Collect data and report on healthcare activity in Texas hospitals and health maintenance organizations. The goal is to provide information that will enable consumers to have an impact on the cost and quality of health care in Texas.
  2. Overall data type - Health outcomes
  3. Dataset type - Cross-sectional
  4. Data source - Claims
  5. Data level - Patient level
  6. Geographic location of the data collection sites - Texas (U.S.A.)
  7. Sponsor, manager, or home institution - Texas Department of State Health Services (DSHS)
  8. Date range - Hospital inpatient discharge data: 2006 - 2012; Ambulatory patient data: 2009 - 2013
  9. Geolocation data - Patient’s country (suppressed if fewer than 5 patients from one country); county Federal Information Processing Standard (FIPS) codes based on patient ZIP code; state of the patient’s mailing address in Texas and contiguous states; patient’s ZIP code (some digits or the entire ZIP code are suppressed if the ZIP code includes few patients, if the patient is from a foreign country, for patients with ICD codes indicating drug or alcohol use or an HIV diagnosis, and if a hospital has few patients of a particular gender); public health region
  10. Dates - Year and quarter of discharge, weekday of admission, day of surgical or other procedures (calculated variable)
  11. Hospital identifiers - Provider ID (unique identifier assigned by DSHS) and hospital name (hospitals with fewer than 50 discharges are assigned the name ‘Low Discharge Volume Hospital’; if a hospital has fewer than 5 discharges of a particular gender, "Hospital Name" is blank)
  12. Physician identifiers - Attending physician and operating or other physician uniform identifiers (unique identifiers assigned by DSHS)
  13. Longitudinal tracking - Track providers in Texas hospitals through DSHS identifiers and hospital names
  14. Financial variables - payment source, type of bill, charges (see "Variable categories" section)
  15. Clinical areas of interest - all
  16. Number of records - on average, 560 hospitals per quarter (in 2012). For the number of hospitals per quarter per year, click here
  17. Variables that are uniquely present in this dataset - Discharge data from Texas hospitals, including detailed charges
  18. Database caveats and limitations - (1) Because the data are collected through billing forms, they are administrative and not clinical. (2) Records with some MDC and Patient Status codes (e.g., newborns/neonates with conditions originating in the perinatal period, alcohol/drug-induced organic mental disorders, burns, discharged/transferred to inpatient rehabilitation) may be ungroupable since they are not available for all years and dataset versions. (3) Gender is suppressed for some patients (e.g., with an ICD-9-CM code that indicates drug or alcohol use or an HIV diagnosis). (4) Depending on hospitals’ collection and billing cycles, not all discharges may have been billed or reported. This can affect the accuracy of the source of payment data, particularly for self-pay and charity cases that may later qualify for Medicaid or other payment sources. (5) There is a limit on the number of codes that can be submitted (up to 25 diagnosis codes, 25 procedure codes, and 10 E-codes), so sicker patients and the hospitals that treat them may not be accurately represented in the data. This may also leave total volume and percentage calculations for diagnoses and procedures incomplete. (6) Race and ethnicity data are generally not collected by hospitals and may be subjectively captured. (7) Comparability of length of stay (LOS) across hospitals is affected by factors such as case mix and severity complexity, payer mix, market areas, and hospital ownership, affiliation, or teaching status. (8) Any conclusions drawn from the data are subject to errors caused by the inability of the hospital to communicate complete data due to form constraints, subjectivity in the assignment of codes, system mapping, and normal clerical errors.
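
Caveat (5) can be checked for directly: a record that fills every available diagnosis slot may have had further codes cut off. The sketch below flags such records and reports their share. The key name `diagnosis_codes` and the list-of-dicts record layout are hypothetical and not taken from the DSHS data dictionary; only the limit of 25 comes from the caveat above.

```python
# Maximum number of diagnosis codes per record, as stated in caveat (5).
MAX_DIAGNOSIS_CODES = 25


def may_be_truncated(record, limit=MAX_DIAGNOSIS_CODES):
    """Return True if the record fills every diagnosis slot.

    `record` is a dict with a hypothetical key `diagnosis_codes`
    holding a list of code strings; empty strings are ignored.
    """
    codes = [c for c in record.get("diagnosis_codes", []) if c]
    return len(codes) >= limit


def truncation_share(records, limit=MAX_DIAGNOSIS_CODES):
    """Fraction of records that reach the diagnosis-code limit."""
    records = list(records)
    if not records:
        return 0.0
    flagged = sum(may_be_truncated(r, limit) for r in records)
    return flagged / len(records)
```

A high share for a given hospital would suggest that its diagnosis volumes are more likely to be understated; it does not prove that any individual record lost codes.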

Applicable methods

  1. Association analysis, such as multivariable linear regression models (1), logistic regression models (2), chi-square (3), Poisson regression (4)
  2. Bivariate analysis (3)
  3. Exploratory analysis (5)
  4. Descriptive analysis (6)
  5. Geospatial analysis (7)
  6. Sensitivity analysis (5)
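
As a small illustration of the chi-square test listed in item 1, the sketch below computes the Pearson chi-square statistic for a contingency table in plain Python. The counts are made up for illustration and are not drawn from the dataset; the row and column meanings are likewise only an assumed example. In practice a statistics library would also supply the degrees of freedom and p-value.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a table of observed counts.

    `table` is a list of rows, each a list of non-negative counts.
    Expected counts are row_total * column_total / grand_total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    if grand_total == 0:
        raise ValueError("table has no observations")

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected > 0:  # skip cells in empty rows or columns
                statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical counts: rows = admission type (emergency, elective),
# columns = patient status (left against medical advice, other).
observed = [[30, 970], [10, 990]]
print(round(chi_square_statistic(observed), 3))
```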

High-impact designs

  • Combine with other datasets to assess the endemicity of diseases (8)

  • Evaluate the Texas general surgery workforce at both the state and local levels (9)

  • Compare the costs of physician-owned single specialty hospitals with those of full-service hospital competitors (10)

Data dictionary

To access the Texas Hospital Inpatient Discharge data dictionary, click here

Variable categories

  1. Patient demographics (i.e., age, sex, race, ethnicity, homeless, unemployed, student)
  2. Admission [type (i.e., emergency, urgent, elective, newborn, trauma center), source]
  3. Diagnosis [e.g., admitting, principal, other, additional external cause of injury, Major Diagnostic Category (MDC), Diagnosis Related Group (DRG), severity of illness, and risk of mortality scores]
  4. Surgery and procedures (e.g., principal surgical or other procedure performed during the period covered by the bill)
  5. Specialty units (e.g., coronary, pediatric, detoxification, psychiatric, intensive care, rehabilitation, hospice, sub-acute care, nursery, skilled nursing, obstetric, acute care, oncology)
  6. Facility type (e.g., teaching, psychiatric, rehabilitation, acute care, pediatric)
  7. Patient status (e.g., admitted as inpatient, left against medical advice, discharged or transferred to)
  8. Claim condition (i.e., military service-related, disability, abortion, sterilization, disaster-related)
  9. Financial [e.g., payment source, type of bill (i.e., type of facility, care, and claim), total charges (covered and non-covered)]
  10. Charges [e.g., accommodation charge by unit of care, ancillary service charge (e.g., pharmacy, supplies, equipment, physical therapy, OR, laboratory, radiology, ambulance)]

Linkage to other datasets

  • Linkages can be established with any dataset that includes ZIP codes
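
A minimal sketch of such a linkage, assuming both datasets are loaded as lists of dicts. The key names `patient_zip` and `zip` are hypothetical. Because some or all ZIP code digits may be suppressed in the discharge data, records without a complete five-digit ZIP code are set aside rather than matched.

```python
def is_full_zip(value):
    """True only for a complete five-digit ZIP code string."""
    return (
        isinstance(value, str)
        and len(value) == 5
        and value.isascii()
        and value.isdigit()
    )


def link_by_zip(discharges, external,
                discharge_key="patient_zip", external_key="zip"):
    """Left-join discharge records to an external dataset on ZIP code.

    Returns (linked, unlinked). Records whose ZIP code is suppressed,
    partial, or absent from the external data go to `unlinked`.
    """
    lookup = {
        row[external_key]: row
        for row in external
        if is_full_zip(row.get(external_key))
    }
    linked, unlinked = [], []
    for record in discharges:
        zip_code = record.get(discharge_key)
        match = lookup.get(zip_code) if is_full_zip(zip_code) else None
        if match is None:
            unlinked.append(record)
        else:
            # External fields are added; discharge fields win on name clashes.
            linked.append({**match, **record})
    return linked, unlinked
```

Keeping the unlinked records separate makes it possible to report how much of the data was lost to suppression, which matters because suppression is not random (it depends on diagnosis, country, and hospital volume).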