Default Demographic Data - synthetichealth/synthea GitHub Wiki

Default US Census Data

By default, Synthea™ contains publicly available demographic data obtained from the US Census Bureau. The data was post-processed to create population input data for every place (town and city) in the United States. This post-processed data can be used with Synthea to generate representative populations.

Data Sets

The census data files used to assemble the default demographics file (src/main/resources/geography/demographics.csv) are as follows:

| File Set | Description |
| --- | --- |
| sub-est2015* | Subcounty population estimates |
| cc-est2015-alldata* | County population demographic distributions (gender, race, age groups) |
| ACS_14_5YR_S1501* | Education level attainment distributions |
| ACS_14_5YR_S1901* | Income level distributions |

Subcounty Population Estimates

| Property | Detail |
| --- | --- |
| Files | sub-est2015* |
| Source | US Census Bureau |
| Year | 2010 Census |
| URL | https://www2.census.gov/programs-surveys/popest/datasets/2010-2015/cities/totals/ |

County Population Demographics

| Property | Detail |
| --- | --- |
| Files | cc-est2015-alldata* |
| Source | US Census Bureau |
| Year | 2010 Census |
| URL | https://www2.census.gov/programs-surveys/popest/datasets/2010-2015/counties/asrh/ |

Education Level Distributions

| Property | Detail |
| --- | --- |
| Files | ACS_14_5YR_S1501* |
| Source | US Census Bureau |
| Year | 2010-2014 American Community Survey 5-Year Estimates |
| URL | https://factfinder.census.gov/bkmk/table/1.0/en/ACS/14_5YR/S1501/0100000US.16000.004 |
| URL2 | https://factfinder.census.gov/bkmk/table/1.0/en/ACS/14_5YR/S1501/0400000US01.06000 |

Income Level Distributions

| Property | Detail |
| --- | --- |
| Files | ACS_14_5YR_S1901* |
| Source | US Census Bureau |
| Year | 2010-2014 American Community Survey 5-Year Estimates |
| URL | https://factfinder.census.gov/bkmk/table/1.0/en/ACS/14_5YR/S1901/0100000US.16000.004 |
| URL2 | https://factfinder.census.gov/bkmk/table/1.0/en/ACS/14_5YR/S1901/0400000US01.06000 |

SDoH Data

The SDoH (social determinants of health) file contains county-level data derived from the 2018 Social Vulnerability Index and the 2021 Community Health Rankings. The file has the following headers:

FIPS_CODE,
COUNTY_CODE,
COUNTY,
ST,
STATE,
FOOD_INSECURITY,
SEVERE_HOUSING_COST_BURDEN,
UNEMPLOYED,
NO_VEHICLE_ACCESS,
UNINSURED

The first five columns (FIPS_CODE,COUNTY_CODE,COUNTY,ST,STATE) are required and fixed.
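As an illustration of the fixed-column rule, the following Python sketch checks a header row and returns the attribute columns that follow it. This is not Synthea's own loader: the function name `split_sdoh_header` and the sample header are hypothetical, and the sketch assumes that "fixed" means the five columns must appear first and in this order.

```python
import csv
import io

# The five leading columns named in the text, in the order listed there.
REQUIRED_COLUMNS = ["FIPS_CODE", "COUNTY_CODE", "COUNTY", "ST", "STATE"]


def split_sdoh_header(header):
    """Return the attribute columns that follow the five fixed columns.

    Raises ValueError if the header does not start with the fixed columns
    (assumption: they must come first and in this exact order).
    """
    names = [name.strip() for name in header]
    leading = names[:len(REQUIRED_COLUMNS)]
    if leading != REQUIRED_COLUMNS:
        raise ValueError(
            f"header must start with {REQUIRED_COLUMNS}, got {leading}"
        )
    return names[len(REQUIRED_COLUMNS):]


# Hypothetical header with two attribute columns, for demonstration only.
sample = "FIPS_CODE,COUNTY_CODE,COUNTY,ST,STATE,FOOD_INSECURITY,UNINSURED\n"
header = next(csv.reader(io.StringIO(sample)))
attribute_columns = split_sdoh_header(header)
print(attribute_columns)
```

Everything after the fifth column is treated as an attribute column, so a file with extra columns passes the same check unchanged.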

Each remaining column (FOOD_INSECURITY,SEVERE_HOUSING_COST_BURDEN,UNEMPLOYED,NO_VEHICLE_ACCESS,UNINSURED), along with any column added in the future, becomes a patient attribute that is assigned according to the probability listed in the corresponding table cell.
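One way to picture the probability-based assignment is the Python sketch below. It is only a sketch under stated assumptions, not Synthea's implementation: it assumes each cell holds a probability between 0 and 1 and that each attribute is a true/false flag drawn independently. The function name `assign_sdoh_attributes` and all row values are made up for illustration.

```python
import random

# Identification columns; these are not turned into patient attributes.
FIXED_COLUMNS = {"FIPS_CODE", "COUNTY_CODE", "COUNTY", "ST", "STATE"}


def assign_sdoh_attributes(county_row, rng):
    """Draw one boolean flag per non-fixed column of a county row.

    Assumption: each cell is a probability in [0, 1], and a patient gets
    the attribute when a uniform random draw falls below that value.
    """
    attributes = {}
    for column, cell in county_row.items():
        if column in FIXED_COLUMNS:
            continue
        probability = float(cell)
        attributes[column] = rng.random() < probability
    return attributes


# Hypothetical county row; the values are NOT taken from the real file.
row = {
    "FIPS_CODE": "99999",
    "COUNTY_CODE": "999",
    "COUNTY": "Example County",
    "ST": "XX",
    "STATE": "Example State",
    "FOOD_INSECURITY": "0.12",
    "UNINSURED": "0.08",
}

patient_attributes = assign_sdoh_attributes(row, random.Random(42))
print(patient_attributes)
```

Because the function skips only the five fixed columns, any column added to the file later is picked up without a code change, which matches the behavior described above.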
