Census Data Collection - stlrda/211Dashboard-Workflows GitHub Wiki

The following information describes how one should go about collecting Census data for the 211Dashboard project. Ideally, this process will be automated in the future; however, at this time, the Census Bureau's API does not include all of the tables that are of interest to this project.

Search Category Table

| Search Term | Census Table Name | Folder Name |
| --- | --- | --- |
| total population | Total Population | total_population |
| disability characteristics | Disability Characteristics | disability_characteristics |
| educational attainment | Educational Attainment for the Population 25 Years and Over | educational_attainment |
| age and sex | Age and Sex | age_and_sex |
| employment status | Employment Status | employment_status |
| hispanic or latino origin | Hispanic or Latino Origin | hispanic_or_latino_origin |
| health insurance coverage status | Types of Health Insurance Coverage By Age | health_insurance_coverage_status |
| tenure | Tenure | housing_tenure |
| income inequality | Gini Index of Income Inequality | income_inequality |
| internet subscriptions | Internet Subscriptions in Household | internet_subscription |
| median household income | Median Household Income in the Past 12 Months (...) | median_household_income |
| per capita income | Per Capita Income in the Past 12 Months (...) | per_capita_income |
| poverty status | Poverty Status in the Past 12 Months | poverty_status |
| race | Race | race |
| means of transportation to work | Means of Transportation to Work | means_of_transportation_to_work |

(Note: Hispanic or Latino Origin is how the Census Bureau defines ethnicity, which is a different attribute from Race.)
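
If you script any part of the collection, it may help to keep the Search Category Table as a lookup structure. The following is a minimal sketch, not part of the project's code: the names SEARCH_CATEGORIES and folder_for are hypothetical, and the entries are copied from the table above (including the elided "(...)" in two table names).

```python
# Search Term -> (Census Table Name, Folder Name), copied from the
# Search Category Table. SEARCH_CATEGORIES and folder_for are
# illustrative names, not part of the 211Dashboard code base.
SEARCH_CATEGORIES = {
    "total population": ("Total Population", "total_population"),
    "disability characteristics": ("Disability Characteristics", "disability_characteristics"),
    "educational attainment": (
        "Educational Attainment for the Population 25 Years and Over",
        "educational_attainment",
    ),
    "age and sex": ("Age and Sex", "age_and_sex"),
    "employment status": ("Employment Status", "employment_status"),
    "hispanic or latino origin": ("Hispanic or Latino Origin", "hispanic_or_latino_origin"),
    "health insurance coverage status": (
        "Types of Health Insurance Coverage By Age",
        "health_insurance_coverage_status",
    ),
    "tenure": ("Tenure", "housing_tenure"),
    "income inequality": ("Gini Index of Income Inequality", "income_inequality"),
    "internet subscriptions": ("Internet Subscriptions in Household", "internet_subscription"),
    "median household income": (
        "Median Household Income in the Past 12 Months (...)",
        "median_household_income",
    ),
    "per capita income": ("Per Capita Income in the Past 12 Months (...)", "per_capita_income"),
    "poverty status": ("Poverty Status in the Past 12 Months", "poverty_status"),
    "race": ("Race", "race"),
    "means of transportation to work": (
        "Means of Transportation to Work",
        "means_of_transportation_to_work",
    ),
}


def folder_for(search_term: str) -> str:
    """Return the folder name for a search term (case-insensitive)."""
    return SEARCH_CATEGORIES[search_term.strip().lower()][1]
```

Note that the folder name does not always match the search term (for example, "tenure" maps to housing_tenure), so looking names up is safer than deriving them.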

You will need to use the Search Category Table above to assist with the downloading and storing of census data files. The census data for this project should be collected at the "Census Tract" and "County" level. The following steps describe the data collection process.

Download Steps

First, create two folders: census_tract and census_county.

Use the steps outlined below to populate these two folders with the corresponding data.
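
The two folders can be created by hand, but if you prefer to script it, a sketch along these lines should work. The function name create_level_folders and the base_dir parameter are assumptions made for illustration.

```python
from pathlib import Path

# Folder names as given in this guide (all lowercase).
LEVELS = ("census_tract", "census_county")


def create_level_folders(base_dir="."):
    """Create census_tract and census_county under base_dir; return their paths."""
    paths = []
    for level in LEVELS:
        path = Path(base_dir) / level
        path.mkdir(parents=True, exist_ok=True)  # no error if it already exists
        paths.append(path)
    return paths
```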

  1. Navigate to data.census.gov.
  2. Click Advanced Search below the search bar.
  3. In the BROWSE FILTERS section, choose the appropriate filters.
    • Years - select most recent (e.g. "2018")
    • Geography - select ALL "Tract" OR "County" WITHIN (STATE) "Missouri" AND "Illinois"
  4. For each "Search Term" listed in the Search Category Table above, follow steps 5-16.
  5. Confirm your Selected Filters are correct (listed at the bottom of the page).
  6. Type "Search Term" from the Search Category Table into the advanced search bar.
  7. Click SEARCH button.
  8. Click on the appropriate table (see "Census Table Name" in the Search Category Table).
  9. Click Download (located near the top on the left side bar).
  10. Click to check the box next to the table you wish to download.
  11. Once the appropriate table is selected, click Download Selected (1).
  12. Under the "Select Table Vintages" header be sure to check only the 5-Year option for the date you've selected.
  13. Click Download (bottom right).
  14. After your files are finished being prepared, click Download Now.
  15. On your local machine, follow these steps:
    • Unzip the newly downloaded folder (found in your Downloads folder).
    • Rename the unzipped folder to the corresponding "Folder Name" in the Search Category Table (e.g., the data folder from the Hispanic or Latino Origin census table should be named hispanic_or_latino_origin).
    • Next, move the renamed folder to either the census_tract or the census_county folder (depending on which level the data corresponds to).
  16. Finally, back on your web browser, use your browser's "back" navigation button to return to the Advanced Search page without clearing your selected filters.
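
The local part of step 15 (unzip, rename, move) can be sketched as a single function that extracts the archive straight into its final location. This is a hedged illustration only: stash_download is a hypothetical helper, and it assumes the downloaded zip holds its files at the top level. If the archive contains its own top-level folder, the files will end up one level deeper than intended.

```python
import zipfile
from pathlib import Path


def stash_download(zip_path, folder_name, level_dir):
    """Extract zip_path into level_dir/folder_name and return that path.

    folder_name is the "Folder Name" from the Search Category Table;
    level_dir is the census_tract or census_county folder.
    """
    target = Path(level_dir) / folder_name
    if target.exists():
        # Refuse to mix a new download with an older one.
        raise FileExistsError(f"{target} already exists; remove it first")
    target.mkdir(parents=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target
```

For example, a tract-level download of the Race table might be stored with stash_download(downloaded_zip, "race", "census_tract"), where downloaded_zip is the path to the file in your Downloads folder.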

After all zip files have been downloaded from the census data site, follow these next steps to add the additional community conditions data to the census_tract and census_county folders.

Additional Community Conditions Data

The 211Dashboard project will also include the following community conditions scores:

  • Social Vulnerability Index (SVI)
  • Medically Underserved Area (MUA) Score

Social Vulnerability Index (SVI)

Social vulnerability refers to a number of factors (including poverty, lack of access to transportation, and crowded housing) that may weaken a community’s ability to prevent human suffering and financial loss in a disaster. The metric is provided by the Centers for Disease Control and Prevention (CDC) and can be accessed at svi.cdc.gov.

While the downloadable .csv file has many attributes, we are only interested in "RPL_THEMES" which gives an overall score for the area of interest.

To download the data, navigate to svi.cdc.gov. Select data > choose the most recent Year and appropriate Geography (e.g. United States) > choose the Geography Type (Census Tracts or Counties) and File Type (CSV File) > click Go.

Do this for both Geography Types ("Census Tracts" and "Counties"). Place each .csv file in its own folder named svi, then add the tract-level svi folder to census_tract and the county-level svi folder to census_county.
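
To check that a downloaded SVI file contains the column of interest, you could read just "RPL_THEMES" with Python's csv module. This is a sketch rather than the project's loading script: read_svi_scores is a hypothetical name, and the identifier column "FIPS" is an assumption about the CSV layout that you should confirm against the file's header row.

```python
import csv


def read_svi_scores(csv_file, id_column="FIPS", score_column="RPL_THEMES"):
    """Map each area identifier to its overall SVI score.

    csv_file is an open text file (or any iterable of lines) that
    starts with a header row. id_column is an assumed column name.
    """
    reader = csv.DictReader(csv_file)
    return {row[id_column]: float(row[score_column]) for row in reader}
```

When reading from disk, open the file with newline="" as the csv module documentation recommends, then pass the file object to read_svi_scores.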

Medically Underserved Areas

Medically Underserved Areas/Populations are areas or populations designated by the Health Resources and Services Administration (HRSA) as having too few primary care providers, high infant mortality, high poverty or a high elderly population.

To download the MUA data, navigate to data.hrsa.gov. Select "Shortage Areas", then, under the heading "Medically Underserved Areas/Populations (MUA/P)", select MUA/P-CSV (the file should download immediately).

Next, create a folder named mua in the census_tract and census_county folders. Add the downloaded file (MUA_DET.csv) to the mua folder in both the census_tract and census_county folders.

In total, each of the two census data folders (census_tract and census_county) should contain 17 folders: one for each of the 15 tables in the Search Category Table, plus svi and mua. The script that generates the files for loading into the database uses this file structure (and the census.json file) to select the columns of interest from these tables and to aggregate the various columns as necessary. The last task is to push these folders to the AWS S3 bucket where the project's data is stored. Once the folders are in the S3 bucket, triggering the update_dag will update the census data in the database.
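
Before uploading, it can be worth confirming the folder structure described above. The sketch below is a hypothetical helper (check_level_folder is not part of the project) that counts the subfolders in one level folder and checks that svi and mua each hold a .csv file. It does not validate the contents of the census table folders.

```python
from pathlib import Path

EXPECTED_FOLDER_COUNT = 17  # 15 census tables + svi + mua


def check_level_folder(level_dir):
    """Return a list of problems found in one level folder (empty if none)."""
    level_dir = Path(level_dir)
    subfolders = sorted(p.name for p in level_dir.iterdir() if p.is_dir())
    problems = []
    if len(subfolders) != EXPECTED_FOLDER_COUNT:
        problems.append(
            f"expected {EXPECTED_FOLDER_COUNT} folders, found {len(subfolders)}"
        )
    for name in ("svi", "mua"):
        if name not in subfolders:
            problems.append(f"missing folder: {name}")
        elif not any((level_dir / name).glob("*.csv")):
            problems.append(f"no .csv file in {name}")
    return problems
```

Running it on both census_tract and census_county and getting two empty lists suggests the layout matches what this guide describes.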

S3 Upload

To upload to the S3 bucket (uw211dashboard-workbucket) you'll need the appropriate permissions. You will also need to use the AWS Command Line Interface (CLI), and run the following commands from within the directory where you are storing the collected data:

aws s3 cp census_county s3://uw211dashboard-workbucket/census_county --recursive

aws s3 cp census_tract s3://uw211dashboard-workbucket/census_tract --recursive
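
If you would rather drive the upload from a script, the two commands above can be wrapped as follows. This is a sketch under stated assumptions: build_upload_commands and upload are hypothetical names, the AWS CLI must already be installed and configured, and the script must run from the directory that contains the two folders. The optional --dryrun flag of aws s3 cp prints the operations without performing them, which is a reasonable first pass.

```python
import subprocess

BUCKET = "uw211dashboard-workbucket"
LEVELS = ("census_county", "census_tract")


def build_upload_commands(dry_run=False):
    """Build the two `aws s3 cp ... --recursive` commands as argument lists."""
    commands = []
    for level in LEVELS:
        cmd = ["aws", "s3", "cp", level, f"s3://{BUCKET}/{level}", "--recursive"]
        if dry_run:
            cmd.append("--dryrun")  # show what would be copied, copy nothing
        commands.append(cmd)
    return commands


def upload(dry_run=True):
    """Run the upload commands; stops at the first failing command."""
    for cmd in build_upload_commands(dry_run):
        subprocess.run(cmd, check=True)
```

Calling upload() performs a dry run by default; upload(dry_run=False) performs the actual copy.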