Doc: USA Census - wimlds/smart_cities GitHub Wiki

The US census records changes in the US population over time. It aims to cover every individual residing in the US, and data on a variety of socioeconomic indicators are collected every 10 years. In addition, the American Community Survey (ACS) runs continuously between censuses and collects a subset of the census data (plus some additional data) from a sample of residents. Most of this data is available through the American FactFinder (AFF) website, but to prevent identification of individuals and protect privacy it is aggregated at some geographical granularity.

The data we gathered is aggregated at the '''census tract''' level. A census tract is a geographical area, typically smaller than a ZIP code in NYC, whose boundaries are drawn to enclose a maximally homogeneous subset of the population: ideally, all tracts contain about the same number of people with similar socioeconomic characteristics, so that the variance within a tract is minimal. However, tracts are drawn based on the previous census, so they are never as homogeneous as desired. A further complication is that tract boundaries change slightly from one census to the next.

The files containing the geographical information for the NYC census tracts from the 2000 and 2010 censuses are on our GitHub repository in zipped '''shapefile''' format, for use with GeoPandas, Carto, ArcGIS, QGIS, and other geographical visualization and analysis tools.

The AFF website is rather confusing, largely because of the tremendous amount of data it hosts, so for convenience we collected data for two census years, 2000 and 2010, and made it available here. We collected income data and population characteristics data for all five boroughs at the census tract level. These files contain many variables, and therefore many columns, for each of the roughly 2,200 census tracts in NYC.
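Since the data covers all five boroughs at the tract level, it is often useful to recover the borough from a tract identifier. The sketch below assumes tracts are identified by the standard 11-digit census GEOID (2-digit state FIPS, 3-digit county FIPS, 6-digit tract code). The actual ID column in these tables may be named or formatted differently, and `borough_from_geoid` is a hypothetical helper, not part of this repository.

```python
# Sketch: map an 11-digit census tract GEOID to its NYC borough.
# Assumption: IDs follow the standard layout
#   state FIPS (2 digits) + county FIPS (3 digits) + tract code (6 digits).

NY_STATE_FIPS = "36"

# County FIPS codes for the five NYC boroughs.
BOROUGH_BY_COUNTY_FIPS = {
    "005": "Bronx",
    "047": "Brooklyn",
    "061": "Manhattan",
    "081": "Queens",
    "085": "Staten Island",
}


def borough_from_geoid(geoid):
    """Return the borough name for a tract GEOID, or None if it is not in NYC."""
    geoid = str(geoid).strip()
    if len(geoid) != 11 or not geoid.isdigit():
        raise ValueError(f"expected an 11-digit tract GEOID, got {geoid!r}")
    state, county = geoid[:2], geoid[2:5]
    if state != NY_STATE_FIPS:
        return None
    return BOROUGH_BY_COUNTY_FIPS.get(county)
```

For example, `borough_from_geoid("36061000100")` looks up county `061` and returns `"Manhattan"`. Keeping the IDs as strings avoids any formatting surprises when the tables are loaded.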


Table paths:

  • `smart_cities_data.census00`
    • Shape: 2,217 rows, 219 columns
  • `smart_cities_data.census10`
    • Shape: 2,168 rows, 133 columns
  • `smart_cities_data.IncomeCensus00`
    • Shape: 2,217 rows, 198 columns
  • `smart_cities_data.IncomeCensus10`
    • Shape: 2,168 rows, 147 columns
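
The 2000 and 2010 tables have different row counts (2,217 versus 2,168), which is consistent with tract definitions changing between censuses. Before joining data across the two years, it can help to check which tract IDs appear in both. The following is a minimal sketch using plain Python sets; `compare_tract_ids` is a hypothetical helper and the IDs are made up for illustration, not taken from the real tables.

```python
def compare_tract_ids(ids_2000, ids_2010):
    """Split tract IDs into those shared by both censuses and those unique to one."""
    old, new = set(ids_2000), set(ids_2010)
    return {
        "both": sorted(old & new),
        "only_2000": sorted(old - new),
        "only_2010": sorted(new - old),
    }


# Made-up IDs: one 2000 tract is replaced by two new tracts in 2010.
example = compare_tract_ids(
    ["36061000100", "36061000200", "36061000300"],
    ["36061000100", "36061000201", "36061000202", "36061000300"],
)
print(example["only_2000"])
print(example["only_2010"])
```

Note that a matching ID does not by itself guarantee identical boundaries, so the shapefiles remain the reference for comparing geometry.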

Rather than list the many columns and their descriptions here, we uploaded the metadata files that accompany the data files downloaded from the AFF site:

Table paths:

  • `smart_cities_data.census00_metadata`
    • Shape: 217 rows, 2 columns
  • `smart_cities_data.census10_metadata`
    • Shape: 533 rows, 3 columns
  • `smart_cities_data.IncomeCensus00_metadata`
    • Shape: 195 rows, 2 columns
  • `smart_cities_data.IncomeCensus10_metadata`
    • Shape: 375 rows, 2 columns
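
One way to use the metadata tables is as a lookup from column code to human-readable description. The sketch below assumes the first metadata column holds the column code and the second holds its description; that layout is a guess from the table shapes and should be checked against the actual files. The function names and the sample rows are hypothetical.

```python
def build_label_lookup(metadata_rows):
    """Map each column code to its description.

    Assumes row[0] is the code and row[1] is the description;
    any further columns (one metadata table has three) are ignored.
    """
    lookup = {}
    for row in metadata_rows:
        code, description = row[0], row[1]
        lookup[str(code).strip()] = str(description).strip()
    return lookup


def describe_columns(columns, lookup):
    """Pair each data column with its description ('' when the code is unknown)."""
    return [(col, lookup.get(col, "")) for col in columns]


# Invented sample rows for illustration only.
sample_metadata = [
    ("COL_A", "Total population"),
    ("COL_B", "Median household income ", "extra field"),
]
labels = build_label_lookup(sample_metadata)
print(describe_columns(["COL_A", "COL_B", "COL_C"], labels))
```

Columns that have no entry in the metadata come back with an empty description, which makes gaps easy to spot.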