Attribute API - DataUSA/datausa-api GitHub Wiki

Basics

Most of the datasets contain numeric data, but for visualization purposes rich attribute data helps explain what the numbers describe. For example, when looking at the courses (CIP codes) offered, it is helpful to know the name of each course or to retrieve its icon image. We refer to these types of data as attributes. To start, there is an endpoint that lists all of the available attribute types:

http://api.datausa.io/attrs/list/

To view the entire list for a specific attribute type, replace "list" in the previous URL with the attribute type you need. For example, to view all location attributes:

http://api.datausa.io/attrs/geo/

Additionally, an attribute call can be narrowed to a single entry by appending its ID. Here, we request the geo attribute for the state of Massachusetts:

http://api.datausa.io/attrs/geo/04000US25/

Finally, attribute calls can be filtered by a specific summary level (sumlevel). This fetches the attributes for all of the states:

http://api.datausa.io/attrs/geo/?sumlevel=state
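The four URL patterns above can be produced by one small helper. This is a minimal sketch: the function name `attr_url` and its parameters are illustrative and not part of the API; only the resulting URLs come from this page.

```python
from urllib.parse import urlencode

# Base path of the attribute API, as used in the examples above.
BASE = "http://api.datausa.io/attrs"


def attr_url(attr_type="list", attr_id=None, sumlevel=None):
    """Build an attribute API URL (hypothetical helper, not part of the API).

    attr_type -- "list" for the list of attribute types, or a type such as "geo"
    attr_id   -- optional ID of a single attribute, e.g. "04000US25"
    sumlevel  -- optional sumlevel filter, e.g. "state"
    """
    parts = [BASE, attr_type]
    if attr_id is not None:
        parts.append(attr_id)
    url = "/".join(parts) + "/"
    if sumlevel is not None:
        url += "?" + urlencode({"sumlevel": sumlevel})
    return url


if __name__ == "__main__":
    print(attr_url())                        # list of attribute types
    print(attr_url("geo"))                   # all location attributes
    print(attr_url("geo", "04000US25"))      # Massachusetts only
    print(attr_url("geo", sumlevel="state")) # all states
```

The helper only builds strings, so it can be combined with any HTTP client.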

Finding Datasets

To find which datasets contain a given attribute, call the logic layer of the Data API directly:

http://api.datausa.io/api/logic/?show=geo&sumlevel=all
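As a sketch, the same logic-layer query can be assembled for other attribute types. The helper name `logic_url` is hypothetical, and passing an attribute type other than `geo` to `show` is an assumption; only the `geo` example is confirmed by this page.

```python
from urllib.parse import urlencode


def logic_url(show, sumlevel="all"):
    """Build a logic-layer URL (hypothetical helper, not part of the API)."""
    # urlencode keeps the dictionary's insertion order: show first, then sumlevel.
    query = urlencode({"show": show, "sumlevel": sumlevel})
    return "http://api.datausa.io/api/logic/?" + query


if __name__ == "__main__":
    print(logic_url("geo"))
```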

Locations (geo)

column name description
id unique ID
name if there are no conflicts, will be the same as name_long, otherwise will be suffixed by additional geography (ex. Suffolk County, MA)
name_long the shortest version of the attribute name (ex. Suffolk County)
display_name cleaned names, used for profile page titles
url_name slug name used in URL structure
sumlevel attribute sumlevel
image_link link to image source on flickr
image_author image credit from flickr
image_meta any information about the image's content, if available
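To work with these columns by name, the rows of a response can be turned into dictionaries. The sketch below assumes the response is JSON with a `headers` list and a parallel `data` list of rows. That shape is not stated on this page, so verify it against a real response first. The sample payload is illustrative, and its `url_name` value is made up.

```python
def rows_to_records(payload):
    """Pair each row with the column names.

    Assumes a payload of the form {"headers": [...], "data": [[...], ...]}.
    """
    headers = payload["headers"]
    return [dict(zip(headers, row)) for row in payload["data"]]


# Illustrative payload only, not a captured API response.
sample = {
    "headers": ["id", "name", "url_name", "sumlevel"],
    "data": [["04000US25", "Massachusetts", "massachusetts", "040"]],
}

records = rows_to_records(sample)

if __name__ == "__main__":
    for record in records:
        print(record["id"], record["name"])
```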

Containing (parent) Geographies

http://api.datausa.io/attrs/geo/04000US25/parents

Contained (children) Geographies

http://api.datausa.io/attrs/geo/04000US25/children/

Contained (children) Geographies With Sumlevel Filter

This example shows the places (sumlevel 160) within Massachusetts:

http://api.datausa.io/attrs/geo/04000US25/children/?sumlevel=160

Neighboring Geographies

http://api.datausa.io/attrs/geo/04000US25/neighbors/
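The parent, child, and neighbor lookups share one URL shape, which the following sketch captures. The helper name `related_geo_url` is hypothetical. It always appends a trailing slash, whereas the parents example above omits it; treating both forms as equivalent is an assumption.

```python
# The three relations documented above.
RELATIONS = ("parents", "children", "neighbors")


def related_geo_url(geo_id, relation, sumlevel=None):
    """Build a URL for geographies related to geo_id (hypothetical helper)."""
    if relation not in RELATIONS:
        raise ValueError(f"relation must be one of {RELATIONS}, got {relation!r}")
    url = f"http://api.datausa.io/attrs/geo/{geo_id}/{relation}/"
    if sumlevel is not None:
        # The page only shows this filter used with "children".
        url += f"?sumlevel={sumlevel}"
    return url


if __name__ == "__main__":
    print(related_geo_url("04000US25", "children", sumlevel=160))
```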

Available Sumlevels

sumlevel prefix description
nation 010 Aggregate US data
state 040 US States (including D.C. and Puerto Rico)
county 050 US Counties
place 160 Census Designated Places
msa 310 Metropolitan Statistical Area
puma 795 Public Use Microdata Area, a census subdivision of US states
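The prefixes in this table can be used to infer the sumlevel of a geo ID. The sketch below assumes the first three characters of an ID are its prefix. That matches the IDs shown on this page (04000US25 for a state, 31000US10420 for an MSA) but is not stated as a rule. The function name `sumlevel_of` is hypothetical.

```python
# Prefix-to-sumlevel mapping, copied from the table above.
SUMLEVEL_BY_PREFIX = {
    "010": "nation",
    "040": "state",
    "050": "county",
    "160": "place",
    "310": "msa",
    "795": "puma",
}


def sumlevel_of(geo_id):
    """Return the sumlevel name for a geo ID, or None if the prefix is unknown.

    Assumption: the first three characters of the ID are the sumlevel prefix.
    """
    return SUMLEVEL_BY_PREFIX.get(geo_id[:3])


if __name__ == "__main__":
    print(sumlevel_of("04000US25"))
    print(sumlevel_of("31000US10420"))
```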

Geo Containment

We have also included a special mechanism for getting either the counties inside an MSA or the tracts inside a place. To use it, set sumlevel=county but also specify the geo ID of an MSA (e.g. geo=31000US10420). Ordinarily such a call would return no data, since geo=31000US10420 is not a county and so conflicts with sumlevel=county. In this special case, however, the API recognizes the discrepancy between the specified geo and the requested sumlevel and returns all counties within the 31000US10420 MSA.

http://api.datausa.io/api/?show=geo&sumlevel=county&required=pop&geo=31000US10420&year=latest
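The containment query above can be assembled from its parts. This is a minimal sketch in which the function name `contained_geo_url` and its parameter names are illustrative; the query parameters themselves (`show`, `sumlevel`, `required`, `geo`, `year`) are those used in the example URL.

```python
from urllib.parse import urlencode


def contained_geo_url(parent_geo, child_sumlevel, required, year="latest"):
    """Build a Data API URL requesting child geographies inside parent_geo."""
    params = {
        "show": "geo",
        "sumlevel": child_sumlevel,  # level of the geographies to return
        "required": required,        # column to include, e.g. "pop"
        "geo": parent_geo,           # containing geography, e.g. an MSA ID
        "year": year,
    }
    # urlencode preserves the insertion order of the dictionary.
    return "http://api.datausa.io/api/?" + urlencode(params)


if __name__ == "__main__":
    print(contained_geo_url("31000US10420", "county", "pop"))
```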

Courses (cip)

Course data is based on the 2010 CIP (Classification of Instructional Programs) standard. CIP codes represent instructional programs and appear in both IPEDS and ACS PUMS data.

column name description
id unique ID
name if there are no conflicts, will be the same as name_long, otherwise will be suffixed by additional information
name_long the shortest version of the attribute name
url_name slug name used in URL structure
level attribute sumlevel
is_stem whether or not the course is a STEM course (1 for yes, 0 for no)
image_link link to image source on flickr
image_author image credit from flickr

Available Sumlevels

sumlevel description
0 2 digit course
1 4 digit course
2 6 digit course
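Since each course sumlevel corresponds to a code length, the level can be derived from a CIP code. The sketch below assumes codes are digit strings, optionally written with a dot as in the published CIP notation. The exact ID format returned by the API is not specified on this page, and the function name `cip_level` is hypothetical.

```python
# Number of digits in the code mapped to the sumlevel, per the table above.
CIP_LEVEL_BY_DIGITS = {2: 0, 4: 1, 6: 2}


def cip_level(cip_id):
    """Return the sumlevel (0, 1, or 2) implied by the length of a CIP code."""
    digits = cip_id.replace(".", "")  # tolerate dotted notation such as "11.0701"
    if not digits.isdigit() or len(digits) not in CIP_LEVEL_BY_DIGITS:
        raise ValueError(f"not a 2-, 4-, or 6-digit CIP code: {cip_id!r}")
    return CIP_LEVEL_BY_DIGITS[len(digits)]


if __name__ == "__main__":
    for code in ("11", "1107", "11.0701"):
        print(code, cip_level(code))
```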

Industries (naics)

Industry data in Data USA is based around the 2012 NAICS-based codes from ACS PUMS.

column name description
id unique ID
name if there are no conflicts, will be the same as name_long, otherwise will be suffixed by additional information
url_name slug name used in URL structure
level attribute sumlevel
parent parent attribute grouping, if applicable (otherwise it will reference itself)
grandparent highest attribute grouping, if applicable (otherwise it will reference itself)
image_link link to image source on flickr
image_author image credit from flickr

Available Sumlevels

sumlevel description
0 Sector
1 Sub-sector
2 Group
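Because a top-level attribute references itself as its own parent, the hierarchy can be walked upward until that self-reference is reached. The sketch below assumes the parent column holds the ID of the parent attribute. The IDs in the sample mapping are placeholders, not real NAICS codes, and the function name `top_level_id` is hypothetical.

```python
def top_level_id(attr_id, parent_of):
    """Follow parent links from attr_id until an attribute is its own parent.

    parent_of maps each attribute ID to its parent's ID. The `seen` set
    guards against malformed data that contains a cycle.
    """
    seen = set()
    while parent_of[attr_id] != attr_id and attr_id not in seen:
        seen.add(attr_id)
        attr_id = parent_of[attr_id]
    return attr_id


# Placeholder IDs for illustration: sector "A", sub-sector "A1", group "A11".
parent_of = {"A": "A", "A1": "A", "A11": "A1", "B": "B"}

if __name__ == "__main__":
    print(top_level_id("A11", parent_of))
```

Walking only the parent links avoids relying on how the grandparent column is filled for mid-level attributes.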

Occupations (soc)

Occupation data in Data USA is based around the 2012 SOC-based codes from ACS PUMS.

column name description
id unique ID
name if there are no conflicts, will be the same as name_long, otherwise will be suffixed by additional information
url_name slug name used in URL structure
level attribute sumlevel
parent parent attribute grouping, if applicable (otherwise it will reference itself)
grandparent grandparent attribute grouping, if applicable (otherwise it will reference itself)
great_grandparent highest attribute grouping, if applicable (otherwise it will reference itself)
image_link link to image source on flickr
image_author image credit from flickr

Available Sumlevels

sumlevel description
0 Major Group
1 Minor Group
2 Broad Occupation
3 Detailed Occupation

Genders (sex)

column name description
id unique ID
name full name

Skills (skill)

column name description
id unique O*NET ID
name full name
parent parent group name
avg_value average value

Birthplaces (birthplace)

column name description
id unique ID
name full country name
adm0_a3 ISO 3166 country code

Languages (language)

column name description
id unique ID
name full language name

Universities (university)

column name description
sector sector ID
name full name
url main website of university, if available
is_stem whether or not the university contains STEM majors (1 for yes, 0 for no)
county county ID
state state ID
lat latitude coordinate reported by IPEDS
lng longitude coordinate reported by IPEDS
id unique ID
msa MSA ID

Sectors (sector)

column name description
color unique hexadecimal color
id unique ID
name full name

Input/Output Industries (iocode)

column name description
id unique ID
name full name
parent unique ID of parent industry
level integer value representing sumlevel, where 0 is the largest grouping

BLS Occupations (bls_soc)

column name description
id unique ID
name full name
level string value representing sumlevel

BLS Industries (bls_naics)

column name description
id unique ID
name full name
level integer value representing sumlevel, where 0 is the largest grouping

ACS Occupations (acs_occ)

The ACS occupation codes are identifiers created by Data USA to impose a structure on the textual identifiers used in ACS tables.

column name description
id unique ID
name full name
level integer value representing sumlevel, where 0 is the largest grouping

ACS Industries (acs_ind)

The ACS industry codes are identifiers created by Data USA to impose a structure on the textual identifiers used in ACS tables.

column name description
id unique ID
name full name
level integer value representing sumlevel, where 0 is the largest grouping

ACS Races & Ethnicities (acs_race)

column name description
id unique ID
name full name

ACS PUMS Races & Ethnicities (pums_race)

column name description
id unique ID
name full name

Races & Ethnicities (race)

column name description
id unique ID
name full name

Degrees (degree)

column name description
id unique ID
name full name

PUMS Degrees (pums_degree)

column name description
id unique ID
name full name

Wage Bins (wage_bin)

column name description
id unique alpha ID
name full name of bin (ex. "$10-20K")

Conflicts (conflict)

column name description
id unique ID
name full name