Final output cleaning - NYCPlanning/db-factfinder GitHub Wiki
Formatting
After c, e, m, p, and z are calculated using calculate_c_e_m_p_z, values are rounded and cleaned based on the rules contained in the methods cleaning and rounding.
Rounding
The utility function rounding rounds estimates and MOEs to the number of digits specified in the metadata. All c, p, and z values are rounded to a single decimal place, regardless of the number of digits specified in the metadata. Note that the cleaning logic (described in the next section) refers to the rounded values rather than the raw values.
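The rounding rule above can be sketched roughly as follows. This is a minimal illustration, not the package's actual rounding implementation: the names round_value and round_row, the dict-based row, and the use of Python's built-in round (which rounds exact halves to the nearest even digit) are all assumptions made for the example.

```python
from typing import Dict, Optional

Row = Dict[str, Optional[float]]


def round_value(value: Optional[float], digits: int) -> Optional[float]:
    """Round a single value, passing NULL (None) through unchanged."""
    if value is None:
        return None
    return round(value, digits)


def round_row(row: Row, digits: int) -> Row:
    """Hypothetical helper: e and m use the metadata digits; c, p, z use one decimal."""
    rounded = dict(row)
    # Estimates and MOEs follow the number of digits given in the metadata.
    for key in ("e", "m"):
        rounded[key] = round_value(row.get(key), digits)
    # c, p, and z are always rounded to a single decimal place.
    for key in ("c", "p", "z"):
        rounded[key] = round_value(row.get(key), 1)
    return rounded
```

With digits set to 0, an estimate of 1234.56 would become 1235.0 while a p of 45.678 would still become 45.7, because the metadata setting does not apply to c, p, or z.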
Cleaning
The following rules modify the rounded results of calculate_c_e_m_p_z, in the order listed. The purpose of these cleaning steps is to remove invalid values.
Invalid values
- If c, e, m, p, or z are negative, they are overwritten by NULL
- If p is greater than 100, it is overwritten by NULL
- If p is 100 or NULL (including values overwritten by the above rule), z is set to NULL
Zero estimates
- If e is 0, c, m, p, and z are set to NULL
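The invalid-value and zero-estimate rules above could be applied, in the order listed, along these lines. This is a sketch under stated assumptions: the function name clean_invalid_and_zero and the dict-based row are hypothetical, and None stands in for NULL.

```python
from typing import Dict, Optional

Row = Dict[str, Optional[float]]


def clean_invalid_and_zero(row: Row) -> Row:
    """Hypothetical helper applying the invalid-value and zero-estimate rules."""
    cleaned = dict(row)

    # Rule 1: negative values are overwritten with NULL.
    for key in ("c", "e", "m", "p", "z"):
        value = cleaned.get(key)
        if value is not None and value < 0:
            cleaned[key] = None

    # Rule 2: percentages above 100 are overwritten with NULL.
    p = cleaned.get("p")
    if p is not None and p > 100:
        cleaned["p"] = None

    # Rule 3: z is NULL when p is 100 or NULL (including p nulled just above).
    if cleaned.get("p") is None or cleaned["p"] == 100:
        cleaned["z"] = None

    # Zero estimates: everything except e itself is set to NULL.
    if cleaned.get("e") == 0:
        for key in ("c", "m", "p", "z"):
            cleaned[key] = None

    return cleaned
```

Because the rules run in order, a p of 120 is first nulled by the second rule and then causes z to be nulled by the third.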
Base variables
- If the variable is a base variable, the geography type is either borough or city, and c is NULL, c is set to 0
- If the variable is a base variable, the geography type is either borough or city, and m is NULL, m is set to 0
- If the variable is a base variable and the variable is not a median variable, p is set to 100
- If the variable is a base variable and the variable is not a median variable, z is set to NULL
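As an illustration of the base-variable rules above, here is a minimal sketch. The function name, the is_base and is_median flags, and the geography labels "borough" and "city" are assumptions made for the example; the actual code may identify base variables and geography types differently.

```python
from typing import Dict, Optional

Row = Dict[str, Optional[float]]


def clean_base_variable(row: Row, geotype: str, is_base: bool, is_median: bool) -> Row:
    """Hypothetical helper applying the base-variable rules to one row."""
    cleaned = dict(row)
    if not is_base:
        return cleaned

    # For borough and city geographies, a NULL c or m becomes 0.
    if geotype in ("borough", "city"):
        if cleaned.get("c") is None:
            cleaned["c"] = 0
        if cleaned.get("m") is None:
            cleaned["m"] = 0

    # Non-median base variables always show p = 100 and a NULL z.
    if not is_median:
        cleaned["p"] = 100
        cleaned["z"] = None

    return cleaned
```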
Inputs to median variables (binned variables)
- If the variable is an input to a median (with the exception of median rooms inputs), m is set to NULL
- If the variable is an input to a median (with the exception of median rooms inputs), p is set to NULL
- If the variable is an input to a median (with the exception of median rooms inputs), z is set to NULL
- If the variable is an input to a median (with the exception of median rooms inputs), c is set to NULL
Special variables
- If the variable is a special variable, p is set to NULL
- If the variable is a special variable, z is set to NULL
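The rules for median inputs and special variables can be sketched in the same style. The three boolean flags are hypothetical stand-ins for however the metadata marks these variables; only the NULL assignments come from the rules above.

```python
from typing import Dict, Optional

Row = Dict[str, Optional[float]]


def clean_median_input_and_special(
    row: Row,
    is_median_input: bool,
    is_median_rooms_input: bool,
    is_special: bool,
) -> Row:
    """Hypothetical helper applying the median-input and special-variable rules."""
    cleaned = dict(row)

    # Inputs to medians keep only e; median rooms inputs are exempt.
    if is_median_input and not is_median_rooms_input:
        for key in ("m", "p", "z", "c"):
            cleaned[key] = None

    # Special variables keep c, e, and m but lose p and z.
    if is_special:
        cleaned["p"] = None
        cleaned["z"] = None

    return cleaned
```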
GEOID Formatting
The method labs_geoid translates census geoids into the format displayed in the PFF application. Because the list of geography types changed in 2020, this method relies on year-specific functions, format_geoid, contained in the AggregatedGeography classes.
These translations primarily involve replacing FIPS county codes with borough abbreviations or codes.
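To give a sense of what such a translation might look like, the sketch below swaps the five-digit state and county FIPS prefix of a New York City geoid for a one-digit borough code. It is illustrative only: the function name replace_county_with_borough is hypothetical, and the exact output format produced by format_geoid for each geography type and year is not described here, so the result should not be read as the format the PFF application actually displays.

```python
# State + county FIPS prefixes for the five NYC counties, mapped to the
# standard one-digit borough codes.
FIPS_TO_BOROUGH_CODE = {
    "36061": "1",  # New York County (Manhattan)
    "36005": "2",  # Bronx County (Bronx)
    "36047": "3",  # Kings County (Brooklyn)
    "36081": "4",  # Queens County (Queens)
    "36085": "5",  # Richmond County (Staten Island)
}


def replace_county_with_borough(geoid: str) -> str:
    """Hypothetical helper: swap a leading NYC county FIPS prefix for a borough code."""
    prefix = geoid[:5]
    if prefix in FIPS_TO_BOROUGH_CODE:
        return FIPS_TO_BOROUGH_CODE[prefix] + geoid[5:]
    # Geoids without a recognized county prefix are returned unchanged.
    return geoid
```

Under these assumptions, a tract geoid such as 36061000100 would become 1000100.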