Final output cleaning - NYCPlanning/db-factfinder GitHub Wiki

Formatting

After c, e, m, p, and z are calculated by calculate_c_e_m_p_z, the values are rounded and then cleaned according to the rules in the methods rounding and cleaning.

Rounding

The utility function rounding rounds estimates and MOEs to the number of digits specified in the metadata. All c, p, and z values are rounded to a single decimal place, regardless of the number of digits specified in the metadata. Note that the logic used to clean data (described in the next section) refers to the rounded values rather than the raw values.
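The rounding rule can be sketched as follows. This is a minimal illustration, not the repository's rounding function: the name round_record, the dict-based record, and the digits argument standing in for the metadata value are all assumptions, and e and m are assumed to be the estimate and MOE. Python's built-in round uses round-half-to-even, which may differ from the behavior of the actual implementation at exact halves.

```python
def round_record(record: dict, digits: int) -> dict:
    """Hypothetical helper: round one row of c, e, m, p, z values.

    `digits` stands in for the per-variable value read from the metadata.
    """
    rounded = dict(record)
    # Estimates and MOEs follow the metadata-specified precision.
    for key in ("e", "m"):
        if rounded.get(key) is not None:
            rounded[key] = round(rounded[key], digits)
    # c, p, and z are always rounded to one decimal place.
    for key in ("c", "p", "z"):
        if rounded.get(key) is not None:
            rounded[key] = round(rounded[key], 1)
    return rounded


example = round_record(
    {"c": 3.84, "e": 1234.56, "m": 78.44, "p": 45.67, "z": None}, digits=0
)
```

NULL values (None here) pass through untouched so that the cleaning rules can still see them.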

Cleaning

The following rules modify the rounded results of calculate_c_e_m_p_z, in the order listed. The purpose of these cleaning steps is to remove invalid values.

Invalid values

  • If any of c, e, m, p, or z is negative, it is overwritten by NULL
  • If p is greater than 100, it is overwritten by NULL
  • If p is 100 or NULL (including values overwritten by the above rule), z is set to NULL
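The three invalid-value rules above can be sketched in order as below. The function name clean_invalid and the dict-based record are assumptions made for illustration, with None standing in for NULL; the repository's cleaning method may be organized differently.

```python
def clean_invalid(record: dict) -> dict:
    """Hypothetical helper applying the invalid-value rules to one row."""
    cleaned = dict(record)
    # Rule 1: negative values are overwritten by NULL.
    for key in ("c", "e", "m", "p", "z"):
        value = cleaned.get(key)
        if value is not None and value < 0:
            cleaned[key] = None
    # Rule 2: percentages above 100 are overwritten by NULL.
    if cleaned.get("p") is not None and cleaned["p"] > 100:
        cleaned["p"] = None
    # Rule 3: z is dropped when p is 100 or NULL, including a p that
    # was just nulled by rule 1 or rule 2.
    if cleaned.get("p") is None or cleaned["p"] == 100:
        cleaned["z"] = None
    return cleaned


over_100 = clean_invalid({"c": 5.0, "e": 120, "m": 30, "p": 104.2, "z": 3.1})
negative_c = clean_invalid({"c": -1.0, "e": 10, "m": 4, "p": 50.0, "z": 2.0})
```

The order matters: because rule 3 runs last, a p value removed by an earlier rule also removes z.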

Zero estimates

  • If e is 0, then c, m, p, and z are set to NULL
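A minimal sketch of the zero-estimate rule, assuming a dict-based record with None for NULL (the function name clean_zero_estimate is hypothetical):

```python
def clean_zero_estimate(record: dict) -> dict:
    """Hypothetical helper: null out every derived value when e is 0."""
    cleaned = dict(record)
    if cleaned.get("e") == 0:
        for key in ("c", "m", "p", "z"):
            cleaned[key] = None
    return cleaned


zero_row = clean_zero_estimate({"c": 0.0, "e": 0, "m": 12, "p": 0.0, "z": 0.4})
nonzero_row = clean_zero_estimate({"c": 2.1, "e": 50, "m": 12, "p": 4.0, "z": 0.4})
```

The estimate itself stays at 0; only the values derived from it are removed.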

Base variables

  • If the variable is a base variable, the geography type is either borough or city, and c is NULL, c is set to 0
  • If the variable is a base variable, the geography type is either borough or city, and m is NULL, m is set to 0
  • If the variable is a base variable and the variable is not a median variable, p is set to 100
  • If the variable is a base variable and the variable is not a median variable, z is set to NULL
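The base-variable rules can be sketched as below. The function name, the boolean flags, and the geography labels "borough" and "city" are assumptions for illustration; how the repository identifies base variables, median variables, and geography types is not shown here.

```python
def clean_base_variable(
    record: dict, is_base: bool, is_median: bool, geotype: str
) -> dict:
    """Hypothetical helper applying the base-variable rules to one row."""
    cleaned = dict(record)
    if not is_base:
        return cleaned
    # At borough and city level, a missing c or m is reported as 0.
    if geotype in ("borough", "city"):
        for key in ("c", "m"):
            if cleaned.get(key) is None:
                cleaned[key] = 0
    # A non-median base variable is 100 percent of itself, so z is dropped.
    if not is_median:
        cleaned["p"] = 100
        cleaned["z"] = None
    return cleaned


city_base = clean_base_variable(
    {"c": None, "e": 8000000, "m": None, "p": None, "z": None},
    is_base=True, is_median=False, geotype="city",
)
```

The geography check applies only to c and m; the p and z rules apply to base variables at every geography type.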

Inputs to median variables (binned variables)

  • If the variable is an input to a median (with the exception of median rooms inputs), m is set to NULL
  • If the variable is an input to a median (with the exception of median rooms inputs), p is set to NULL
  • If the variable is an input to a median (with the exception of median rooms inputs), z is set to NULL
  • If the variable is an input to a median (with the exception of median rooms inputs), c is set to NULL

Special variables

  • If the variable is a special variable, p is set to NULL
  • If the variable is a special variable, z is set to NULL
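The rules for median inputs and special variables both depend only on the variable's type, so they can be sketched together. The flag names below are hypothetical; how the repository classifies a variable as a median input, a median rooms input, or a special variable is not described in this section.

```python
def clean_by_variable_type(
    record: dict,
    is_median_input: bool = False,
    is_median_rooms_input: bool = False,
    is_special: bool = False,
) -> dict:
    """Hypothetical helper applying the variable-type rules to one row."""
    cleaned = dict(record)
    # Binned inputs to a median keep only e, except median rooms inputs.
    if is_median_input and not is_median_rooms_input:
        for key in ("c", "m", "p", "z"):
            cleaned[key] = None
    # Special variables lose their percentage and its MOE.
    if is_special:
        for key in ("p", "z"):
            cleaned[key] = None
    return cleaned


row = {"c": 4.2, "e": 310, "m": 55, "p": 12.5, "z": 2.0}
binned = clean_by_variable_type(row, is_median_input=True)
rooms = clean_by_variable_type(row, is_median_input=True, is_median_rooms_input=True)
special = clean_by_variable_type(row, is_special=True)
```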

GEOID Formatting

The method labs_geoid translates census geoids into the format displayed in the PFF application. Because the list of geography types changed in 2020, this method relies on the year-specific format_geoid functions defined in the AggregatedGeography classes.

These translations primarily involve replacing FIPS county codes with borough abbreviations or codes.
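As an illustration of this kind of translation, the sketch below converts an 11-digit census tract GEOID (state FIPS, county FIPS, tract code) into a borough code followed by the tract code. The county FIPS codes and borough codes are the standard ones for New York City, but the function name and the exact output format are assumptions; the actual format_geoid functions may produce different strings for each geography type and year.

```python
# NYC county FIPS codes mapped to borough codes.
COUNTY_FIPS_TO_BORO_CODE = {
    "061": "1",  # New York County (Manhattan)
    "005": "2",  # Bronx County (Bronx)
    "047": "3",  # Kings County (Brooklyn)
    "081": "4",  # Queens County (Queens)
    "085": "5",  # Richmond County (Staten Island)
}


def tract_geoid_to_boro_tract(geoid: str) -> str:
    """Hypothetical helper: swap state and county FIPS for a borough code."""
    if len(geoid) != 11 or not geoid.isdigit() or not geoid.startswith("36"):
        raise ValueError(f"not a New York State tract GEOID: {geoid!r}")
    county, tract = geoid[2:5], geoid[5:]
    if county not in COUNTY_FIPS_TO_BORO_CODE:
        raise ValueError(f"not an NYC county FIPS code: {county!r}")
    return COUNTY_FIPS_TO_BORO_CODE[county] + tract


manhattan_tract = tract_geoid_to_boro_tract("36061000100")
brooklyn_tract = tract_geoid_to_boro_tract("36047000100")
```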