Dataset: HPD Complaints - nycdb/nycdb GitHub Wiki

hpd_complaints

The hpd_complaints dataset can be added to NYCDB by running:

nycdb --download hpd_complaints
nycdb --load hpd_complaints

Provenance

The dataset comes from the following dataset on NYC open data portal:

The Department of Housing Preservation and Development (HPD) records complaints that are made by the public through the 311 Citizen Services Center, Code Enforcement Borough Offices or the internet for conditions which violate the New York City Housing Maintenance Code (HMC) or the New York State Multiple Dwelling Law (MDL). Each complaint is associated with one or more problems reported by the complainant. Problems are closed if a tenant verifies by phone that the condition was corrected or an inspection result is entered by an HPD inspector. A complaint is closed when all associated problems are closed.

Note: This dataset previously contained two separate table; one for complaints and one for problems. For each complaint there are one or more associated problems. Since 2023 HPD now releases a single dataset with complaints and problems joined together.

Data dictionary

You can download the data dictionary XLSX.

Click to Expand the hpd_complaints_and_problems Data Dictionary
Column Name Data Type Description Expected Values
ReceivedDate date Date when the complaint was received
ProblemID integer Unique identifier of this problem. There can be multiple problems associated with a single complaint
ComplaintID integer Unique identifier of the complaint this problem is associated with. For each complaint there are one or more problems.
BuildingID integer Unique identifier given to a building record
Borough text Complaint borough Manhattan, Bronx, Brooklyn, Queens, Staten Island
HouseNumber text Complaint house number
StreetName text Complaint street name
PostCode text Complaint zip code
Block integer Number assigned by the NYC Dept of Finance identifying the tax block the lot is on
Lot integer Unique number assigned by the NYC Dept of Finance within a Block identifying a lot
Apartment text Number of the unit or apartment in a building
CommunityBoard integer Unique number identifying a Community District/Board, which is a political geographical area within a borough of the City of NY 1-18
UnitType text Type of space where the problem was reported Apartment, Building, Building-Wide, Public area
SpaceType text Type of space where the problem was reported
Type text Code indicating the problem type Emergency, Hazardous, Immediate Emergency, Non emergency
MajorCategory text The major category of the problem
MinorCategory text The minor category of the problem
ProblemCode text The problem code
ComplaintStatus text The status of the complaint Close, Open
ComplaintStatusDate date Date when the complaint status was updated
ProblemStatus text The status of the problem Close, Open
ProblemStatusDate date Date when the problem status was updated
StatusDescription text Status description
ProblemDuplicateFlag boolean Dupliate complaint Indicator N, Y
ComplaintAnonymousFlag boolean Anonymous complaint Indicator N, Y
UniqueKey text Unique identifier of a Service Request (SR) in the open data set. Links to the to the open dataset '311 Service Requests from 2010 to Present'
Latitude numeric
Longitude numeric
CouncilDistrict char(2)
CensusTract text
Bin char(7) Building Identification Number, assigned by City Planning, for the address of the building identified in the complaint. Read more about BIN
BBL char(10) A combination of three numeric codes--a 1-digit borough number, a block number (up to 5 digits) and a lot number (up to 4 digits)--designated and modified by the Department of Finance (DOF). BBLs are used by various city agencies to identify real estate for taxes, zoning, construction, and other purposes. Read more about BBL
NTA text Neighborhood Tabulation Area. Read more about NTAs

Tables

  • hpd_complaints_and_problems is the only table in this dataset.

Example queries

Following are some useful SQL queries related to the dataset.

Archived queries from before separate problems and complaints tables were combined

Get properties with most building-wide heat complaints

The hpd_complaints table has the location information for complaints, and the hpd_complaint_problems table has information about the nature of the problems. Therefore, to get information about the number of complaints for a specific problem for each property we need to join the two tables in this dataset.

First we can start with hpd_complaint_problems and filter to just the problem records that we are interested in: building-wide lack of heat during NYC's "heat season" when it is mandatory for landlords to provide heat. Since there can be multiple problem records associated with each complaint, we want to make sure we only keep one row for each complaint. We can save the results of this in a temporary table using WITH, then in another query where we can join to the hpd_complaints.

Now that we have identified the complaintid for all the problems we are interested in, we can join our temporary no_heat_problems table with the main hpd_complaints, and then count up the number of complaints fitting our criteria for each property (bbl) and display them in descending order.

WITH no_heat_problems AS (
  SELECT DISTINCT complaintid -- only one for each complaint
  FROM hpd_complaint_problems
  WHERE code ~ 'NO HEAT' -- description of the problem contains the phrase "NO HEAT"
    AND statusdate between '2018-10-01' and '2019-05-31' -- "heat season"
    AND (unittype = 'BUILDING-WIDE' OR minorcategory = 'ENTIRE BUILDING') -- these do not always agree
)
SELECT 
  c.bbl, 
  count(*) AS no_heat_complaints
FROM hpd_complaints AS c
INNER JOIN no_heat_problems AS p
  USING(complaintid)
GROUP BY c.bbl
ORDER BY no_heat_complaints desc;
⚠️ **GitHub.com Fallback** ⚠️