update_uprn_addresses - zebrainvest/postgres GitHub Wiki
update_listing_scrapper_data
Stored Procedure
Requirement
The UPRN (uprn
) value in table certificates_depc
is often NULL or NaN
, however it is common for the same property (postcode
, address
) to have multiple records in this table, some of which with a UPRN value.
Updating this table's UPRN values when they're NULL/NaN is very costly as this is such a large table and is used very frequently for various processes, and locking it for updates is problematic.
A solution is to use a separate table to keep postcode + address + uprn values, and get the UPRN from there instead, by matching postcode + address values from both tables.
Implementation
This is a stored procedure in the production propertytoolkit database.
Adds new addresses with UPRN to the table uprn_addresses
(UA), sourced from certificates_depc
(DEPC), wherever the UPRN in DEPC is new to UA.
The stored procedure only adds new records, ignoring any already existing records with the same UPRN value.
Notes:
- This table uses collations for
postcode
(ignore_punct_case
) to ensure that spaces/symbols and case are ignored when comparing to other strings, and another collation inaddress
(num_ignore_punct_case
) to also ignores spaces/symbols and case, but treating numbers as a whole in such a way that'1/23'
compared to'123'
is different, but compared to'01.23'
or' 1 23'
is the same. - A system cron job (root user's crontab) runs this stored procedure at half past midnight.