URL Click Column DB Migration - opengovsg/GoGovSG GitHub Wiki
Problem
When the click count of a URL in the url table is being incremented, that corresponding row is locked. This could pose a problem for bulk operations as they would not be able to acquire the necessary locks if a link is constantly being visited.
Objective
Migrate the count column from the url table to a separate url_clicks table.
Constraints
- No downtime.
- No loss of data.
Proposed approach
1. Deploy application code that introduces a new UrlClicks model, allowing sequelize to create a new table, albeit unused.
2. Run a migration script that, within the same transaction,
   - Copies the url table's click count over to url_clicks.
   - Adds a trigger on url table: On click count increment, do the same on url_clicks table.
   - Adds a trigger on url table: On row insert, insert a corresponding row on url_clicks table.
3. Deploy application code that lets sequelize read/write on the new url_clicks table instead of url table.
4. (OPTIONAL) Run a migration script that deletes the count column on url table.
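The migration script in step 2 could be sketched roughly as follows. This is a minimal illustration, not the actual script: it uses Python's built-in sqlite3 module as a stand-in for the production database, and the schema (short_url, long_url, clicks) and the trigger names are assumptions made up for the example. Trigger syntax differs between database engines, so the SQL would need adapting.

```python
import sqlite3

# Hypothetical, simplified schema. Column names are assumptions for this sketch.
SETUP = """
CREATE TABLE url (
    short_url TEXT PRIMARY KEY,
    long_url  TEXT NOT NULL,
    clicks    INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE url_clicks (
    short_url TEXT PRIMARY KEY,
    clicks    INTEGER NOT NULL DEFAULT 0
);
"""

MIGRATION = [
    # Copy the existing click counts over to url_clicks.
    "INSERT INTO url_clicks (short_url, clicks) SELECT short_url, clicks FROM url",
    # On click count change in url, apply the same delta to url_clicks.
    """CREATE TRIGGER url_clicks_sync_update AFTER UPDATE OF clicks ON url
       BEGIN
           UPDATE url_clicks
           SET clicks = clicks + (NEW.clicks - OLD.clicks)
           WHERE short_url = NEW.short_url;
       END""",
    # On row insert in url, insert a corresponding row in url_clicks.
    """CREATE TRIGGER url_clicks_sync_insert AFTER INSERT ON url
       BEGIN
           INSERT INTO url_clicks (short_url, clicks)
           VALUES (NEW.short_url, NEW.clicks);
       END""",
]


def migrate(conn):
    """Run the copy and both trigger definitions in one transaction."""
    conn.execute("BEGIN")
    try:
        for statement in MIGRATION:
            conn.execute(statement)
        conn.execute("COMMIT")
    except Exception:
        # Any failure leaves url_clicks and the triggers untouched.
        conn.execute("ROLLBACK")
        raise


# isolation_level=None lets us issue BEGIN/COMMIT ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript(SETUP)
conn.execute("INSERT INTO url VALUES ('abc', 'https://example.com', 5)")

migrate(conn)

# Traffic that arrives after the migration is mirrored by the triggers.
conn.execute("UPDATE url SET clicks = clicks + 1 WHERE short_url = 'abc'")
conn.execute("INSERT INTO url VALUES ('xyz', 'https://example.org', 0)")

rows = conn.execute(
    "SELECT short_url, clicks FROM url_clicks ORDER BY short_url"
).fetchall()
print(rows)  # [('abc', 6), ('xyz', 0)]
```

Because the copy and the triggers are committed together, there is no committed state in which url_clicks holds data but the triggers are missing, or the other way round.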
Discussion
Importance of synchronisation:
Synchronisation of data is important for us to not have to worry about various edge cases:
- url table's clicks count not being equal to url_clicks table's click count (possible data loss).
- Attempting to increment a url_clicks table row that does not exist.
- Attempting to read from a url_clicks table row that does not exist (nullable results).
Achieving synchronisation:
Start by exhaustively handling all DB operations that could cause our data to go out of sync:
- url table's click column incremented.
- New row inserted into url table.
Using a single transaction, carry out the data migration and add the triggers together, to mitigate the problems foreseen above.
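One way to confirm that the two tables stayed in sync is a query that lists every url row whose count is missing from, or different to, url_clicks. The sketch below is only an illustration: it again uses Python's sqlite3 module as a stand-in database, and the helper name find_out_of_sync and the column names are assumptions for this example.

```python
import sqlite3


def find_out_of_sync(conn):
    """Return (short_url, url count, url_clicks count) for each mismatched row.

    A url_clicks count of None means the url_clicks row does not exist.
    """
    return conn.execute(
        """
        SELECT u.short_url, u.clicks, c.clicks
        FROM url AS u
        LEFT JOIN url_clicks AS c ON c.short_url = u.short_url
        WHERE c.short_url IS NULL OR c.clicks <> u.clicks
        ORDER BY u.short_url
        """
    ).fetchall()


# Hypothetical data covering the edge cases listed above.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE url (short_url TEXT PRIMARY KEY, clicks INTEGER NOT NULL);
    CREATE TABLE url_clicks (short_url TEXT PRIMARY KEY, clicks INTEGER NOT NULL);
    INSERT INTO url VALUES ('a', 3), ('b', 7), ('c', 1);
    INSERT INTO url_clicks VALUES ('a', 3), ('b', 6);
    """
)

mismatches = find_out_of_sync(conn)
print(mismatches)  # [('b', 7, 6), ('c', 1, None)]
```

An empty result means every url row has a matching url_clicks row with the same count.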
Possible challenges
- DB Locks: In step 2, the migration script needs to read every row in the url table to copy the count column over. A lock is needed to prevent dirty reads, and acquiring it could take a while if the migration runs during a period of high traffic on our site.
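One possible mitigation, not part of the proposal above, is to bound how long the migration waits for its lock and retry later instead of stalling. The sketch below illustrates the idea with Python's sqlite3 module as a stand-in. Note that SQLite locks the whole database rather than individual rows, so this only approximates the production behaviour; the helper name begin_with_retry, the timeout values and the schema are all assumptions for this example.

```python
import os
import sqlite3
import tempfile
import time


def begin_with_retry(conn, attempts=3, pause=0.05):
    """Try to take the write lock up front.

    Returns True once the lock is held, or False if every attempt timed out.
    """
    for _ in range(attempts):
        try:
            conn.execute("BEGIN IMMEDIATE")
            return True
        except sqlite3.OperationalError:
            # Lock is held elsewhere (for example, a link being visited).
            time.sleep(pause)
    return False


# A file-backed database so that two connections share the same data.
path = os.path.join(tempfile.mkdtemp(), "demo.db")

visitor = sqlite3.connect(path, isolation_level=None)
visitor.execute(
    "CREATE TABLE url (short_url TEXT PRIMARY KEY, clicks INTEGER NOT NULL)"
)
visitor.execute("INSERT INTO url VALUES ('abc', 0)")

# timeout=0.1 makes each lock attempt give up after about 100 ms.
migrator = sqlite3.connect(path, isolation_level=None, timeout=0.1)

# A visit is in progress: the visitor holds the write lock.
visitor.execute("BEGIN IMMEDIATE")
visitor.execute("UPDATE url SET clicks = clicks + 1 WHERE short_url = 'abc'")

blocked = begin_with_retry(migrator)   # gives up while the visit is open
visitor.execute("COMMIT")
acquired = begin_with_retry(migrator)  # succeeds once the lock is free
migrator.execute("ROLLBACK")

print(blocked, acquired)

visitor.close()
migrator.close()
```

Running the migration during a low-traffic window would reduce the number of retries needed.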