Flyway Database Migrations - bcgov/nr-compliance-enforcement GitHub Wiki

Automated database migrations are handled by Flyway. Here's how Flyway is configured for the C&E project:

Configuration: Flyway is configured to check the backend/db/migrations folder for sql scripts of the format V{#.#.#}__{description}.
- The first number is the major version of the application (at the time of writing this article we are still working on MVP aka version 0)
- The second number is the number of the current sprint
- The third number is an incremental number for builds within the sprint
Version Control: Flyway maintains a table named "flyway_schema_history" to keep track of the applied migrations.
Migration Workflow: When the application starts or when triggered manually, Flyway compares the current state of the database with the available migration scripts.
- Flyway checks the schema history table to determine the last applied version.
- It scans the specified migration scripts location (typically a directory) for new scripts that have not been applied yet.
- It orders the new scripts based on their version numbers and applies them sequentially.
Migration Execution: For each migration script, Flyway executes the SQL statements within the script against the target database. This can involve creating or modifying database tables, views, indexes, procedures, or inserting/updating data.
Schema History Update: After successfully applying a migration script, Flyway updates the schema history table, recording the script's version and checksum as applied.
Repeatable Migrations: Flyway also supports "repeatable" migrations, which are applied every time the database changes are synchronized. These migrations are typically used for managing data changes that cannot be expressed purely in SQL, such as stored procedures or triggers. Repeatable scripts are in the format of R__{description}.
Best Practices / Conventions:

DDL Operations such as Creating Tables, Altering Tables, etc. MUST go in a new versioned script (V{#.#.#}__{description}) as these operations often cannot be re-run.
DML Operations that insert or modify code tables that are stored in redux should go in a repeatable script {R__{table_name}}.sql and be written such that the data will be inserted if it doesn't exist, but updated if it does exist. See the following example:

INSERT INTO shared.agency_code (
    agency_code, short_description, long_description, display_order, active_ind, create_user_id, create_utc_timestamp, external_agency_ind
) VALUES
    ('PARKS', 'BC Parks', 'BC Parks', 10, TRUE, 'FLYWAY', NOW(), false),
    ('COS', 'COS', 'Conservation Officer Service', 20, TRUE, 'FLYWAY', NOW(), false),
    ('EPO', 'CEEB', 'Compliance and Environmental Enforcement Branch', 30, TRUE, 'FLYWAY', NOW(), false),
    ('ECCC', 'Environment and Climate Change Canada', 'Environment and Climate Change Canada', 40, TRUE, 'FLYWAY', NOW(), true),
    ('DFO', 'Fisheries and Oceans Canada', 'Fisheries and Oceans Canada', 50, TRUE, 'FLYWAY', NOW(), true),
    ('NROS', 'Natural Resource Officer Service', 'Natural Resource Officer Service', 60, TRUE, 'FLYWAY', NOW(), true),
    ('NRS', 'Natural Resource Sector', 'Natural Resource Sector', 70, TRUE, 'FLYWAY', NOW(), true),
    ('OTH', 'Other', 'Other', 80, TRUE, 'FLYWAY', NOW(), true),
    ('POL', 'Police', 'Police', 90, TRUE, 'FLYWAY', NOW(), true)
ON CONFLICT (agency_code) DO UPDATE SET
    short_description = EXCLUDED.short_description,
    long_description = EXCLUDED.long_description,
    display_order = EXCLUDED.display_order,
    active_ind = EXCLUDED.active_ind,
    update_user_id = 'FLYWAY',
    update_utc_timestamp = NOW();

** As an alternative, scripts can be added to R__Create-Test-Data.sql as this script will increment the code table version value that will prompt a redux refresh however this approach should be deprecated.

Updates to stored procedures should go in a repeatable script {R__{procedure_name}}.sql to ensure that there are not multiple scripts updating the same procedure
Before making a new repeatable script check that a script that doesn't modify this data doesn't already exist as the execution order of the repeatable scripts is not guaranteed.
When referencing a table where the primary key column name matches the table name (with or without a suffix like _code), the foreign key column in the referencing table must be given a different name. For code tables this should be done by removing the _code from the foreign key column name (e.g. use case_status instead of case_status_code) This is done in order to prevent Prisma from generating redundant or awkward relation property names and to ensure clean, predictable Prisma models.
When creating a logical link between two tables in different schemas the column name should be suffixed with _ref. In addition, the comment on the column should clearly indicate the schema, table and column the reference is referring to. This is done in order to clearly indicate a logical cross-schema link, and including the full reference in the column comment ensures clarity and maintainability where foreign key constraints can't be enforced.
This process has been built into docker-compose as well as the C&E deployment pipeline.