An Introduction to Liquibase: Version Control for Your Database

1. What is Liquibase?

At its core, Liquibase is an open-source, database-independent tool for managing and versioning database schema changes.

Think of it like Git, but for your database. While you use Git to track changes to your source code, you use Liquibase to track changes to your database structure (tables, columns, indexes, etc.).

It works by using descriptive, human-readable files to define your database modifications. These files are called changelogs.

Key Concepts:

Changelog: An ordered collection of database changes, written in XML, YAML, JSON, or SQL. A project typically has one master changelog (e.g., master.xml) that includes other changelog files for organization.

Changeset: A single, atomic change to your database. Each changeset is uniquely identified by the combination of its id, its author, and the path of the changelog file that contains it. It represents one logical modification, like creating a table, adding a column, or inserting data.

Example of a changeset in XML:

<changeSet id="1" author="john.doe">
    <createTable tableName="users">
        <column name="id" type="int" autoIncrement="true">
            <constraints primaryKey="true" nullable="false"/>
        </column>
        <column name="username" type="varchar(50)">
            <constraints nullable="false" unique="true"/>
        </column>
    </createTable>
</changeSet>

Tracking Tables: When Liquibase runs, it creates two special tables in your database:
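1. DATABASECHANGELOG: This table keeps a record of every changeset that has already been executed, along with a checksum of its content. Before applying changes, Liquibase checks this table, skips any changeset already recorded there, and uses the checksum to detect changesets that were edited after they ran.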
2. DATABASECHANGELOGLOCK: This table prevents multiple instances of your application from trying to update the database at the same time, ensuring schema changes are applied safely.

2. Why is it Important to Use Liquibase?

Managing database changes manually with raw SQL scripts is risky, error-prone, and doesn't scale well. Liquibase solves these problems by providing several critical benefits:

a. Consistency and Reliability Across Environments

Your application runs in multiple environments (developer machines, testing, staging, production). Liquibase applies the same ordered set of changesets in each of them, so every database converges on the same schema. This eliminates "it works on my machine" problems caused by database differences.

b. Version Control for Your Database Schema

Your database structure is just as critical as your application code. By keeping your changelog files in your Git repository alongside your code, you create a single source of truth. You can see exactly how the database evolved over time, who made which change, and why.

c. Improved Team Collaboration

When multiple developers are working on features that require database changes, conflicts can easily arise. Liquibase provides a structured way to manage these changes. Each developer can work on their changesets in separate files, which are then ordered in a master changelog. This reduces merge conflicts and makes the execution order explicit.

d. Database Independence

You can write your changesets in an abstract, database-agnostic format (like XML, YAML, or JSON). Liquibase then generates the correct SQL for your specific database (e.g., PostgreSQL, MySQL, Oracle, SQL Server). This makes it much easier to support multiple database vendors or migrate from one to another.
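As a small illustration of the database-agnostic format, the changeset below uses only generic column types and leaves the vendor-specific SQL to Liquibase. It is a sketch, not part of this project's schema: the products table, the changeset id, and the author are made up for the example, and the changeset is meant to sit inside a databaseChangeLog root element.

```xml
<changeSet id="create-products-table" author="jane.doe">
    <createTable tableName="products">
        <column name="id" type="bigint" autoIncrement="true">
            <constraints primaryKey="true" nullable="false"/>
        </column>
        <!-- Generic types: Liquibase translates them for the target database -->
        <column name="active" type="boolean" defaultValueBoolean="true"/>
        <column name="created_at" type="datetime"/>
    </createTable>
</changeSet>
```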

e. Automation and CI/CD Integration

Liquibase integrates seamlessly into automated build and deployment pipelines. You can configure it to run automatically when your application starts up or as a step in your CI/CD process. This automates database migrations, making deployments faster, safer, and more repeatable.
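One way to wire Liquibase into a build is the liquibase-maven-plugin. The fragment below is only a sketch, assuming the project is built with Maven: the changelog path and the db.* and liquibase.version properties are placeholders that you would define yourself or have your pipeline supply.

```xml
<!-- Fragment for the <build><plugins> section of a pom.xml -->
<plugin>
    <groupId>org.liquibase</groupId>
    <artifactId>liquibase-maven-plugin</artifactId>
    <!-- Placeholder property: set it to the Liquibase version you use -->
    <version>${liquibase.version}</version>
    <configuration>
        <!-- Assumed location of the master changelog -->
        <changeLogFile>db/changelog/master.xml</changeLogFile>
        <!-- Hypothetical connection settings, injected by the pipeline -->
        <url>${db.url}</url>
        <username>${db.username}</username>
        <password>${db.password}</password>
    </configuration>
</plugin>
```

With this in place, a pipeline step can run mvn liquibase:update to apply any pending changesets.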

f. Traceability and Easy Rollbacks

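Because every change is tracked, you have a complete audit trail of your database schema. If a deployment goes wrong, Liquibase can help you revert the database to its previous state. It generates rollbacks automatically for many change types (for example, createTable and addColumn), while others, such as raw SQL or data inserts, need an explicit rollback block in the changeset.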
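Here is a minimal sketch of an explicit rollback, assuming the users table from the earlier example. Liquibase cannot work out on its own how to undo an insert, so the changeset says how. The id, author, and data are illustrative only.

```xml
<changeSet id="seed-admin-user" author="jane.doe">
    <!-- Forward change: insert a default account (illustrative data) -->
    <insert tableName="users">
        <column name="username" value="admin"/>
    </insert>
    <!-- Explicit undo: delete the row that the insert above created -->
    <rollback>
        <delete tableName="users">
            <where>username = 'admin'</where>
        </delete>
    </rollback>
</changeSet>
```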

3. What Else? Key Concepts & Best Practices

To use Liquibase effectively, it's helpful to understand a few more concepts:

a. Use a Master Changelog

Don't put all your changesets in one massive file. The best practice is to create a master.xml (or .yml) changelog that simply includes other changelog files, one for each feature or release. This keeps your migrations organized and easy to manage.

Example master.xml:

<databaseChangeLog ...>
    <include file="db/changelog/01-create-initial-schema.xml"/>
    <include file="db/changelog/02-add-user-roles.xml"/>
    <include file="db/changelog/03-feature-product-tables.xml"/>
</databaseChangeLog>

b. One Logical Change per Changeset

A changeset should be atomic. For example, creating a table should be one changeset. Adding a column to that table should be a second changeset. This granularity is important because:

Rollbacks are cleaner: You can roll back one specific change without affecting others.
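Recovery is easier: Liquibase tries to run each changeset in a single transaction, but databases that auto-commit DDL statements (such as MySQL and Oracle) cannot undo a changeset that fails halfway through. With only one change per changeset, a failure never leaves a changeset partially applied.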
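The table-then-column example above could look like this. It is a sketch with a hypothetical orders table, and both changesets belong inside a databaseChangeLog root element.

```xml
<!-- Changeset 1: create the table and nothing else -->
<changeSet id="create-orders-table" author="jane.doe">
    <createTable tableName="orders">
        <column name="id" type="int" autoIncrement="true">
            <constraints primaryKey="true" nullable="false"/>
        </column>
    </createTable>
</changeSet>

<!-- Changeset 2: a later, separate change to the same table -->
<changeSet id="add-orders-status" author="jane.doe">
    <addColumn tableName="orders">
        <column name="status" type="varchar(20)"/>
    </addColumn>
</changeSet>
```

If the second changeset fails, the first one is already recorded in DATABASECHANGELOG and is not run again on the next attempt.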

c. Preconditions: Defensive Database Changes

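Liquibase allows you to specify preconditions for a changeset. The changeset will only run if the precondition is met. What happens otherwise is controlled by the onFail attribute: HALT (the default) stops the update, while MARK_RAN skips the changeset and records it as executed. This is a powerful tool for making your migrations more robust.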

Example: Only add a column if the table already exists.

<changeSet id="5" author="jane.doe">
    <preConditions onFail="MARK_RAN">
        <tableExists tableName="users"/>
    </preConditions>
    <addColumn tableName="users">
        <column name="last_login" type="timestamp"/>
    </addColumn>
</changeSet>
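Preconditions can also be combined. The sketch below goes one step further than the example above: it adds a column only if the table exists and the column is not already there. The email column and the changeset id are hypothetical.

```xml
<changeSet id="add-users-email" author="jane.doe">
    <preConditions onFail="MARK_RAN">
        <!-- Sibling preconditions are combined with AND by default -->
        <tableExists tableName="users"/>
        <not>
            <columnExists tableName="users" columnName="email"/>
        </not>
    </preConditions>
    <addColumn tableName="users">
        <column name="email" type="varchar(255)"/>
    </addColumn>
</changeSet>
```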

d. Contexts and Labels

These advanced features allow you to selectively run changesets based on the environment or other criteria.

Contexts: Useful for environment-specific changes (e.g., only run a changeset in test or prod).
Labels: Useful for categorizing changesets so that a run can select them with a label filter (e.g., apply only the changesets labeled v2.1-feature).
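Both are set as attributes on the changeset. A minimal sketch, assuming the users table from the earlier examples; the ids, author, and data are illustrative.

```xml
<!-- Skipped when the update runs with a different context, e.g. prod -->
<changeSet id="seed-test-user" author="jane.doe" context="test">
    <insert tableName="users">
        <column name="username" value="test-user"/>
    </insert>
</changeSet>

<!-- Tagged with a label so a label filter can select it -->
<changeSet id="add-users-display-name" author="jane.doe" labels="v2.1-feature">
    <addColumn tableName="users">
        <column name="display_name" type="varchar(100)"/>
    </addColumn>
</changeSet>
```

Keep in mind that when no context is supplied at run time, Liquibase runs every changeset regardless of its context, so environments that must skip test data should always pass one.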