June 24, 2024
Ensuring quality database changes in CI/CD pipelines
This is one section of our complete 12-part guide to CI/CD for databases - get your copy.
In part 3 and part 4 of this series, we talked about defining batches of changes and then processing those batches into deployable packages of working changes, separated from the source code repository where they were created. This mirrors Continuous Integration (CI), where we take source code changes and process them into binaries. The processing step provides an opportunity to add quality checks and improve the efficiency of the pipeline by catching problems sooner. So, before we move on to the deployment phase for our packaged batches, we should examine what opportunities exist to apply CI-style quality checks to our database batches.
Three types of checks in the CI process
As background, let’s think about the typical checks applied in a CI process and how they might map to a group of database changes. The parallels are easiest to see if we break the change quality checking part of CI into three phases: before, during, and after. Then, laying application code changes and database changes side by side, we get a table like this:

| Phase | Application code changes | Database changes |
| --- | --- | --- |
| Before | Static analysis and linting against coding standards | Scanning the SQL for standards violations, anti-patterns, and security weaknesses |
| During | Compiling and building the code into deployable binaries | Applying the batch of changes to a production-representative database |
| After | Running automated tests against the build | Inspecting the post-change database for health and testing stored procedures and functions |
Using CI for quality database changes
The idea of using the CI processing step for quality is not new — it was added to the CI practice almost immediately. The concept comes directly from Lean manufacturing processes and reflects the idea that it is easier and cheaper to build something well — with quality — rather than to build something poorly and then invest in a heavy inspection and rework process.
There is an adage: “You cannot test quality in; you can only build quality in.” It sounds obvious that constantly rebuilding and reinspecting the same thing just to get it to a deployable state is inefficient and ineffective, yet many companies do exactly that.
Beyond improving overall efficiency, CI also improves the experience of an individual trying to get database changes through a pipeline. The quality automation in the CI process gives them immediate feedback on whether a change is good or bad. They can then remediate the problem without having wasted a lot of their own time or, more importantly, other people’s time.
Automation is the key to quality
With good automation, it is relatively easy to add quality checks by plugging in additional steps or tools in any given phase.
We’ve mainly focused on the basic functionality of the During phase in section 4 of this guide. Unfortunately, “Do the changes work?” is a relatively low bar for quality. It is far too easy for changes that “work” just fine to create security holes or performance problems in our databases. Weighed against that is the time and complexity of the checks required to ensure that our “working” changes are also “safe”.
The best way to balance speed and quality is through automation: computers are simply faster at these inspections than humans.
Setting up a database change automation strategy
“Before” phase
The strategy for automation is straightforward. You begin in the Before phase with the easy checks: the ones that run quickly and require little effort. In most CI scenarios, this means one or more code scans. In this phase, you are looking for adherence to the coding standards defined by your architects, generally known anti-patterns, known security weaknesses, and any application-specific items.
For example, you might scan the code going into a globalized database for non-Unicode variable declarations. Checks like these rely on relatively lightweight tools focused on text pattern-matching in the SQL itself; they are easy to integrate into a CI tool and require only simple automation to run.
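To make this concrete, here is a minimal sketch of a Before-phase scan in Python. It assumes SQL Server-style type names, where VARCHAR and CHAR are non-Unicode and NVARCHAR and NCHAR are their Unicode equivalents; the single rule and the directory layout are hypothetical stand-ins for a real rule set.

```python
import re
import sys
from pathlib import Path

# Hypothetical rule for a globalized database: flag non-Unicode string
# declarations (VARCHAR/CHAR) and require the Unicode equivalents
# (NVARCHAR/NCHAR). The word boundary keeps NVARCHAR/NCHAR from matching.
NON_UNICODE = re.compile(r"\b(VARCHAR|CHAR)\s*\(", re.IGNORECASE)

def scan(sql_dir: str) -> int:
    """Scan every .sql file under sql_dir; return the violation count."""
    violations = 0
    for path in Path(sql_dir).rglob("*.sql"):
        for lineno, line in enumerate(path.read_text().splitlines(), start=1):
            if NON_UNICODE.search(line):
                print(f"{path}:{lineno}: non-Unicode declaration: {line.strip()}")
                violations += 1
    return violations

if __name__ == "__main__":
    # Exit non-zero if any violations are found, so the CI stage fails
    # before anything is deployed to a database.
    sys.exit(1 if scan(sys.argv[1] if len(sys.argv) > 1 else ".") else 0)
```

A real scanner, or an off-the-shelf SQL linter, would carry many such rules, but the shape is the same: cheap text checks that fail fast before anything touches a database.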
“During” phase
The idea here is to verify that the proposed batch of changes actually works in a production-representative database. This is simple to automate insofar as it should use exactly the same deployment process you will eventually use to deploy the batches into Production. While simple in concept, as discussed in section 4, there is more depth to this phase than just running the scripts, and we will discuss that in a later section.
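As a sketch, the During-phase check can be as simple as invoking your normal deployment command, pointed at the CI database instead of Production. This example assumes a Liquibase-based deployment; the changelog name and connection details are placeholders for values a real pipeline would inject from configuration or secrets.

```python
import subprocess
import sys

# Run the same deployment command that will later run against Production,
# pointed at a production-representative CI database. The URL and
# changelog file below are placeholders.
result = subprocess.run(
    [
        "liquibase",
        "--changelog-file=changelog.xml",
        "--url=jdbc:postgresql://ci-db.example.internal:5432/appdb",
        "update",
    ],
    capture_output=True,
    text=True,
)
print(result.stdout)
if result.returncode != 0:
    # A failed deployment fails the During-phase check immediately.
    print(result.stderr, file=sys.stderr)
    sys.exit(result.returncode)
```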
“After” phase
Analyzing what the production-representative database looks like after the changes have been processed is more complex. You have to inspect the database after the changes have run to ensure that it is “healthy”. At a minimum, this means examining the post-change database and comparing it to an expected model or pattern.
For example, checking for indexes on important fields, looking for orphaned objects, and so on. It also means thinking about running tests against stored procedures and functions to ensure they respond as expected to test data.
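Here is one illustrative After-phase inspection, assuming a PostgreSQL target and the psycopg2 driver: verify that every base table has a primary key. The connection string and this specific health rule are examples, not a complete inspection suite.

```python
import sys
import psycopg2  # assumes a PostgreSQL target with the psycopg2 driver

# One illustrative "health" rule: every base table should have a primary
# key. The connection string is a placeholder for pipeline secrets.
QUERY = """
SELECT t.table_schema, t.table_name
FROM information_schema.tables t
LEFT JOIN information_schema.table_constraints c
  ON c.table_schema = t.table_schema
 AND c.table_name = t.table_name
 AND c.constraint_type = 'PRIMARY KEY'
WHERE t.table_type = 'BASE TABLE'
  AND t.table_schema NOT IN ('pg_catalog', 'information_schema')
  AND c.constraint_name IS NULL
"""

with psycopg2.connect("dbname=appdb host=ci-db.example.internal") as conn:
    with conn.cursor() as cur:
        cur.execute(QUERY)
        missing = cur.fetchall()

for schema, table in missing:
    print(f"Table {schema}.{table} has no primary key")

# Fail the pipeline stage if the post-change database is not healthy.
sys.exit(1 if missing else 0)
```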
The After phase requires some thought, as the checks can rapidly become very specific to a database, the application it supports, or both. The challenge with automated inspection in this phase is that it requires a strong ability to interact with the database being inspected. That capability, in turn, implies much more sophisticated tooling. The benefit is that such tools will catch much subtler problems; the tradeoff is that they are usually harder to integrate into your toolchain and require more sophisticated automation to run efficiently.
Taking a structured approach with modular tooling will enable you to build smart quality checks into your CI process for database changes. The efficiency of your final pipeline will be defined by how well your definition of change quality answers all three of the following questions (a sketch chaining the three phases together follows the list):
- Do the inbound code changes meet our coding standards?
- Do the changes work?
- Do the changes, when applied to a database, leave that database in a state where it is healthy and usable for our application?
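Tying it together, a minimal orchestration sketch might run the three phases in order and stop at the first failure, so feedback arrives as early and cheaply as possible. The script names here are hypothetical stand-ins for the checks described above.

```python
import subprocess
import sys

# Hypothetical phase scripts standing in for the checks sketched above.
PHASES = [
    ("Before: coding standards", ["python", "scan_sql_standards.py", "sql/"]),
    ("During: changes work", ["python", "deploy_to_ci_db.py"]),
    ("After: database health", ["python", "inspect_db_health.py"]),
]

for name, cmd in PHASES:
    print(f"Running {name} ...")
    if subprocess.run(cmd).returncode != 0:
        # Stop at the first failed phase; later, more expensive checks
        # never run against a batch that is already known to be bad.
        print(f"FAILED: {name}", file=sys.stderr)
        sys.exit(1)

print("All quality checks passed; batch is ready for deployment.")
```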
It requires some initial investment to elevate your definition of quality to a higher level than ‘it works’, but the payoff in efficiency and productivity makes it worthwhile – especially when you make your deployment process reliable and predictable.
Ready to dive into all 12 parts of the complete CI/CD for databases guide? It covers:
- The goals & challenges of designing CI/CD pipelines
- Batching database changes
- Automating database “builds”
- Ensuring quality database changes
- Reliable and predictable database deployments
- Trustworthy database environments
- Sandbox databases
- Database governance for consistency, quality, and security
- Observability for database CI/CD
- Measuring database CI/CD pipeline performance
- Handling and avoiding problems with database changes
- Bring CI/CD to the database with Liquibase