August 30, 2024
Overcoming the challenges of large-scale database change management
When one of the world’s largest financial institutions experienced a sudden performance issue with a critical Liquibase command, our team sprang into action. What should have taken the customer seconds was dragging on for 15 minutes or more, revealing a significant challenge in handling massive deployments with tens of thousands of ChangeSets.
This unexpected slowdown highlighted a crucial pain point in rapidly scaling database pipelines. Although Liquibase is known for its ability to bring DevOps automation, governance, and observability to database pipelines of any type, architecture, and scale, this issue showed us there was room to improve.
Our engineering team quickly replicated the issue, identified the root cause, and delivered a solution to every customer. Here’s how we tackled the problem to ensure Liquibase continues to support ever-expanding data stores with speed and reliability.
Replicating the issue
To replicate our Liquibase Pro users’ issue – slower performance on larger datasets – we could run a simple test:
- Deploy a large number of ChangeSets to the database
- Run a simple command like liquibase release-locks
However, deploying tens of thousands of ChangeSets from scratch would be time-consuming and unnecessary. Instead, we used the Structured Logs provided by the customer to identify the underlying issue: querying the DATABASECHANGELOG (DBCL) table.
So, to replicate the customer’s performance issue, we didn’t create 10,000+ ChangeSets – we just filled the DATABASECHANGELOG table with test entries:
-- Populate the DATABASECHANGELOG table with 40,000 synthetic entries to
-- simulate a large deployment history without running any ChangeSets.
BEGIN
  FOR i IN 1..40000 LOOP
    INSERT INTO DATABASECHANGELOG(ID, AUTHOR, FILENAME, DATEEXECUTED, ORDEREXECUTED, EXECTYPE, MD5SUM, DESCRIPTION, COMMENTS, TAG, LIQUIBASE, CONTEXTS, LABELS, DEPLOYMENT_ID)
    VALUES (TO_CHAR(i), 'your.name', 'big.sql', TIMESTAMP '2024-04-24 16:12:31.507774', 100, 'EXECUTED', '9:22921d38f361c5f294e6a5092bb1b654', 'sql', 'example comment', NULL, 'DEV', 'example-context', 'example-label', '3975120367');
  END LOOP;
  COMMIT; -- persist the test rows so Liquibase sees them from its own session
END;
/
Now, with our DBCL table populated with generated entries, we can run our command: liquibase release-locks.
$ time liquibase release-locks
Starting Liquibase at 11:55:08 (version [local build] #0 built at 2024-04-24 16:46+0000)
Liquibase Version: [local build]
Liquibase Pro [local build] by Liquibase licensed to LB Internal Engineering License until Tue Apr 30 19:00:00 CDT 2024
Successfully released all database change log locks for '***@jdbc:oracle:thin:@***.rds.amazonaws.com:***'
Liquibase command 'releaseLocks' was executed successfully.
real 3m20.455s
And there we have it – more than three minutes to release locks on a database with 40,000 ChangeSets. So how do we fix it and bring these minutes back down to seconds?
Fixing change command delays on high-volume databases
To find a solution, we turned to the logs.
The logs pointed us to a query we had modified during the CheckSum version upgrade. The change had seemed harmless at the time: to answer the question “have all of the CheckSums in the database been updated?”, the old code checked only the first row of the table, which looked like an obvious error. Surely modifying the query to check all of the rows would be a simple fix to an obvious bug!
We realized that loading the entire contents of the DBCL table into memory just to check the MD5SUM column introduced a big performance issue. Fortunately, this big issue also needed only a simple fix: properly filtering our query to return only the data we’re interested in.
We didn’t actually care about the entire table, only the rows that did not match the current checksum version. So we added a filter to the query, as sketched below.
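To illustrate the shape of the change – this is a minimal JDBC sketch, not Liquibase’s actual internal code, and findOutdatedChecksums is a hypothetical name – note that MD5SUM values carry a checksum version prefix (like the 9: in our test data above), so the database itself can exclude rows that are already current:
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class ChecksumScanSketch {
    // Hypothetical sketch: instead of SELECTing all of DATABASECHANGELOG and
    // checking MD5SUM in memory, push the filter into SQL so only rows with
    // an outdated checksum version ever leave the database.
    static void findOutdatedChecksums(Connection conn) throws SQLException {
        String sql = "SELECT ID, AUTHOR, FILENAME, MD5SUM "
                   + "FROM DATABASECHANGELOG "
                   + "WHERE MD5SUM IS NOT NULL AND MD5SUM NOT LIKE ?";
        try (PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setString(1, "9:%"); // current checksum version prefix
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    // Only ChangeSets still needing a checksum upgrade arrive here.
                }
            }
        }
    }
}
With the filter in place, the liquibase release-locks command performs much better: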
$ time liquibase release-locks
Starting Liquibase at 12:18:38 (version [local build] #0 built at 2024-04-24 17:15+0000)
Liquibase Version: [local build]
Liquibase Pro [local build] by Liquibase licensed to LB Internal Engineering License until Tue Apr 30 19:00:00 CDT 2024
Successfully released all database change log locks for '***@jdbc:oracle:thin:@***.rds.amazonaws.com:***'
Liquibase command 'releaseLocks' was executed successfully.
real 0m4.632s
Under five seconds, down from over three minutes – fantastic!
Now that the problem was fixed, we dug deeper to ask: why was it so slow to begin with? Loading 40,000 rows into memory might not be fast, but it definitely shouldn’t take three minutes.
This, too, was a simple fix: increase the fetch size on our queries. The fetch size tells the JDBC driver how many rows to return per round trip to the database. The issues were being reported on Oracle, and Oracle’s driver defaults to a fetch size of 10 – meaning every 10 rows requires a new communication with the database, so reading our 40,000-row table took roughly 4,000 round trips.
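In plain JDBC terms – again a minimal sketch rather than Liquibase’s internal code, with countChangeSets as a hypothetical helper – the fix amounts to a single setFetchSize call before iterating the results:
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class FetchSizeSketch {
    static int countChangeSets(Connection conn) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            // Ask the driver for 1000 rows per round trip instead of Oracle's
            // default of 10: 40,000 rows now means ~40 trips, not ~4,000.
            stmt.setFetchSize(1000);
            int rows = 0;
            try (ResultSet rs = stmt.executeQuery("SELECT ID FROM DATABASECHANGELOG")) {
                while (rs.next()) {
                    rows++; // iterating transparently triggers each batched fetch
                }
            }
            return rows;
        }
    }
}
The fetch size is only a hint to the driver, so a larger value simply trades a modest amount of client-side memory for far fewer network round trips.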
We opted to set the fetch size to a default of 1000 (for all databases, not just Oracle), and immediately observed notable performance improvements on all Liquibase commands. The liquibase history command, for instance, saw a 77% decrease in the time it took to run. The complete set of commands and improvements includes:
- release-locks: 65% faster
- validate: 77% faster
- update-sql: 42% faster
- update: 37% faster
- history: 77% faster
- status: 78% faster
These speed and efficiency gains add up to significant time and resources saved for enterprises managing thousands – or tens of thousands – of ChangeSets.
Supporting speed & agility for large-scale database pipelines
Since solving the issue that caused performance degradation on massive deployments – 10,000+ ChangeSets – our engineering team has been looking for other ways to support and enhance database scalability. As we innovate, we keep in mind what we learned from bug fixes like this one:
- The importance of efficiency at scale: Small inefficiencies, like querying unnecessary data, can cause major performance issues in large-scale systems, so always optimize your code for scalability.
- Intentional, disciplined bug-fixing: Fixing bugs outside a structured process can introduce new problems, underscoring the importance of sticking to documented tickets and thorough testing.
- Continuous improvement: Every performance issue is an opportunity for refinement, so embrace iterative improvement to make your systems more robust over time.
- A constant scalability mindset: As your systems grow, regularly reassess your tools and processes to ensure they scale effectively and continue to meet evolving demands.
Engineering teams can take this advice for their own scalability initiatives – and have confidence that Liquibase can handle their rapidly increasing database change management needs.
Download the latest version of Liquibase to manage high-velocity change workflows across 60+ supported databases.