April 17, 2024
Engineering efficiency at Liquibase: Improving AWS S3 integration tests with MinIO
As part of our commitment to a progressive DevOps culture, Liquibase continuously tries to “shift left” by moving testing earlier in the software development life cycle, and our S3 extension is no exception to that rule. However, the integration tests for the S3 extension have never been reliable, and we found that switching to MinIO significantly improved both the reliability and the efficiency of those tests.
In November 2022, Liquibase developed an extension that integrates with AWS S3, and we have been running nightly integration tests against S3 ever since. The tests run on a matrix of Windows and Linux GitHub Actions runners to ensure that the extension handles paths correctly on both operating systems and on each version of Java that Liquibase supports.
In short, this meant we were running our integration test suite against six combinations of operating system and Java version each night, all against the same S3 bucket. That concurrency often caused test runs to interact unpredictably, and although the vast majority of our tests were designed not to interfere with one another, some test cases simply could not be isolated that way.
These tests were markedly unreliable. In fact, they were so unreliable that the engineering team had fallen into a problematic pattern of simply expecting the nightly test results to fail.
Most of the issues we encountered could have been solved with improvements to our test infrastructure, like separate buckets for each test runner (to avoid unwanted interactions between test iterations) and test retries (for example, on the test that uploads 1001 files to exercise our pagination logic, which often failed due to network errors). Though these changes could have helped our situation, they were somewhat unwieldy.
We didn’t want to require different buckets for each test iteration because GitHub Actions supports a great deal of concurrency, and this would mean having an inordinate number of buckets to provision. Test retries were less than ideal too, because they delay the results. If the tests are failing due to interference from other iterations, then a retry would likely experience the same interference.
A more sophisticated solution is to replace the AWS stack entirely and run the tests against an S3-compatible but self-hosted tool.
Options to emulate S3
After testing a variety of tools, we settled on MinIO. MinIO is:
- Cross-platform (runs natively on Linux and Windows)
- S3 compatible
- Open source
- Fast
After replacing our AWS S3 stack in our test environment with MinIO, we saw the following improvements:
- Tests pass every time: The only failures are true failures, not network errors.
- Nightly runs pass every night: Developers now accept the results from nightly runs.
- Tests run more quickly: The suite executes nearly twice as fast as before.
- Less expensive: S3 costs are essentially eliminated, because running locally via MinIO uses no AWS infrastructure.
The new and improved testing process means our S3 integration tests are dependable and trustworthy, continuing our commitment to deliver a product that’s as reliable and easy to use as possible.
If you’re interested in adopting this method for yourself, check out the walkthrough below.
How to use MinIO as the S3 provider for your tests
Liquibase is heavily invested in GitHub Actions, and the S3 extension is no exception in that respect. As mentioned in the previous section, our S3 extension is tested on both Windows and Linux runners. The implementation of MinIO as a replacement for S3 happened in three phases, described in more detail below.
Phase 1: local development
The first step was to set up the local development environment to use MinIO as the S3 backend. This lays the groundwork for preparing the build environment to use MinIO, which comes later.
First, we need to run MinIO locally. For this, we look to Docker Compose, which for MinIO is quite simple. All we need to do is create a docker-compose.yml file with the following contents:
version: '3.8'
services:
  minio:
    image: minio/minio
    container_name: minio
    ports:
      - "9000:9000"
    environment:
      MINIO_ROOT_USER: minio
      MINIO_ROOT_PASSWORD: minio123
    command: server /data # Optionally, you can mount this directory to a volume or bind mount
With MinIO running locally (for example, via docker compose up), we now configure the S3 extension to use it. The S3 extension uses the AWS SDK for Java V2, which supports a concept it calls an “endpointOverride”.
When the S3Client is created, we make some small modifications:
import java.net.URI;
import software.amazon.awssdk.auth.credentials.AwsSessionCredentials;
import software.amazon.awssdk.auth.credentials.StaticCredentialsProvider;
import software.amazon.awssdk.http.urlconnection.UrlConnectionHttpClient;
import software.amazon.awssdk.services.s3.S3Client;

StaticCredentialsProvider staticCredentialsProvider =
        StaticCredentialsProvider.create(
                AwsSessionCredentials.create(ACCESS_KEY, SECRET_KEY, ""));
S3Client s3Client = S3Client.builder()
        .httpClientBuilder(UrlConnectionHttpClient.builder())
        .credentialsProvider(staticCredentialsProvider)
        // Point the SDK at the local MinIO server instead of AWS
        .endpointOverride(URI.create("http://127.0.0.1:9000"))
        // MinIO serves buckets by path rather than by virtual-hosted subdomain
        .forcePathStyle(true)
        .build();
Notes:
- We provide hardcoded credentials via the “credentialsProvider”. In the docker-compose.yml file we created previously, the user is “minio” and the password is “minio123”; these are the ACCESS_KEY and SECRET_KEY, respectively.
- If hardcoding the credentials is undesirable, they could easily be read from environment variables instead (see the sketch below).
- The key line is the endpointOverride, which tells the S3 SDK to use MinIO rather than AWS infrastructure.
- forcePathStyle(true) is needed because MinIO addresses buckets by path under the endpoint rather than as virtual-hosted bucket subdomains.
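For instance, a minimal sketch of the environment-variable approach might look like the following; the MINIO_ACCESS_KEY and MINIO_SECRET_KEY variable names are purely illustrative and not part of our actual configuration:
String accessKey = System.getenv().getOrDefault("MINIO_ACCESS_KEY", "minio");
String secretKey = System.getenv().getOrDefault("MINIO_SECRET_KEY", "minio123");
// Falls back to the docker-compose.yml defaults when the variables are unset
StaticCredentialsProvider staticCredentialsProvider =
        StaticCredentialsProvider.create(
                AwsSessionCredentials.create(accessKey, secretKey, ""));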
Finally, we also had to ensure that we were creating a bucket if it did not already exist:
ListBucketsResponse listBucketsResponse = s3Client.listBuckets();
boolean bucketExists = listBucketsResponse.buckets().stream()
        .anyMatch(b -> b.name().equals(BUCKET));
if (!bucketExists) {
    s3Client.createBucket(CreateBucketRequest.builder().bucket(BUCKET).build());
}
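As a quick sanity check that the client really is talking to MinIO, a simple round trip like the following works. This is only a sketch; the object key and contents are illustrative:
import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.services.s3.model.GetObjectRequest;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;

// Write a small object through the MinIO-backed client, then read it back
s3Client.putObject(
        PutObjectRequest.builder().bucket(BUCKET).key("smoke-test.txt").build(),
        RequestBody.fromString("hello from MinIO"));
String roundTripped = s3Client.getObjectAsBytes(
        GetObjectRequest.builder().bucket(BUCKET).key("smoke-test.txt").build())
        .asUtf8String(); // "hello from MinIO"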
With these small changes, our entire S3 test suite ran against MinIO instead of S3, and our tests passed locally. Great progress!
Phase 2: GitHub Actions - Linux runners
Preparing the Linux GitHub Actions environment to run MinIO is very straightforward.
Simply add another step to your workflow file:
- name: Start MinIO (emulates S3)
  run: |
    wget -q https://dl.min.io/server/minio/release/linux-amd64/minio
    chmod +x minio
    export MINIO_ROOT_USER=minio
    export MINIO_ROOT_PASSWORD=minio123
    ./minio server /tmp/minio &
This code block does the following:
- Downloads MinIO and makes it executable
- Sets the username and password to match the user/pass we established in our docker-compose.yml
- Starts the MinIO server in the background
With this small change to start MinIO prior to running our test suite, our tests were passing on Linux!
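The backgrounded server comes up quickly enough for us, but if a suite ever starts before MinIO is listening, a small readiness wait against MinIO's standard liveness endpoint (/minio/health/live) is a simple safeguard. A minimal sketch, assuming the default port from the step above; the waitForMinio helper is hypothetical:
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Poll MinIO's liveness endpoint until it answers, so tests never race the
// background server start. Assumes the default port 9000 used above.
static void waitForMinio() throws IOException, InterruptedException {
    URL health = new URL("http://127.0.0.1:9000/minio/health/live");
    for (int attempt = 0; attempt < 30; attempt++) {
        try {
            HttpURLConnection connection = (HttpURLConnection) health.openConnection();
            if (connection.getResponseCode() == 200) {
                return; // MinIO is ready
            }
        } catch (IOException notListeningYet) {
            // Server not accepting connections yet; retry below
        }
        Thread.sleep(1000);
    }
    throw new IOException("MinIO did not become ready within 30 seconds");
}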
Phase 3: GitHub Actions - Windows runners
Windows runners in GitHub Actions have a very similar setup process, with one notable exception: the MinIO server must be started as a Windows service, or it will be killed when the step ends.
- name: Start MinIO (emulates S3)
  run: |
    Invoke-WebRequest -Uri "https://dl.min.io/server/minio/release/windows-amd64/minio.exe" -OutFile "C:\minio.exe"
    Invoke-WebRequest -Uri "https://raw.githubusercontent.com/minio/minio-service/master/windows/install-service.ps1" -OutFile "C:\install-service.ps1"
    C:\install-service.ps1
    net start MinIO
This code block does the following:
- Downloads MinIO
- Downloads the MinIO PowerShell script that installs it as a Windows service
- Installs the MinIO service
- Starts the MinIO service
Starting MinIO as a Windows service allows it to continue running between steps in the same workflow.
Now our tests are passing on both Windows and Linux.
As the simplicity and success of the walkthrough above show, choosing MinIO as our S3 replacement was the right call. It provides the cross-platform support and S3 compatibility we needed to resolve these recurring test failures. With a few changes to our test suite and GitHub Actions workflows, we solved our testing conundrum and can move on to our next initiative.