
Monolith to Dockerlith: Learnings from Migrating Our Monolith to Docker

By Steven Uray

Like everyone’s monolith, ours is complex and was once a little bit out of control. Earnest’s monolith is involved in all major aspects of the business. It is responsible for accepting loan applications from clients, making a decision for each of those applicants, and even supporting clients after loans have been signed. These wide-ranging responsibilities mean that our monolith is really five different applications in one body of code.

Over 100 developers have contributed to its codebase since inception. The complexity of so many revisions and updates made it difficult to set up and maintain. Beyond the standard npm libraries, almost a dozen dependencies, from a database to a mock SQS queue to a HashiCorp Vault instance, had to be set up correctly for it to work completely on a developer’s computer. Our engineering teams had come to expect that getting this application set up on a new computer would take at least a week and would require help from multiple developers who had been at the company long enough to acquire the necessary tribal knowledge.

As an engineering team, we needed a way to ensure everyone had a consistent local environment. We wanted it to be quick and easy to set up. Finally, we wanted the local environment to have greater parity with our CI, staging, and production environments. To accomplish these objectives, we turned to Docker, Docker Compose, and The Go Script pattern.

We started by addressing the node_modules folder shared by all five applications. All application containers used the same node_modules folder inside a Docker volume. The containers could be started in any order, and any of them could update the npm dependencies, so it became necessary to ensure that only one container could write to node_modules at a time.

While there are many ways to control the startup order of Docker containers, we chose to write a bash script that locks a file descriptor at runtime, and we execute this script in the entrypoint of each container. After the script has run, it invokes the application’s process and the application container is usable by the developer. An application container’s Docker Compose file looks like this:

And here is the entrypoint script itself.

#!/usr/bin/env bash
# Application containers should not continue starting
# if their dependencies are unverified,
# so if any command in this script fails, the container
# will stop its initialization process and
# an error message will be presented to the user.
set -euo pipefail
# Set variables here
# The resource that needs synchronized write access is the node_modules directory.
# An empty, hidden file .npm_lock is created so that its file descriptor can be locked.
# The file descriptor to be locked is number 200.
# Finally, the .package_md5 file stores a hash string of the package.json file,
# so no call to npm install is made unless the package.json file has changed.
readonly NPM_DIR=/app/src/node_modules
readonly LOCK_FILE=$NPM_DIR/.npm_lock
readonly LOCK_FD=200
readonly MD5_FILE=$NPM_DIR/.package_md5
# Define functions here
# Here is a generic function that locks a file with a file descriptor,
# using the flock unix command.
# Docker containers share the host’s kernel,
# so they can communicate with each other through these file descriptor locks.
lock() {
  # Create the lock file ($1) and open it on the given file descriptor ($2).
  eval "exec $2>$1"
  # Get an exclusive, waiting lock on the file descriptor of the lock file.
  flock --exclusive "$2"
}
# Releases the lock acquired with lock()
unlock() {
  echo "This container is releasing the lock on the npm libraries"
  flock --unlock "$1"
}
# Containers should only install npm libraries if the dependencies have changed
# in package.json. Therefore, package.json has an MD5 hash saved to a shared file
# each time the dependencies are installed.
# If package.json changes due to new dependencies,
# a different MD5 hash will be generated that differs from
# the one that has been saved.
check_npm_libraries() {
  # INVALID if no hash has been saved yet or if package.json no longer matches it.
  if [ ! -f "$MD5_FILE" ] || ! md5sum -c --status "$MD5_FILE"; then
    echo "INVALID"
  else
    echo "VALID"
  fi
}
# If the dependencies have changed, they are updated with a call to “npm install”.
# After this, the hash of the package.json file the dependencies came from
# is saved to the hash file.
install_npm_libraries() {
  npm install grunt
  echo "Installing npm Libraries..."
  npm install --unsafe-perm --loglevel info
  write_md5_sum_to_md5_file
}
write_md5_sum_to_md5_file() {
  echo "package.json md5" $(md5sum ./package.json)
  md5sum ./package.json > "$MD5_FILE"
}
prepare_npm_volume() {
  # Try to get the node_modules lock.
  lock "$LOCK_FILE" "$LOCK_FD"
  # Check the status of the npm libraries now that this container has the lock.
  echo "This container has the lock on the npm libraries."
  npm_library_status=$(check_npm_libraries)
  if [ "$npm_library_status" == "INVALID" ]; then
    install_npm_libraries
  else
    echo "Valid cached npm libraries detected, skipping npm install"
  fi
  # The lock must be explicitly released here,
  # because the process is expected to continue after this script is done.
  unlock "$LOCK_FD"
}
#Execute commands here
prepare_npm_volume
echo "arguments received: $@"
exec "$@"
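
The hash check in check_npm_libraries can be tried in isolation. The following is a minimal sketch, assuming GNU md5sum is available; the scratch directory, the sample package.json contents, and the check helper are hypothetical stand-ins rather than part of our real setup.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch directory standing in for the project and its volume.
workdir=$(mktemp -d)
cd "$workdir"
echo '{"dependencies": {}}' > package.json
md5_file="$workdir/.package_md5"

# Same test as check_npm_libraries: a missing or mismatching hash means INVALID.
check() {
  if [ ! -f "$md5_file" ] || ! md5sum -c --status "$md5_file"; then
    echo "INVALID"
  else
    echo "VALID"
  fi
}

first=$(check)                        # no hash file has been written yet
md5sum ./package.json > "$md5_file"   # what write_md5_sum_to_md5_file does
second=$(check)                       # saved hash matches package.json
echo '{"dependencies": {"left-pad": "1.3.0"}}' > package.json
third=$(check)                        # package.json changed after the hash was saved

echo "$first $second $third"
```

Only the middle check reports VALID, which is the case where the entrypoint skips npm install.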

So the first time a user starts the application containers, one of them grabs the lock and installs the dependencies. The rest wait for it to finish, see that the dependencies are valid, and then turn control over to our grunt startup tasks. Dependencies are checked and updated automatically, and subsequent startups are quick, with no calls to “npm install” until the dependencies change.
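
The startup sequence described above can be simulated without Docker. This is a rough sketch, assuming the util-linux flock command is installed: background subshells play the role of containers, and a marker file plus a short sleep stand in for the real npm install. The names app1 to app3, the marker file, and the log file are all hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir=$(mktemp -d)
lock_file="$workdir/.npm_lock"
marker="$workdir/installed"
log="$workdir/log"

# Each "container" is a background subshell that takes the lock on
# descriptor 200, installs only if nobody has yet, then releases the lock.
start_container() {
  (
    exec 200>"$lock_file"
    flock --exclusive 200
    if [ ! -f "$marker" ]; then
      sleep 0.2                      # stand-in for a slow "npm install"
      touch "$marker"
      echo "$1 installed" >> "$log"
    else
      echo "$1 skipped" >> "$log"
    fi
    flock --unlock 200
  ) &
}

for name in app1 app2 app3; do
  start_container "$name"
done
wait

installs=$(grep -c "installed" "$log")
skips=$(grep -c "skipped" "$log")
echo "installs=$installs skips=$skips"
```

Whichever subshell wins the lock does the install; the other two block on flock, find the marker, and skip it.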

In the event of a container shutdown, networking failure, or Docker daemon shutdown, the lock on the file descriptor is released automatically: the kernel drops a flock lock once the descriptor is closed, which happens when the processes holding it exit. Developers can restart the Docker containers and continue with their workflow to recover from this unexpected failure.
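
The automatic release can be observed with plain processes. The sketch below assumes the util-linux flock command; a killed background process stands in for a container that dies while holding the lock, and the temporary lock file is hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

lock_file=$(mktemp)

# A background process takes the lock and then hangs, like a container
# stuck mid-install. "exec sleep" keeps descriptor 200 in a single process,
# so killing that one process closes the descriptor.
(
  exec 200>"$lock_file"
  flock --exclusive 200
  exec sleep 30
) &
holder=$!
sleep 0.5   # give the holder time to acquire the lock

# While the holder is alive, a non-blocking attempt to lock fails.
if flock --nonblock "$lock_file" true; then
  while_held="acquired"
else
  while_held="busy"
fi

# Simulate an abrupt shutdown of the holder.
kill -9 "$holder"
wait "$holder" 2>/dev/null || true

# No explicit unlock ever ran, yet the lock is free again.
if flock --nonblock "$lock_file" true; then
  after_kill="acquired"
else
  after_kill="busy"
fi

echo "while_held=$while_held after_kill=$after_kill"
```

One caveat the sketch works around: a child process that inherits the locked descriptor keeps the lock alive, which is one reason to release the lock explicitly before handing control to the application.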

In addition to the container startup synchronization system, we have a Docker image that contains the correct versions of node, npm, and other programs. Docker Compose links the application containers, a Postgres container, a mock Amazon SQS queue, and other supporting containers.

We have implemented The Go Script pattern as the last piece of the puzzle to make setting up the application, starting it, and running the tests one-step commands. This pattern is used by almost every project at Earnest, and its implementation in this project brings it in line with the rest of Earnest’s tooling. Developers new to the project can become productive quickly, and all developers can keep their focus on high-level goals instead of low-level implementation details.
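
For readers unfamiliar with the pattern, a go script is a single entry point that maps short subcommands to the longer commands behind them. The following is a minimal sketch and not our actual script: the subcommand names, the Compose service name app, and the DRY_RUN switch are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# ./go: hypothetical single entry point for common project tasks.
set -euo pipefail

# With DRY_RUN=1 the command is printed instead of executed,
# which makes the mapping easy to inspect.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

main() {
  case "${1:-help}" in
    setup) run docker-compose build ;;
    start) run docker-compose up ;;
    # "app" is an assumed service name.
    test)  run docker-compose run --rm app npm test ;;
    *)     echo "usage: ./go {setup|start|test}" ;;
  esac
}

main "$@"
```

With this layout, `./go setup`, `./go start`, and `./go test` are the only commands a new developer needs to remember, and `DRY_RUN=1 ./go test` shows what would run.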

Accomplishing these goals was time-consuming and difficult, but worth it. Team morale improved as a longstanding pain point in the daily life of Earnest software developers was eliminated. Our software developers reported that this tool increased their efficiency by an average of 32.5% when working on the monolith. On an average workday, this tool is used around 200 times by the engineering team.

All 25 current contributors to the project are now working with the same, complete version of the application, and they can easily replicate the behavior of the build, staging, and production environments. And now new hires can get the monolith set up in half an hour, instead of a week.

“Very very very nice work fellas” — An Earnest Software Developer
