5 tips for flexible and scalable CI with Docker
Nulab
March 14, 2016
As Docker notes in its use cases, Continuous Integration and Continuous Delivery (CI/CD) is an excellent way to use the tool outside of the production system. At Nulab, we have been using Docker this way since we began offering pull requests for Backlog, and now we want to share five tips we learned in the process.
Nulab’s CI Environment
This graphic represents an overview of Nulab’s CI Environment:
Jenkins is the center of our CI environment. Backlog and Typetalk trigger jobs in Jenkins. In practice, we use the Jenkins Backlog Plugin and Jenkins Typetalk Plugin in tandem.
Job execution happens on the slave side: EC2 instances handle most of the testing, and a dedicated slave handles special environments such as mobile builds. Currently, we have over 10 servers in the cluster, including the master, and average 250 builds a day.
Now we’ll discuss more detailed points we take care of when operating this CI environment.
Five tips for Docker in CI
1. Keep the slave configuration simple
We set up a slave using Jenkins AWS EC2 Plugin with the following simple rules:
- Use the latest Amazon Linux AMI as-is
- Install only Docker and Docker Compose when starting an EC2 instance
This shortens the slave’s startup time and also creates a CI environment that will work anywhere so long as Docker is running.
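For reference, the EC2 plugin lets you specify an init script that runs on each new instance. A minimal sketch of such a script on Amazon Linux might look like the following; the docker-compose version and download URL are assumptions and should be adjusted to the current release.
#!/bin/bash
# hypothetical init script for a fresh Amazon Linux slave (versions are illustrative)
yum update -y
yum install -y docker
service docker start
usermod -a -G docker ec2-user
# install Docker Compose as a single binary
curl -L "https://github.com/docker/compose/releases/download/1.6.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose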
2. Test in a single Dockerfile
It is not unusual for a test to require a database or other middleware. In general, it's better to have one process per Docker container; for testing, though, we consider it acceptable to run several processes in one container.
In the example below, we install Redis in the Docker container and run it alongside the Java process under test.
Dockerfile example
FROM java:openjdk-8

# install redis
RUN apt-get update && apt-get install -y redis-server
Test job example
docker run ${TEST_IMAGE} bash -c "service redis-server start ; ./gradlew clean test"
This lets us run the tests easily without changing settings such as the database server name and port from what we use in the local development environment.
Docker Compose is another approach. But unless you want to use the Dockerfile for something other than testing, the method described above is usually good enough.
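For comparison, a rough sketch of the Docker Compose alternative is shown below; the service names, image tag, and file layout are illustrative rather than what we actually use.
# docker-compose.yml (hypothetical)
app:
  build: .
  command: ./gradlew clean test
  links:
    - redis
redis:
  image: redis:3
Running "docker-compose run app" would start the linked Redis container and then run the tests, at the cost of the tests connecting to the host name "redis" instead of localhost.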
3. Use cache effectively
To keep build times short, it's important to use caches effectively. Let's look at how to do this from two angles: Docker images and dependent libraries.
Use Docker images from an in-house registry
We run our own in-house registry (see below) and store custom images there. We create these custom images by adding required runtimes such as the JDK, Perl, and Python to publicly available base images.
You should place the in-house registry near the slaves so that custom images download quickly. In our case, the registry is in the AWS Tokyo region, since that's where our slaves run. This shortens build time compared to using a public image, because the image downloads faster and already contains everything the build needs.
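As a concrete illustration, publishing such a custom image looks roughly like the following; the image name and tag are made up, and ${IN_HOUSE_REGISTRY_URL} stands for the address of the in-house registry.
# build a custom image (a public base plus the JDK, for example) and push it to the in-house registry
docker build -t ${IN_HOUSE_REGISTRY_URL}/java-ci:openjdk-8 .
docker push ${IN_HOUSE_REGISTRY_URL}/java-ci:openjdk-8

# slaves in the same region can then pull it quickly
docker pull ${IN_HOUSE_REGISTRY_URL}/java-ci:openjdk-8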
Cache dependent libraries
Caching dependent libraries is the most important factor when building a Java project. We do it in one of the following three patterns.
Host directory pattern
The first pattern uses a host directory as a cache. When running the Docker container, mount the host directory onto the directory inside the container where dependent libraries are stored (see below).
docker run -v ${HOME}/.gradle:/root/.gradle ${TEST_IMAGE} ./gradlew clean test
If the dependent libraries are not yet in the host directory, they are resolved and downloaded on the first run; from then on, the host directory serves as the cache. Java projects tend to share many common libraries, which makes this cache especially effective. A drawback, however, is the possibility of permission issues on the host directory; we address this later in the section about removing containers after a build.
Cache in advance pattern
The second pattern is to build the Docker image with dependent libraries in advance of testing and use the image as a cache.
First, create the Dockerfile:
RUN mkdir -p /opt/app
COPY requirements.txt /opt/app/
WORKDIR /opt/app
RUN pip install -r requirements.txt
COPY . /opt/app
Next, run it:
docker build -t ${TEST_IMAGE} .
docker run ${TEST_IMAGE} py.test tests
This method works best if the dependencies (specified in requirements.txt, above) do not change often. In the example above, we install the dependent libraries with "RUN pip install -r requirements.txt", which means Docker caches that layer until requirements.txt changes. Since everything is included in the Docker image, we avoid the permission problem mentioned earlier, but this method has its own drawback: if you change a single dependency, the entire cache is cleared.
External cache pattern
Finally, we have the external cache pattern. Here we took a hint from Travis CI’s approach. In this method, you make the Dockerfile like this:
RUN mkdir /root/.gradle
RUN cd /root/.gradle; curl -skL https://s3-ap-northeast-1.amazonaws.com/${CACHE_BUCKET}/cache.20151201.tar.gz | tar zxf -
This very simple approach is much faster than having Gradle or sbt resolve and download the dependencies, even for the very first build right after a slave starts. The drawback is that you need to maintain the external cache yourself.
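Maintaining that external cache means periodically re-creating and uploading the archive referenced in the Dockerfile above. A minimal sketch of that step, assuming the AWS CLI is available on the machine doing the upload, might be:
# archive a warmed-up dependency cache and upload it to the cache bucket
tar czf cache.20151201.tar.gz -C ${HOME}/.gradle .
aws s3 cp cache.20151201.tar.gz s3://${CACHE_BUCKET}/cache.20151201.tar.gz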
Each approach has its own advantages and drawbacks. In many cases we can run the tests of a Java project as the root user, so we adopt the first approach without any downside. For Python and Perl projects, however, it is much easier to debug when the dependencies live in the container, so we use the second approach. The third approach is especially useful for Java projects, and we are currently considering combining it with the host directory method.
4. Remove containers after a build
If you don't remove unneeded containers after running tests, they will eventually use up the host's storage. Before removing a container, however, you must save the build result, the test report, and the .war file; otherwise, everything will be gone.
Here are two ways to remove containers after you get the build result:
Mount Jenkins’ workspace
The first method is to mount Jenkins' workspace onto the container's working directory. In the following example, we first set WORKDIR to /opt/app in the image and then run docker.
docker run --rm -v $(pwd):/opt/app ${TEST_IMAGE} ./gradlew clean test
Since build tools usually store their results under the working directory, you don't have to explicitly retrieve the build result from the container; it simply remains in Jenkins' workspace after the test.
However, if the tests need to run as a non-root user, you have to consider the permissions of the host directory inside the Docker container. We ran into this when installing some libraries with npm, or when using a library like testing.postgresql. If you ignore it, problems can occur, such as failing to write the build result or failing the next build altogether.
We solve this permission problem in either of the following two ways.
In the first method, we write the result to a directory that the executing user is allowed to write to during the build, and then copy the file back after the build. This is illustrated below, in a file called run.sh:
su test-user -c "py.test tests --junit-xml=/var/tmp/results.xml"
cp -p /var/tmp/results.xml .
Then we run docker as below:
docker run --rm -v $(pwd):/opt/app ${TEST_IMAGE} ./run.sh
In the second method, we change the owner of the directory to the executing user at the beginning of the build and change it back once the build is done. This is illustrated below, as a different run.sh file:
chown test-user .
su test-user -c "py.test tests"
chown $1 .
Then we run docker as below:
docker run --rm -v $(pwd):/opt/app ${TEST_IMAGE} ./run.sh $(id -u)
Here we pass the host user's uid as an argument to the script. There may be other approaches, such as reading the directory's ownership from inside the container.
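For illustration, a variant of run.sh along those lines, which we have not adopted ourselves, might read the owner of the mounted directory with stat instead of receiving it as an argument:
# hypothetical run.sh variant: determine the host uid from the mounted directory itself
HOST_UID=$(stat -c '%u' .)
chown test-user .
su test-user -c "py.test tests"
chown ${HOST_UID} .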
Get the results after the test run, then remove the container
In the other approach to properly removing containers, we get the results file from the container after the build, like this:
UNIQUE_NAME="TEST_${GIT_COMMIT}_$(date +%s)"
docker run --name=${UNIQUE_NAME} ${TEST_IMAGE} ./gradlew clean test
docker cp ${UNIQUE_NAME}:/opt/app/build/test-result/ test-result
docker rm ${UNIQUE_NAME}
In this approach, you won’t encounter the permissions problem mentioned earlier, but there are a few steps involved in removing the container.
Both methods have advantages and disadvantages, but so far we've adopted the first approach, even though it is a little more complicated, because it is less likely to leave garbage behind.
5. Dockerize tools required for job execution
We created our own tool to upload the build archive (mainly .war files and Play! Framework zip files) and the static resources inside the archive to S3. Originally, we installed the tool on the slave and ran it from the job settings like this:
/usr/local/bin/upload-static-s3 ROOT.war -b ${S3_CDN_BUCKET}
That is what we used to do. Now we create a custom image that includes the tool and set the image's ENTRYPOINT like this:
ENTRYPOINT ["/usr/local/bin/upload-static-s3"]
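Since the tool is a single self-contained binary, the rest of such an image can stay very small. A hypothetical Dockerfile, with the base image and package choices as assumptions, might look like this:
FROM alpine
# CA certificates are needed for HTTPS requests to S3
RUN apk add --update ca-certificates && rm -rf /var/cache/apk/*
COPY upload-static-s3 /usr/local/bin/upload-static-s3
ENTRYPOINT ["/usr/local/bin/upload-static-s3"]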
The image is stored in our in-house registry, and we run docker as below, using the same arguments as the ones we used before:
docker run --rm ${IN_HOUSE_REGISTRY_URL}/upload-static-s3 ROOT.war -b ${S3_CDN_BUCKET}
We originally wrote this tool in Go so it would be easy to install on the slave, but after Dockerizing it, that is no longer necessary. Now we can run the whole CI process of test, build, and upload to S3 with Docker alone.
Advantages of using Docker in CI
After adopting Docker, we saw three advantages:
- We can now run tests for all pull request branches
- We have improved our build performance
- We can now run CI wherever Docker is running
Regarding the first point, slave setup became easier and each job now runs in an isolated environment, so we can automatically run tests for every pull request branch.
Regarding the second point, changing slaves has become easier, so when build performance needs improvement we can simply switch to a higher-spec instance type or add more slaves. As a result, we nearly halved the build time for one project after introducing Docker and changing instance types, as shown in the graph below.
Regarding the third point, since the whole CI process now runs with Docker, we can run it not only on AWS but also in other cloud environments. Our CI/CD environment is indispensable to operating our services, and the ability to rebuild it easily in a different environment is essential for availability.
One disadvantage is that not all of our application developers are familiar with Docker, which makes maintaining the CI environment harder for some team members. But Docker's benefits are significant and extend beyond CI, so we feel it is worth introducing it into our development process and helping the team become familiar with it.
Final thoughts
These days, I have more opportunities to see and hear about how Docker is used in production. However, it's not easy to apply those examples to our own environment, because every project has its own unique conditions, constraints, and features. CI environments, by contrast, tend to be fairly uniform, which makes them the easiest place to start with Docker: it is easier to learn from and adopt practices from other teams' examples. Considering this along with the clear advantages described in this post, CI/CD seems the best place to try Docker in our workflow and for our team to gain useful experience with it.