Fasten Docker build


Context

I recently started working on a microservices project. As a DevOps engineer, my responsibility was to ensure smooth builds and releases. One of the challenges I faced was that building the project's images was painfully slow. Following the true Opstree spirit of continuous improvement, I explored how to fix this and had decent success: I was able to reduce the Docker image build time from about 4 minutes to about 20 seconds. In this blog, I would like to showcase several ways to cut image build time drastically.

The complete code is available in this repository.

Problem statement

I’m using a Spring Boot HelloWorld project to walk you through the problem and then the solutions we will apply.

The root of the problem is that 80-90% of the image build time is spent downloading the dependencies defined in pom.xml. The downloaded dependencies live only inside the image build that fetched them, so every image build starts the download from scratch.

FROM maven:3-jdk-8

LABEL maintainer="opensource@opstree.com"

WORKDIR /usr/src/app

ADD . /usr/src/app
RUN mvn clean package -Dmaven.test.skip=true

ENTRYPOINT ["java","-jar","/usr/src/app/target/helloworld-0.0.1-SNAPSHOT.jar"]

If we build a docker image using the above Dockerfile, it will take close to 3 minutes to build the image.

$ make problem-build-package-with-time
real	3m3.356s
user	0m1.221s
sys	0m1.067s

Solution 1 | Avoid downloading dependencies

The fix is clear in principle: if we can skip downloading the dependencies on every build, the problem is solved. Had there been an option to mount the host's local Maven repository (~/.m2) while building the image, that alone would have resolved it.

We solved this by moving artifact generation out of the image build. The artifact is generated in a build container that has the host's local Maven repository mounted, so dependencies are downloaded only if they are not already present.

solution1-build:
	docker run -it -v ~/.m2/repository:/root/.m2/repository \
	 -w /usr/src/mymaven -v ${PWD}:/usr/src/mymaven --rm \
	 maven:3-jdk-8 mvn clean package -Dmaven.test.skip=true
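
One detail worth guarding against, offered as a sketch rather than as part of the original setup: when the host path given to -v does not exist, Docker creates it, typically owned by root. Creating the directory up front keeps ~/.m2 under your own user. The helper name ensure_m2_repo is hypothetical, and files written by the container may still be root-owned on Linux.

```shell
#!/bin/sh
# Hypothetical helper (not from the original repository): make sure the host
# Maven repository exists before it is bind-mounted into the build container.
ensure_m2_repo() {
    repo="${HOME}/.m2/repository"
    # mkdir -p is a no-op when the directory is already there
    mkdir -p "$repo" || return 1
    # Print the path so the caller can pass it to `docker run -v`
    printf '%s\n' "$repo"
}
```

It could be called right before the build container starts, for example `repo="$(ensure_m2_repo)"`, with "$repo" then used on the left-hand side of the -v option.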

Once the artifact is generated, the only thing left to do in the Dockerfile is copy it into the image.

FROM maven:3-jdk-8

LABEL maintainer="opensource@opstree.com"

WORKDIR /usr/src/app

ADD target/helloworld-0.0.3-SNAPSHOT.jar /usr/src/app/app.jar

ENTRYPOINT ["java","-jar","/usr/src/app/app.jar"]
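
Because this Dockerfile copies the jar by its exact file name, the image build fails if the Maven step has not produced that file. A small pre-flight check can make the failure obvious before docker build starts. This is only a sketch; the function name require_artifact is hypothetical and not part of the repository.

```shell
#!/bin/sh
# Hypothetical pre-flight check (names are illustrative): confirm the jar that
# the Dockerfile expects exists before starting `docker build`.
require_artifact() {
    jar="$1"
    if [ ! -f "$jar" ]; then
        # Report on stderr and signal failure so the caller can stop early
        echo "missing artifact: $jar (run the Maven build first)" >&2
        return 1
    fi
    echo "found artifact: $jar"
}
```

A possible use is `require_artifact target/helloworld-0.0.3-SNAPSHOT.jar && docker build -t opstree/fasten-build -f Dockerfile.solution1 .`, so the image build only runs when the jar is present.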

If we now build the image with this approach, the build time drops drastically, to under 20 seconds.

solution1-build:
	docker run -it -v ~/.m2/repository:/root/.m2/repository \
	 -w /usr/src/mymaven -v ${PWD}:/usr/src/mymaven --rm \
	 maven:3-jdk-8 mvn clean package -Dmaven.test.skip=true

solution1-package:
	docker build -t opstree/fasten-build -f Dockerfile.solution1 .

solution1-build-package:
	make solution1-build
	make solution1-package

solution1-build-package-with-time:
	time make solution1-build-package >/dev/null 2>&1

$ make solution1-build-package-with-time
time make solution1-build-package >/dev/null 2>&1

real	0m15.212s
user	0m0.288s
sys	0m0.286s

Solution 2 | Leverage Docker layer caching

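One very useful feature of Docker is layer caching: a layer is rebuilt only when its instruction or the files it depends on have changed; otherwise the cached layer is reused. Once one layer has to be rebuilt, every layer after it is rebuilt as well.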

FROM maven:3-jdk-8

LABEL maintainer="opensource@opstree.com"

WORKDIR /usr/src/app

ADD pom.xml /usr/src/app
RUN mvn dependency:resolve -Dmaven.test.skip=true

ADD . /usr/src/app
RUN mvn clean install -Dmaven.test.skip=true

ENTRYPOINT ["java","-jar","/usr/src/app/target/helloworld-0.0.3-SNAPSHOT.jar"]

Notice that lines 7 and 8 (ADD pom.xml and RUN mvn dependency:resolve) are new compared to the original Dockerfile. When the image is built from this Dockerfile, the layers for these two lines are rebuilt only if pom.xml has changed; otherwise the cached layers are reused. Hence no time is wasted downloading the compile-time dependencies (a Maven concept) again.
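
The caching rule can be imitated outside Docker as a rough illustration. For ADD and COPY, Docker compares a checksum of the copied files with the one recorded for the cached layer. The sketch below uses cksum as a stand-in for that checksum (Docker's actual algorithm is different), and the function names are hypothetical.

```shell
#!/bin/sh
# Illustration only: cksum stands in for the content checksum Docker keeps for
# files copied by ADD/COPY. Docker's real implementation is different.
pom_fingerprint() {
    # Read via stdin so the output carries no file name: "<crc> <byte count>"
    cksum < "${1:-pom.xml}"
}

# Returns 0 (true) when the dependency layer could be reused, i.e. the
# fingerprint recorded at the previous build still matches the current pom.xml.
dependency_layer_cached() {
    previous="$1"
    [ "$previous" = "$(pom_fingerprint "$2")" ]
}
```

Editing application sources leaves the fingerprint of pom.xml untouched, which is why the dependency layer survives ordinary code changes and is invalidated only by an edit to pom.xml itself.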

solution2-build-package:
	time docker build -t opstree/fasten-build \
	 -f Dockerfile.solution2 .

solution2-build-package-with-time:
	time make solution2-build-package >/dev/null 2>&1

$ make solution2-build-package-with-time
time make solution2-build-package >/dev/null 2>&1

real	1m32.215s
user	0m0.788s
sys	0m0.787s

Solution 3 | Best of both worlds

Solution 2 works pretty well. Its only problem is that even a single-line change in pom.xml causes the whole dependencies layer to be rebuilt.

FROM maven:3-jdk-8

LABEL maintainer="opensource@opstree.com"

WORKDIR /usr/src/app

ADD . /usr/src/app
RUN mvn clean package -Dmaven.test.skip=true

Solution 3 combines Solution 1 and Solution 2: the dependency download is moved out into an intermediate builder image, built from the Dockerfile shown above.

The final image build uses this builder image as its base image, so it already has all the dependencies inside it. Even if a new dependency is added to pom.xml, the final image build downloads only that new dependency, as the rest are already present in the builder base image.

FROM opstree/fasten-build-builder

ADD . /usr/src/app

RUN mvn clean package -Dmaven.test.skip=true

ENTRYPOINT ["java","-jar","/usr/src/app/target/helloworld-0.0.3-SNAPSHOT.jar"]

The complete image build is run as shown below. Note that you don’t have to rebuild the builder image frequently; it can be a scheduled operation, nightly or weekly, depending on how often the dependencies in your pom.xml change.

solution3-build-builder:
	docker build -t opstree/fasten-build-builder \
	 -f Dockerfile.solution3.builder .

solution3-build-package:
	docker build -t opstree/fasten-build \
	-f Dockerfile.solution3 .

solution3-build-package-with-time:
	time make solution3-build-package >/dev/null 2>&1

$ make solution3-build-package-with-time
time make solution3-build-package >/dev/null 2>&1

real	0m4.800s
user	0m0.224s
sys	0m0.200s
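
For the scheduled builder refresh, one option is to skip the rebuild when pom.xml has not changed since the last one. The sketch below assumes a persistent working copy (file modification times survive between runs) and uses a marker file, .builder.stamp, which is an assumption of this example and not part of the repository. The function names are hypothetical too.

```shell
#!/bin/sh
# Hypothetical wrapper for a scheduled job (cron, CI timer, ...): rebuild the
# builder image only when pom.xml changed since the last rebuild.
# STAMP is an assumed marker file, not something from the original repository.
STAMP="${STAMP:-.builder.stamp}"

builder_is_stale() {
    pom="${1:-pom.xml}"
    # No stamp yet means this script has never built the builder image
    [ -f "$STAMP" ] || return 0
    # find prints the path only if pom.xml was modified after the stamp
    [ -n "$(find "$pom" -newer "$STAMP")" ]
}

refresh_builder() {
    if builder_is_stale "$1"; then
        # Same command as the solution3-build-builder target; the stamp is
        # updated only when the build succeeds
        docker build -t opstree/fasten-build-builder \
            -f Dockerfile.solution3.builder . && touch "$STAMP"
    else
        echo "builder image is up to date"
    fi
}
```

Calling refresh_builder from the nightly or weekly job keeps the schedule described above while avoiding a rebuild on days when the dependencies did not move.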

Conclusion

In conclusion, you can go for Solution 1, 2, or 3 on a case-by-case basis. Solution 2 is apt when your pom.xml has stabilized and rarely changes.

Also, you may have noticed that the Dockerfiles are not written as per best practices, e.g. running as a non-root user and using multi-stage builds. I left these out intentionally and would like to cover them in the next blog.

More importantly, I would like to stress continuous improvement: it’s fine to start with a workable solution, but you should then be in constant pursuit of taking it to the next level.

Feel free to share any feedback or recommendations. I would love to hear your thoughts, and if you have another smart solution, that would be great. Till we meet next time, happy learning.

Gif reference: https://giphy.com/gifs/the-flash-Z2pZfL0YPC9sk

 
