Dockerizing a Maven Java Application - OpenData-tu/documentation GitHub Wiki

Authorship

| Version | Date | Modified by | Summary of changes |
|---------|------|-------------|--------------------|
| 0.1 | 2017-07-04 | Andres Ardila | First draft |

TL;DR

The Dockerfile should contain the following sections:

FROM openjdk:8-jre-alpine

ADD "target/dataimporter-jar-with-dependencies.jar" "app/dataimporter-jar-with-dependencies.jar"

ENTRYPOINT ["java", "-jar", "app/dataimporter-jar-with-dependencies.jar"]

Note that the image is based on the Alpine flavor to keep it small: the non-Alpine openjdk:8-jre image is 310 MB (!), as opposed to 81.4 MB for the Alpine flavor.

The pom.xml should include a fat JAR goal, for example:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>edu.tu_berlin.ise.opendata</groupId>
    <artifactId>dataimporter</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <java.version>1.8</java.version>
    </properties>

    <dependencies>
        <!-- project dependencies -->
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.6.1</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                    <encoding>${project.build.sourceEncoding}</encoding>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-assembly-plugin</artifactId>
                <executions>
                    <execution>
                        <goals>
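                            <!-- "single" replaces the deprecated "attached" goal, which newer plugin versions no longer provide -->
                            <goal>single</goal>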
                        </goals>
                        <phase>package</phase>
                        <configuration>
                            <finalName>dataimporter</finalName>
                            <descriptorRefs>
                                <descriptorRef>jar-with-dependencies</descriptorRef>
                            </descriptorRefs>
                            <archive>
                                <manifest>
                                    <mainClass>edu.tu_berlin.ise.opendata.dataimporter.DailyImporter</mainClass>
                                </manifest>
                            </archive>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
</project>

Passing Arguments at Runtime

Main method args

To pass arguments to main(String[] args), simply pass them when running the Docker container:

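$ docker run user/daily-data-importer:latest "2017-07-04"

This will result in args[0] having the value "2017-07-04". Everything after the image name is appended to the ENTRYPOINT command, so no leading dashes are needed.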
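
A minimal sketch of how the application side might consume that argument, assuming the container is started with a single ISO-formatted date such as 2017-07-04. The class name `ArgsExample`, the helper `parseImportDate`, and the fallback to today's date are hypothetical and not taken from the actual DailyImporter code:

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

// Hypothetical example class; the real entry point is configured as mainClass in pom.xml.
class ArgsExample {

    // Parses the first command-line argument as an ISO-8601 date (yyyy-MM-dd).
    // Returns the fallback when no argument was passed to the container.
    static LocalDate parseImportDate(String[] args, LocalDate fallback) {
        if (args.length == 0) {
            return fallback;
        }
        // LocalDate.parse throws DateTimeParseException on malformed input
        return LocalDate.parse(args[0]);
    }

    public static void main(String[] args) {
        try {
            // Assumption: default to today's date when no argument is given
            LocalDate date = parseImportDate(args, LocalDate.now());
            System.out.println("Importing data for " + date);
        } catch (DateTimeParseException e) {
            System.err.println("Expected a date such as 2017-07-04, got: " + args[0]);
            System.exit(1);
        }
    }
}
```

Failing fast with a non-zero exit code makes a malformed argument visible in the container's exit status instead of silently importing the wrong day.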

Environment Variables

Pass them when running the Docker container as follows:

docker run --env VAR1=value1 --env VAR2=value2 user/image:version

To read the values of the variables in Java, use System.getenv().
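
As a rough illustration, the variables from the command above could be read as follows. `VAR1` and `VAR2` come from the example; the class `EnvExample`, the helper `envOrDefault`, and the default values are assumptions made for this sketch:

```java
import java.util.Map;

// Hypothetical example class showing one way to read container environment variables.
class EnvExample {

    // Returns the value of the named variable, or the fallback when it is unset or empty.
    static String envOrDefault(Map<String, String> env, String name, String fallback) {
        String value = env.get(name);
        return (value == null || value.isEmpty()) ? fallback : value;
    }

    public static void main(String[] args) {
        // System.getenv() returns an unmodifiable map of all environment variables
        Map<String, String> env = System.getenv();
        String var1 = envOrDefault(env, "VAR1", "default1"); // default is an assumption
        String var2 = envOrDefault(env, "VAR2", "default2"); // default is an assumption
        System.out.println("VAR1=" + var1 + ", VAR2=" + var2);
    }
}
```

Passing the environment in as a map, rather than calling System.getenv(name) inside the helper, keeps the lookup logic testable without having to set real environment variables.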

Background

Most online tutorials on dockerizing Maven-based Java applications perform the dependency resolution and the build inside the Docker image, usually with a Dockerfile along the lines of:

FROM java:8

# Install maven
RUN apt-get update
RUN apt-get install -y maven

WORKDIR /code

# Prepare by downloading dependencies
ADD pom.xml /code/pom.xml
RUN ["mvn", "dependency:resolve"]
RUN ["mvn", "verify"]

# Adding source, compile and package into a fat jar
# This assumes you've configured such a goal in pom.xml
ADD src /code/src
RUN ["mvn", "package"]

EXPOSE 4567
CMD ["java", "-jar", "target/blume-jar-with-dependencies.jar"]

In other words, this approach requires: 1) copying all source code into the Docker image, 2) installing Maven in the image, 3) basing the image on a JDK image, 4) downloading and verifying dependencies, and 5) generating a JAR with mvn package.

Building the Docker image this way takes significantly longer than desirable, since downloading and verifying the dependencies takes a while, even when the project has only a handful of them. In addition, the resulting image carries excess baggage that it never needs at runtime: the source code, the build tools (Maven), the JDK, and the separately downloaded dependencies (redundant once a fat JAR has been generated).

Objectives

We want images which are:

  1. fast to build, and
  2. as slim as possible.

Walkthrough

  1. Make sure the pom.xml file has the jar-with-dependencies assembly configured
  2. Modify the Dockerfile (making sure to use the correct filename for the resulting JAR)
  3. Run the Maven build (mvn package or in the IDE)
  4. Confirm that the fat JAR was generated in ./target
  5. Build the Docker image
  6. Tag it with your Docker Hub username in preparation for pushing it to the Hub (docker tag image user/image:tag)
  7. Push it with docker push user/image:tag

Rationale

Docker attempts to solve the "it works on my machine" syndrome. It would therefore make sense to move the dependency resolution and the build off the developer's machine and into an independent environment. While that goal is correct, moving the build into the Docker image is hardly the right way to achieve it. Ideally, an independent process would build the Java application and create a lean Docker image based only on the JRE and a JAR. This is normally taken care of by a CI pipeline, but since we don't have such a pipeline at our disposal, we build the JAR locally.

The biggest risk in building locally is the large variation in development environments and their configurations, which could result in the same source code being built into different binaries. Using Maven (or another build tool that resolves dependencies from Maven repositories, such as Gradle) greatly reduces this risk by centralizing the binary dependency repositories and providing a standard descriptor for the dependencies and the build itself (namely the pom.xml file).

Another big risk in having developers generate binaries is that we can no longer be sure that they represent the source code at all, and not some other unknown (and therefore potentially malicious) code. However, our binaries are at a very early development stage and are all written by members of the same team. Obviously the problem doesn't go away entirely; we are merely shifting it away from the infrastructure deployment stage. By reasoning about and understanding the consequences of not building in Docker, we make an informed decision to accept the risks of a temporary solution. The problem should really be solved by introducing a robust CI pipeline.

In numbers

To illustrate the effect, the BLUME importer Maven-based image took around 1.5 minutes to build; this version takes around 20 seconds. The former image was based on openjdk:8-alpine and used 136 MB; the same image built using the method described above with openjdk:8-jre-alpine was 85 MB.
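
Expressed as relative savings, these figures work out to a build that is roughly 78% faster and an image that is 37.5% smaller. The short sketch below reproduces the arithmetic; it treats the approximate build times as exactly 90 and 20 seconds, which is an assumption, and the class and method names are hypothetical:

```java
// Hypothetical helper that reproduces the savings arithmetic from the figures quoted above.
class SavingsExample {

    // Relative reduction from `before` to `after`, expressed as a percentage.
    static double reductionPercent(double before, double after) {
        return (before - after) / before * 100.0;
    }

    public static void main(String[] args) {
        // Build time: ~1.5 minutes (taken as 90 s) down to ~20 s
        System.out.printf("Build time reduction: %.1f%%%n", reductionPercent(90, 20));
        // Image size: 136 MB down to 85 MB
        System.out.printf("Image size reduction: %.1f%%%n", reductionPercent(136, 85));
    }
}
```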
