Docker Best Practices - thuy-econsys/rails_app GitHub Wiki

leverage build cache

Each instruction in your Dockerfile is a separate build step that creates a layer (and, during the build, an intermediate container). If you order the instructions from least frequently changed to most frequently changed, the earlier layers can be cached and reused. Otherwise a change sets off a domino effect: every layer after the changed one must be rebuilt from scratch, even if nothing in those later steps has changed.

Think of the layers as an order of operations. If an expensive layer that rarely changes is placed after a layer that changes frequently, the expensive layer has to be rebuilt every time the earlier layer changes. Consider the following scenario:

# Dockerfile
# ...

# copy Gemfiles to container project folder, cached layer
COPY Gemfile Gemfile.lock /project/

# install Gems, cached layer 
RUN bundle install

# copy the rest of the source code for your application to container project folder
COPY . /project

Frequent changes to the source code are to be expected during development. If COPY . /project were moved before RUN bundle install, every change to the source code would invalidate the bundle install layer and re-run the potentially long process of installing all the Gem dependencies. With bundle install placed before COPY . /project, as above, that layer is reused from the cache as long as Gemfile and Gemfile.lock are unchanged. The same applies to npm install: if it is placed after COPY . /project, all the package.json dependencies are re-installed from scratch every time the source code changes.
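The npm install case can be sketched the same way. This is only a minimal illustration: it assumes the application keeps package.json and package-lock.json at its root and uses /project as the working directory, mirroring the layout above. Adjust the file names and paths to your project.

```dockerfile
# Dockerfile (sketch; WORKDIR and the lock file name are assumptions)
# ...

# run the following instructions from the project folder
WORKDIR /project

# copy only the dependency manifests first, cached until they change
COPY package.json package-lock.json /project/

# install packages, cached as long as the manifests above are unchanged
RUN npm install

# copy the rest of the source code; changes here do not invalidate the npm install layer
COPY . /project
```

Because the manifests are copied on their own, editing application code only invalidates the final COPY and anything after it.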

chain commands in RUN

Extraneous RUN instructions add unnecessary overhead, since each one creates its own layer, and can cause problems with dependency installation because of caching.

In particular,

Always combine RUN apt-get update with apt-get install in the same RUN statement.

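Running them separately can cause subsequent apt-get install calls to fail: the apt-get update layer is reused from the cache, so a later change to the list of packages is installed against a stale package index.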
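A minimal sketch of the combined form follows. The package names are placeholders chosen for illustration, not taken from this project; substitute whatever your image actually needs.

```dockerfile
# Dockerfile (sketch; the package names are placeholders)
# ...

# update the package index and install in a single layer,
# so the install never runs against a stale cached index
RUN apt-get update && apt-get install -y \
    build-essential \
    libpq-dev
```

Adding or removing a package changes the text of this one RUN instruction, which invalidates its cached layer, so apt-get update runs again together with the install.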

References