Docker Best Practices - thuy-econsys/rails_app GitHub Wiki
leverage build cache
Each instruction in your Dockerfile is a build step whose result Docker can cache (instructions such as RUN, COPY, and ADD create image layers). If you order the Dockerfile from least frequently changed to most frequently changed, the earlier layers can be cached and reused. Otherwise there is a domino effect: once one step changes, every step after it must be rebuilt from scratch, even if nothing in those later steps has changed.
Think of the layers as an order of operations. If an expensive layer that rarely changes is placed after a layer that changes frequently, the expensive layer has to be rebuilt every time the layer before it changes. Consider the following scenario:
# Dockerfile
# ...
# copy Gemfiles to container project folder, cached layer
COPY Gemfile Gemfile.lock /project/
# install Gems, cached layer
RUN bundle install
# copy the rest of the source code for your application to container project folder
COPY . /project
Frequent changes to the source code are to be expected during development. If the step that copies the source code into the container is moved to just before RUN bundle install, then every time the source code changes, the bundle install layer re-runs the potentially long process of installing all the Gem dependencies. When bundle install runs before COPY . /project, its cached layer is reused instead. The same applies to npm install, which re-installs all the package.json dependencies from scratch after every source code change if it is placed after the COPY . /project instruction.
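As a rough sketch of the npm install case, the same ordering might look like the following. The node:20 base image, the WORKDIR, and the presence of a package-lock.json are assumptions made for illustration; only the /project path and the COPY . /project step come from the scenario above.

```dockerfile
# Dockerfile (illustrative sketch, not the project's actual file)
# base image tag is an assumption
FROM node:20
# assumed working directory, matching the /project path used above
WORKDIR /project
# copy only the dependency manifests first, cached until they change
# (package-lock.json is assumed to exist)
COPY package.json package-lock.json /project/
# install packages, cached while the manifests are unchanged
RUN npm install
# copy the rest of the source code last, since it changes most often
COPY . /project
```

With this order, editing application code invalidates only the final COPY layer; npm install re-runs only when package.json or package-lock.json changes.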
chain commands in RUN
Extraneous RUN commands cause unnecessary overhead and possible issues with dependency installations due to caching.
In particular,
Always combine RUN apt-get update with apt-get install in the same RUN statement.
Running them separately can cause subsequent apt-get install calls to fail.
- Best practices for writing Dockerfiles - Run | Docker Documentation
- Multiple RUN vs. single chained RUN in Dockerfile, what is better?
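The apt-get advice quoted above could be sketched roughly as follows. The ubuntu:22.04 base image and the curl and git packages are placeholders chosen for illustration, not something the project is known to use.

```dockerfile
# Dockerfile (illustrative sketch; base image and packages are assumptions)
FROM ubuntu:22.04

# Avoid: two separate steps. The apt-get update layer can be served from
# cache, so a later apt-get install may run against a stale package index.
# RUN apt-get update
# RUN apt-get install -y curl git

# Prefer: one chained step, so the update and the install always run together
RUN apt-get update && apt-get install -y \
    curl \
    git
```

Because the update and the install share a single RUN instruction, changing the package list invalidates the cache for the whole step and forces a fresh apt-get update along with it.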