# Building Docker images
Once your source code is ready for testing and submission, the next step is to build it into a Docker image.
You should already have set up your Vertex AI Workbench. You should also have code and models that are ready to be tested or submitted.
We'll use Docker to containerize your code, dependencies, and assets (like model weights). Your submission for each task will be built into separate Docker images, which can be run and submitted independently.
If you're not familiar with Docker, here's a quick overview.
Docker is a software tool that helps developers build, run and manage containers. Check out the Docker unit of the TIL-AI curriculum to learn more.
Containers are lightweight, portable environments that run the same way everywhere, whether in your Vertex AI Workbench, on your laptop, or in the cloud. The foundation of every container is an image, which is a blueprint for what the container should contain and how it should behave. You can think of containers as instances of images.
The general Docker workflow involves these steps:

- Write a Dockerfile, which describes how to build your image (e.g. by specifying which files to copy and which installation commands to run).
- Build the image, give it a tag (a name and version), and optionally push it to a container registry so others can access it.
  - Building an image is similar to creating a whole new computer with only what you need inside it.
- To execute your code, run a container from that image (see the sketch below).
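In command-line terms, the workflow looks roughly like this (a minimal sketch; `my-image` is a placeholder name):

```sh
# Build an image from the Dockerfile in the current directory and tag it
docker build -t my-image:latest .

# Run a container from that image, removing it when it exits
docker run --rm my-image:latest
```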
Dependencies are the third-party packages you import, such as `pytorch` or `numpy`. Because Docker builds an entirely isolated image containing everything your code needs to run, you need to tell Docker which packages to install.

Your dependencies should be listed in `requirements.txt`. This is the standard way most Python package managers, like `pip`, track dependencies. Add your dependencies to this file, one per line. You can optionally pin specific versions:
```
fastapi==0.115.12
uvicorn[standard]==0.34.2
```
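If you're not sure what your code depends on, one common (if blunt) way to populate this file is to snapshot your current environment. Note that this pins every package installed in the environment, not just the ones you actually import:

```sh
# Write every package in the active environment to requirements.txt
pip freeze > requirements.txt
```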
Remember that your submission for each task is built into a separate image. Hence, there's a separate `Dockerfile` and `requirements.txt` in each directory. If a dependency needs to be used in more than one image, you need to add it to every corresponding `requirements.txt`.

Read more about the `requirements.txt` file.
The `Dockerfile` provides Docker with step-by-step instructions for building your image.

The `til-ai/til-25` template repository already provides a `Dockerfile` for each model. It uses a common base image for ML tasks, configures your environment, installs your dependencies, copies your `src/` directory, and starts your model server.
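As a rough illustration only (the template's actual `Dockerfile` will differ; the base image, port, and module path below are placeholders), such a file might look like:

```dockerfile
# Hypothetical sketch - see the template repository for the real Dockerfile
FROM python:3.12-slim

WORKDIR /workspace

# Install dependencies first so this layer is cached across code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy your source code and start the model server
COPY src/ src/
CMD ["uvicorn", "src.ocr_server:app", "--host", "0.0.0.0", "--port", "5002"]
```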
Here are some things you might wish to change:

- If the default base image doesn't work for you, change it by editing the `FROM` step.
- If you need to copy directories besides `src/`, add another `COPY` step (see the sketch after this list).
  - You should store model weights adjacent to `src/`, not inside it. We recommend `asr/models/` (which is already in `.gitignore`).
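For example, if you store weights in a `models/` folder next to `src/`, the extra step might look like this (hypothetical paths):

```dockerfile
# Copy model weights stored adjacent to src/ into the image
COPY models/ models/
```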
Google Cloud provides some guidelines for how to get GPU-enabled custom containers working:
- Instructions on getting GPUs working in custom containers.
- Pre-built base images with CUDA packages and common ML packages pre-installed.
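For instance, swapping the `FROM` step for a CUDA-enabled base image might look like this (the exact image and tag are placeholders; check the links above for supported images):

```dockerfile
# Hypothetical: a CUDA runtime base image instead of plain Python
FROM nvidia/cuda:12.1.1-runtime-ubuntu22.04
```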
Your Docker container won't have access to the Internet when it's being evaluated, so make sure to download your model weights into your `src/` folder before building the image.

For instance, the code below won't work, because the `.from_pretrained()` method attempts to download a pretrained model from the Hugging Face Hub, which your container can't access.
```python
from transformers import AutoTokenizer, AutoModel

# These calls try to fetch weights from the Hugging Face Hub at runtime
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModel.from_pretrained("distilbert-base-uncased")
```
To fix this particular example, you would download the weights to your `src/` directory before building your container. One way to do this is by running the following code once, which downloads the model and saves its files under the `src/models/` directory:

```python
from transformers import AutoTokenizer, AutoModel

# Download once from the Hub, then write the files to a local folder
AutoTokenizer.from_pretrained("distilbert-base-uncased").save_pretrained("src/models/distilbert-base-uncased")
AutoModel.from_pretrained("distilbert-base-uncased").save_pretrained("src/models/distilbert-base-uncased")
```
Then change your source code to reference the locally saved model instead of downloading it from Hugging Face:

```python
tokenizer = AutoTokenizer.from_pretrained("src/models/distilbert-base-uncased")
model = AutoModel.from_pretrained("src/models/distilbert-base-uncased")
```
You can save your model weights in whichever folder you'd like. However, we recommend naming the final folder `models/`, because we've added this path to the template's `.gitignore`, which tells Git not to track it. (You shouldn't commit large files like model weights to Git.)
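Once you've built your image (see the next section), one way to verify that nothing still tries to reach the Internet at runtime is to run the container with networking disabled (the image name here is a placeholder):

```sh
# --network none disables all networking, mimicking the evaluation environment
docker run --rm --network none myteam-ocr:latest
```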
You're now ready to build your Docker image!
First, open a new terminal in your Vertex AI Workbench, then `cd` into the directory you want to build, such as `asr`, `ocr`, etc. Then, build the image using Docker with an image name and optional tag. Your image name should follow the format `TEAM_ID-CHALLENGE`.

For instance, if your team ID is `myteam` and you want to build an image for the OCR challenge:
```sh
# Navigate to the build directory
cd /home/jupyter/til-25/ocr

# Build your image and tag it as `latest`.
# Don't forget the period at the end - it tells Docker to build in the current folder.
docker build -t myteam-ocr:latest .

# You can also build without a tag.
docker build -t myteam-ocr .
```
> [!IMPORTANT]
> Your image name must end with `-asr`, `-cv`, `-ocr`, or `-rl`. Otherwise, the model type can't be inferred, and you'll get an error after submitting your model.
> [!TIP]
> Although tags are optional, they're a great way to keep track of your submissions. You can tag your models with a word, semantic version, or commit SHA. Dex, your friendly neighborhood Discord bot, will include your image's tag in your team notifications.
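For example, one way to tag an image with the short commit SHA of your current checkout (a sketch, assuming you build from inside a Git repository):

```sh
# Tag the image with the short SHA of the current Git commit
docker build -t myteam-ocr:$(git rev-parse --short HEAD) .
```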
Now, if you run `docker image ls`, you should see your image appear:

```
REPOSITORY    TAG      IMAGE ID       CREATED        SIZE
myteam-ocr    latest   ba6519ee9de1   1 minute ago   176MB
```
The next step is to test your image.
Some power users may prefer to train and build models on their local machine rather than Vertex AI Workbench. This section provides additional info for advanced usage.
The TIL-AI evaluator, to which you'll submit your models for the Qualifiers, runs on an x86-64 system. If your local machine uses a different architecture (e.g. Apple Silicon Macs, which run on ARM), you need to pass `--platform linux/amd64` to the `docker build` command. You may also consider using `docker buildx`, which lets you build for multiple platforms in a single command (e.g. one image for submission and one for local testing).
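For instance (a sketch; the image name is a placeholder, and multi-platform results typically need to be pushed to a registry rather than loaded locally):

```sh
# Build an x86-64 image on an ARM machine (e.g. Apple Silicon)
docker build --platform linux/amd64 -t myteam-ocr:latest .

# Or, with buildx, build for both architectures in one command
docker buildx build --platform linux/amd64,linux/arm64 -t myteam-ocr:latest .
```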
If you choose to build your images locally, you need to take extra care while submitting them for evaluation. There's a separate section for power users in Submitting models.
- TIL-AI Curriculum unit on Docker: https://drive.google.com/drive/folders/1C-BP9-9_yfsV0AHtiIz-LM76FHoBTJh9
- Docker overview from Docker: https://docs.docker.com/get-started/docker-overview/
- `requirements.txt` explained: https://www.freecodecamp.org/news/python-requirementstxt-explained/
- Dockerfile overview: https://docs.docker.com/build/concepts/dockerfile/
- Building Docker images: https://docs.docker.com/get-started/docker-concepts/building-images/
- Docker multi-platform builds: https://docs.docker.com/build/building/multi-platform/