Testing models - til-ai/til-25 GitHub Wiki

After building a Docker image for your model, you can run it locally to test your code and model.


Prerequisites

You should already have built your Docker image, following the instructions in Building Docker images.

Running your Docker image

Overview: Use Docker to start a container and run your image.

An image is just a collection of files. To make it run, we need to tell Docker to start it up in a container.

If you need a refresher on your existing images, run docker image ls and find the row corresponding to the image you want to test. Note the image name, which appears under REPOSITORY.

REPOSITORY       TAG       IMAGE ID        CREATED         SIZE
my-team-ocr      latest    ac2e140d85b4    1 minute ago    188MB

Then, tell Docker to create a container from your image and run it.

docker run -p HOST_PORT:CONTAINER_PORT --gpus all -d IMAGE_NAME:TAG

# Example:
docker run -p 5001:5001 --gpus all -d my-team-asr:latest

Let's break this down. The -p flag maps a port on your host machine to the port inside the container on which it listens for requests. --gpus all is an optional flag that gives the container access to your GPUs; not every image needs it. The optional -d flag runs the container in detached mode (i.e. in the background). Finally, my-team-asr:latest tells Docker which image and tag to run.

Here are the ports for each model, and our suggestion for whether to include the --gpus all flag. If your model doesn't need a GPU (as is the case for most RL models), your container starts faster and uses fewer resources without one.

| Model | -p        | --gpus all |
| ----- | --------- | ---------- |
| ASR   | 5001:5001 | include    |
| CV    | 5002:5002 | include    |
| OCR   | 5003:5003 | include    |
| RL    | 5004:5004 | omit       |

After you run your image, you can run docker ps to see a list of running containers. You can use docker ps -a to see a list of all containers, running or not.
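Rather than eyeballing docker ps, you can also check programmatically that the container has come up and is accepting connections on its mapped port. Below is a minimal sketch; the wait_for_port helper and its 30-second default are our own illustration, not part of the template repo.

```python
import socket
import time

def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until something is listening on host:port, or give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful TCP connect means the container's server is listening.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)  # not up yet; retry shortly
    return False

# e.g. after `docker run -p 5001:5001 ...`:
#   wait_for_port("127.0.0.1", 5001)
```

This is handy in scripts that start the container and then immediately run the test suite, since model servers can take a while to load weights before they begin listening.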

> [!NOTE]
> Containers and images are not the same. Images are the snapshots of code, dependencies, and other files that you build from your source code. Containers can be thought of as running instances of images. Don't confuse the two. Learn more.

Running your container offline

To check that your container can run fully offline, run it with no network connection at all. Note, however, that you won't be able to reach the container on any otherwise-exposed ports.

# Test fully offline
docker run --network none --gpus all IMAGE_NAME:TAG

# Example:
docker run --network none --gpus all my-team-asr:latest

If your code runs successfully offline, you should see the uvicorn server start up in the container's output. After that, docker kill your container and re-run it with the right port exposed for testing.
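Once the container is re-run with its port exposed, a quick HTTP probe confirms the uvicorn server is actually answering. The sketch below assumes nothing about your routes: any HTTP response, even a 404, counts as the server being up. The server_responds helper is our own illustration, not part of the template.

```python
import urllib.error
import urllib.request

def server_responds(url: str, timeout: float = 5.0) -> bool:
    """Return True if an HTTP server answers at `url`, regardless of status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 100 <= resp.status < 600
    except urllib.error.HTTPError:
        return True   # got an HTTP error response -- the server is still up
    except (urllib.error.URLError, OSError):
        return False  # connection refused, DNS failure, timeout, etc.

# e.g. for the ASR container:
#   server_responds("http://127.0.0.1:5001/")
```

If this returns False while the container is running, check docker ps to confirm the port mapping, and the container's logs for startup errors.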

Testing your model

Overview: Use the provided testing scripts to test and score your model.

The test/ directory in the template repo contains Python files that test and score your models locally. To start, simply run the corresponding Python file:

python test/test_asr.py

This tests your model's performance using your local testing dataset, which may take a while. If there are any errors, you'll see them here too. When testing completes, your model's performance is reported as a score. This score gives you a good idea of how your model might perform on the hidden evaluation dataset after you submit it.

> [!IMPORTANT]
> Testing your model locally is only for you to see whether your code works and how your model performs. It doesn't upload your model or record your score on the leaderboard. You must submit your model for the score to count.

When you're done, shut down (kill) your container. First, use docker ps to list the running Docker containers, and find the ID of the one you want to shut down. Then, tell Docker to shut down the container.

docker kill CONTAINER_ID
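If you've started several containers from the same image, you can script the cleanup instead of killing them one by one. The helper below is our own sketch, not part of the template; the run parameter is injectable purely so the logic can be exercised without Docker installed.

```python
import subprocess

def kill_containers(image: str, run=subprocess.run) -> list[str]:
    """Kill every running container started from `image`; return the IDs killed."""
    # `docker ps -q --filter ancestor=IMAGE` lists IDs of running containers
    # whose image matches.
    result = run(
        ["docker", "ps", "-q", "--filter", f"ancestor={image}"],
        capture_output=True, text=True, check=True,
    )
    container_ids = result.stdout.split()
    for cid in container_ids:
        run(["docker", "kill", cid], capture_output=True, text=True, check=True)
    return container_ids

# e.g.:
#   kill_containers("my-team-asr:latest")
```

The same thing can be done as a one-liner in the shell with docker ps -q --filter and xargs, if you prefer.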

If you're happy with the results, you can submit your model.

Further reading