Docker Image Instructions - sidaw/codalab-worksheets GitHub Wiki

Bundles are run inside Docker containers. Containers provide an isolated Linux environment for your code containing various libraries and software packages. CodaLab uses an Ubuntu Linux 14.04 image by default. You can specify which image to use when you are creating a run bundle using the --request-docker-image <image> flag. If the default image doesn't have the package you need, your options are:

  1. Find an image that someone else has built. Package maintainers often release Docker images containing their packages.

  2. Build your own image and upload it to Docker Hub. Instructions are given below.
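Once you know which image you want, request it when starting the run. A minimal sketch, assuming the image name humblepeople/python:1.0 used in the examples below:

```shell
# Request a specific Docker image for a run bundle.
# The image name is just an example -- substitute the image you want.
cl run 'python --version' --request-docker-image humblepeople/python:1.0
```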

Building your own images

Detailed instructions for building images are available on the Docker website here. In the spirit of reproducibility, we recommend building images from a Dockerfile so that the way the image is built is documented. The steps are as follows:

  1. Download and install Docker.

  2. Create a directory for your image and cd into it. Then start editing a file named Dockerfile:

    mkdir myimage
    cd myimage
    vim Dockerfile
    
  3. Your image will contain everything from a base image, which you then add to by running Linux commands. Good images to start from include ubuntu:14.04, codalab/ubuntu:1.9 and nvidia/cuda:7.5-cudnn4-devel (which sets up NVIDIA CUDA in a way compatible with CodaLab; more below). Specify the base image in the Dockerfile:

    FROM ubuntu:14.04
    
  4. Specify a maintainer, documenting who maintains the image:

    MAINTAINER My Humble Self <[email protected]>
    
  5. Add Linux commands to run that install the packages you need and do any other setup.

    RUN apt-get -y update
    RUN apt-get -y install python2.7
    
  6. (Optional) Set up environment variables and other configuration in a .bashrc file. Put this file in the image's working directory; it will be sourced when your container starts executing on CodaLab. Note that your base image may already have a working directory and a .bashrc file, which you should keep and add to. You can see what is inside your base image by running it interactively: docker run -it --rm codalab/ubuntu:1.9 /bin/bash

    RUN mkdir -m 777 /user
    RUN printf "export PYTHONPATH=src\n" > /user/.bashrc
    WORKDIR /user
    
  7. Create an account on Docker Hub where you will upload the image. Note your Docker Hub ID which you will use below (for our example, we use the ID humblepeople).

  8. Finish editing the Dockerfile and build your image, specifying your Docker Hub ID, a name and a tag:

    docker build -t humblepeople/python:1.0 .
    
  9. (Optional) Verify your image looks good to you by running a few test commands. If it doesn't, update your Dockerfile and rerun the build command.

    docker run -it --rm humblepeople/python:1.0 /bin/bash
    root@fb586e56ac91:/user# python2.7
    >>> exit()
    root@fb586e56ac91:/user# exit
    
  10. Upload your image to Docker Hub:

    docker push humblepeople/python:1.0
    
  11. Use your image on CodaLab by specifying the --request-docker-image humblepeople/python:1.0 flag with the cl run command.

  12. Make your Dockerfile available when you share your worksheets. Either upload it to your worksheet, add a link to it from the worksheet, or set up automated builds.
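Putting the steps above together, the complete Dockerfile for the example image looks like this (the maintainer line and the PYTHONPATH value are placeholders to adapt):

```dockerfile
# Start from the default CodaLab base image
FROM ubuntu:14.04
MAINTAINER My Humble Self <[email protected]>

# Install the packages you need
RUN apt-get -y update
RUN apt-get -y install python2.7

# Working directory and .bashrc, sourced when the container starts on CodaLab
RUN mkdir -m 777 /user
RUN printf "export PYTHONPATH=src\n" > /user/.bashrc
WORKDIR /user
```

Then build and push as above: docker build -t humblepeople/python:1.0 . followed by docker push humblepeople/python:1.0.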

Common use cases

Installing packages

It is straightforward to install other packages via the Dockerfile:

RUN pip install --upgrade numpy

One common use case is that dependencies are specified in a pip requirements file, requirements.txt, which we can add to the image with the ADD <src>... <dest> instruction:

ADD ./requirements.txt /user/requirements.txt
RUN pip install -r /user/requirements.txt

More references on Dockerfiles can be found here. pip install instructions for TensorFlow can be found here.
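As an aside, for plain local files Docker's best-practices guide recommends COPY over ADD, since ADD also unpacks archives and can fetch URLs. The snippet above could equivalently be written as:

```dockerfile
# COPY does the same job for plain files, without ADD's extra behaviors
COPY ./requirements.txt /user/requirements.txt
RUN pip install -r /user/requirements.txt
```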

Using GPUs with CUDA

worksheets.codalab.org doesn't have any machines with GPUs, but if you run your own worker, or use a worker that someone else has already set up on a machine with GPUs, then keep reading.

CUDA consists of several components:

  1. The NVIDIA driver, a kernel module that exposes the /dev/nvidia device files.
  2. The CUDA driver, a shared library that works with the NVIDIA driver. Each version of the CUDA driver targets a specific version of the NVIDIA driver, so the two need to agree.
  3. The CUDA Toolkit, containing the libraries and tools for writing and building CUDA code.

CodaLab makes 1 and 2 available inside the Docker container, as long as they are installed on the machine running the worker. If the image already contains the CUDA driver, it will be overridden, since its version is unlikely to match the version of the NVIDIA driver on the host machine. To use the CUDA Toolkit you need to include it in your image. A good base image that contains the toolkit is nvidia/cuda:7.5-cudnn4-devel. The TensorFlow image gcr.io/tensorflow/tensorflow:latest-gpu is also set up correctly. Note that the Theano image available from the Theano website is not set up correctly at the time of writing, since it is missing the CUDA Toolkit.
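For example, a GPU image could start from the CUDA base image mentioned above and install your framework on top of it. The packages below are purely illustrative; substitute the ones your code needs:

```dockerfile
# This base image already contains the CUDA Toolkit and cuDNN;
# the NVIDIA and CUDA drivers are supplied by the worker at run time.
FROM nvidia/cuda:7.5-cudnn4-devel

RUN apt-get -y update
RUN apt-get -y install python-pip
# Example only: install whatever GPU-enabled libraries your code uses
RUN pip install theano
WORKDIR /user
```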
