Introduction to Docker - clizarraga-UAD7/Workshops GitHub Wiki
Docker is a platform-as-a-service (PaaS) product that uses OS-level virtualization to deliver software in packages called containers.
In other words, Docker is a platform for containerizing software. With it, you can build an application and package it, together with all of its dependencies, into a container. These containers can then be easily shipped and run on other machines.
Representation of Docker Architecture
The Docker software as a service consists of three components:
Software: The Docker Engine includes:
- The Docker daemon, called `dockerd`, which is a process that manages Docker containers and handles container objects. The daemon listens for requests sent via the Docker Engine API.
- The Docker client program, called `docker`, which provides a command-line interface (CLI) that allows users to interact with Docker daemons.
Note: Docker Desktop is an application for macOS, Windows, and Linux that bundles the Docker daemon, the Docker client (CLI), and other tools to run locally on your machine.
Objects:
Docker objects are various entities used to assemble an application in Docker. Objects are of three classes:
- A Docker container is a standardized, encapsulated environment that runs applications, and is managed using the Docker API or CLI.
- A Docker image is a read-only template used to build containers. Images are used to store and ship applications.
- A Docker service allows containers to be scaled across multiple Docker daemons, resulting in what is known as a Docker swarm, a set of cooperating daemons that communicate through the Docker API.
An important distinction is between base and child images.
- A base image is an image that has no parent image, usually one with an OS installed (`busybox`, `alpine`, `ubuntu`, `centos`, `amazonlinux`, `debian`, etc.).
- A child image is built on a base image and adds extra functionality.
We can also distinguish between official and user images.
- Official images are maintained and supported by the staff at Docker.
- User images are images built on base images with extra functionality, created and shared by general users. These images can be identified by the name format `user/image-name`. Users can be certified users or general users.
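The naming convention above can be checked mechanically. The following is a minimal sketch, assuming plain Docker Hub names without a registry host prefix; the function name `image_kind` is hypothetical and only for illustration.

```shell
# image_kind REFERENCE
# Classifies an image reference as "official" or "user" by its name format.
# Simplification: assumes Docker Hub names with no registry host in front.
image_kind() {
  ref="${1%%:*}"               # drop an optional :tag suffix
  case "$ref" in
    */*) echo "user" ;;        # user/image-name, e.g. rocker/rstudio
    *)   echo "official" ;;    # bare name, e.g. alpine
  esac
}

image_kind alpine:latest       # prints: official
image_kind rocker/rstudio      # prints: user
```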
Registries: A Docker registry is a repository for Docker images.
- Docker clients connect to registries to download ("pull") images for use or upload ("push") images that they have built.
- Container registries can be public or private. Two main public registries are Docker Hub and the GitLab Registry. Docker Hub is the default registry where Docker looks for images.
Docker Hub (https://hub.docker.com) is the official repository for images.
Some popular images are:
- Hello World. Used for testing your Docker Engine installation. (To download it, type: `docker pull hello-world`).
- Alpine. A minimal Linux image, less than 5 MB in size. (To download it, type: `docker pull alpine`).
- Ubuntu. An Ubuntu Linux distribution. (To download: `docker pull ubuntu`).
- rocker/rstudio. RStudio image. (To download: `docker pull rocker/rstudio`).
- jupyter/datascience-notebook. Jupyter Notebook Data Science Stack. (To download: `docker pull jupyter/datascience-notebook`).
- pangeo/pangeo-notebook. Pangeo big data geosciences. (To download: `docker pull pangeo/pangeo-notebook`).
First, you need to open a user account on Docker.com.
Next, to use Docker, you need either Docker Desktop installed on your machine or access to a GitHub Codespaces development environment in a GitHub organization.
Docker Desktop installs all the Docker tools for container development and deployment. It provides all the software needed to run containers on your local machine.
It is also good practice to have the VS Code editor installed on your computer as an integrated development environment; it lets you easily synchronize files with GitHub repositories. Please install it on your machine.
Besides developing code in VS Code, you can add extensions that integrate container development and deployment (Docker, Kubernetes, Google Cloud, Azure, and others) and data science work (Python, Jupyter Notebooks, PyTorch, Azure ML, and more).
We will assume that you have your environment ready to start working with Docker.
Open VS Code and start a Terminal.
You can do a simple test by running the hello-world Docker application.
docker run hello-world
Docker will download the latest `hello-world` image and run it as a container. You should get back a message like this:
Hello from Docker!
This message shows that your installation appears to be working correctly. ...
The rest of the message explains the steps that were involved in printing it to your Terminal.
Next, we can enter the command `docker ps -a`, and Docker will list all containers, including those that have already exited. From the returned information, note the `CONTAINER ID`. We can remove a stopped container by executing `docker rm CONTAINER_ID`. It is sufficient to enter the first 3 characters of the `CONTAINER_ID`, as long as they identify the container uniquely; we do not need to enter the full ID.
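To illustrate why the shortened ID has to be unique, here is a small sketch of prefix matching written in plain shell. It does not call Docker; the function name `resolve_prefix` and the second ID in the usage line are hypothetical.

```shell
# resolve_prefix PREFIX ID...
# Prints the single ID that starts with PREFIX.
# Fails when no ID or more than one ID matches, i.e. the prefix is not unique.
resolve_prefix() {
  prefix="$1"; shift
  match=""; count=0
  for id in "$@"; do
    case "$id" in
      "$prefix"*) match="$id"; count=$((count + 1)) ;;
    esac
  done
  if [ "$count" -eq 1 ]; then
    echo "$match"
  else
    echo "prefix '$prefix' matches $count containers" >&2
    return 1
  fi
}

resolve_prefix 43e 43ec85338263 4f1a9c0d7e21   # prints: 43ec85338263
```

With the same two IDs, the prefix `4` alone would be rejected because both IDs start with it.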
The most useful option of the `docker` command is `--help`: `docker --help`
Initial commands:
Command | Description |
---|---|
`docker --help` | Lists all Docker command options. |
`docker create IMAGE_NAME` | Creates a stopped container from the image, downloading it from Docker Hub first if it is not available locally. |
`docker run [Options] IMAGE_NAME` | Creates and runs a container from the image, downloading it from Docker Hub first if it is not available locally. |
`docker rename CONTAINER NEW_NAME` | Renames a container. |
`docker search TERM` | Searches Docker Hub for images. |
Container and Image manipulation:
Command | Description |
---|---|
`docker container --help` | Lists Docker container options. |
`docker container ls` | Lists the running containers. |
`docker ps` | Lists the running containers. |
`docker ps -a` | Lists all containers, running and stopped, with their status. |
`docker container rm CONTAINER_ID` or `docker rm CONTAINER_ID` | Removes a container by ID. |
`docker image --help` | Lists Docker image options. |
`docker image ls` | Lists the Docker images available locally. |
`docker image rm IMAGE_ID` or `docker rmi IMAGE_ID` | Removes a specific local Docker image. |
From the terminal you can list the running containers by typing `docker ps` or `docker container ls` (add `-a` to include stopped containers). These commands return information about each container (`CONTAINER ID`, `IMAGE`, `COMMAND`, `CREATED`, `STATUS`, `PORTS`, `NAMES`). The `CONTAINER ID` and `IMAGE` values are used as arguments for other docker commands. The `CONTAINER ID` and `NAMES` values change on every run, unless you set a name with `--name`.
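Since the container ID feeds into other commands, it can be handy to extract it programmatically. The sketch below wraps `docker ps` with its `--filter` and `--format` options; the two function names are hypothetical helpers, not part of Docker.

```shell
# ids_for_image IMAGE
# Prints the IDs of all containers (running or stopped) created from IMAGE.
ids_for_image() {
  docker ps -a --filter "ancestor=$1" --format '{{.ID}}'
}

# remove_stopped_for_image IMAGE
# Removes the exited containers that were created from IMAGE.
remove_stopped_for_image() {
  for id in $(docker ps -a --filter "ancestor=$1" --filter "status=exited" --format '{{.ID}}'); do
    docker rm "$id"
  done
}
```

For example, `remove_stopped_for_image hello-world` would clean up the containers left behind by earlier `docker run hello-world` tests.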
Note: We can substitute the full `CONTAINER_ID` or `IMAGE_ID` string with the first 3 or 4 unique characters of the ID.
Docker Start/Stop/Restart/Pause/Unpause CONTAINER_ID:
Command | Description |
---|---|
`docker start CONTAINER_ID` | Starts a stopped container. |
`docker stop CONTAINER_ID` | Stops a running container. |
`docker restart CONTAINER_ID` | Restarts a container (stops it if it is running, then starts it again). |
`docker pause CONTAINER_ID` | Pauses a running container. |
`docker unpause CONTAINER_ID` | Resumes a paused container. |
Docker Volumes:
Command | Description |
---|---|
`docker volume --help` | Lists Docker volume options. |
`docker volume ls` | Lists available volumes. |
`docker volume create _myvol_` | Creates a local volume named _myvol_. |
`docker volume inspect _myvol_` | Returns a general description of the volume. |
`docker volume rm _myvol_` | Removes the specified volume. |
If we start from a base Docker image and would like to customize it by adding packages to fit our needs, then we need to write a `Dockerfile` to build a new Docker image. The `Dockerfile` is a text file containing a sequence of instructions, and it has no file extension.
Instruction | Description |
---|---|
`FROM` | Initializes a new build stage and sets the base image. |
`ARG` | Defines a variable that users can pass at build time to the builder with the `docker build` command. `ARG` can be used before `FROM` to set a default value for the base image. |
`ARG VERSION=latest` | |
`FROM base:${VERSION}` | |
`RUN` | Executes any command in a new layer on top of the current image and commits the results. |
`RUN <command>`. The shell form; the command runs in a shell. | |
`RUN ["executable", "param1", "param2"]`. The exec form. | |
`CMD` | Provides the default command for a running container. Has 3 forms: |
`CMD ["executable", "param1", "param2"]`. The exec form (preferable). | |
`CMD ["param1", "param2"]`. As default parameters to `ENTRYPOINT`. | |
`CMD command param1 param2`. The shell form. | |
`LABEL` | Adds metadata to an image. |
`EXPOSE` | Informs Docker that the container listens on the specified network ports at runtime. `EXPOSE` does not publish the port by itself; to reach it from the host, publish it when running the container, for example `docker run -p 80:80`. |
`ENV` | Sets an environment variable value. |
`ADD` | Copies new files, directories, or remote files and adds them to the filesystem of the image at the given path. |
`COPY` | Copies new files or directories from `<src>` and adds them to the filesystem of the image at the path `<dest>`. |
Has 2 forms: | |
`COPY <src> ... <dest>` | |
`COPY ["<src>",...,"<dest>"]` | |
`ENTRYPOINT` | Configures the command that always runs when the container starts. Has 2 forms: |
`ENTRYPOINT ["executable", "param1", "param2"]`. The exec form (preferable). | |
`ENTRYPOINT command param1 param2`. The shell form. | |
You can override the default value with `--entrypoint` and an executable command. | |
`VOLUME` | Creates a mount point for external mounts. |
Format can be `VOLUME ["/home/user"]` or `VOLUME /home/user`. | |
`USER` | Sets the user name (or UID) to use when running the image and for any `RUN`, `CMD`, and `ENTRYPOINT` instructions that follow in the `Dockerfile`. |
`WORKDIR` | Sets the working directory path for any `RUN`, `CMD`, `ENTRYPOINT`, `COPY`, and `ADD` instructions that follow in the `Dockerfile`. |
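To see several of these instructions working together, the following sketch writes a small illustrative `Dockerfile` from the shell. The directory name `dockerfile-demo`, the script `app.sh`, the port 8080, and the label text are all hypothetical choices made for this example.

```shell
# Create a scratch build context (hypothetical directory name).
mkdir -p dockerfile-demo

# A tiny script for the image to run; it just echoes its arguments.
printf '#!/bin/sh\necho "args: $*"\n' > dockerfile-demo/app.sh

# Write the Dockerfile. The quoted 'EOF' keeps ${...} from being expanded by the shell.
cat > dockerfile-demo/Dockerfile <<'EOF'
# ARG is the only instruction allowed before FROM
ARG VERSION=latest
FROM alpine:${VERSION}
LABEL description="Tour of common instructions"
ENV APP_HOME=/opt/app
WORKDIR ${APP_HOME}
COPY app.sh .
RUN chmod +x app.sh
EXPOSE 8080
ENTRYPOINT ["./app.sh"]
CMD ["--help"]
EOF
```

Building this context with `docker build` and running the resulting image without arguments should make `app.sh` receive `--help`, the default supplied by `CMD`; any arguments given after the image name in `docker run` replace that default.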
Next we will run a minimal Docker image based on Alpine Linux, named `alpine`, which is only about 5 MB in size.
Let's create a local volume called `myvol`.
docker volume create myvol
Then type `docker volume ls` to list the available volumes, and see what `docker volume inspect myvol` reports.
A local volume can be attached to a Docker container using the `-v myvol:/tmp` option, which mounts `myvol` at the container's `/tmp` directory, so that files written there are stored in the volume.
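A quick way to convince yourself that a volume outlives the containers using it is to write a file from one container and read it back from another. This is a minimal sketch; the function name `volume_roundtrip` and the file name `note.txt` are hypothetical.

```shell
# volume_roundtrip VOLUME
# Writes a file into VOLUME from one container, then reads it back from a
# second, brand-new container to show that the data outlived the first one.
volume_roundtrip() {
  vol="$1"
  docker volume create "$vol" >/dev/null
  docker run --rm -v "$vol":/tmp alpine sh -c 'echo "saved in the volume" > /tmp/note.txt'
  docker run --rm -v "$vol":/tmp alpine cat /tmp/note.txt
}
```

Calling `volume_roundtrip myvol` on a machine with Docker running should print the saved line, even though the container that wrote it has already been removed.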
Run
docker run -it --rm -v myvol:/tmp --name AlpineLinux alpine
This command will download the latest Docker image of Alpine Linux (if it is not available locally) and run a Docker container, with the following options:
- `-it` runs the container in interactive mode inside the terminal.
- `--rm` tells Docker to remove the container when it exits.
- `-v myvol:/tmp` mounts the volume `myvol` at the `/tmp` directory in the Docker container.
- `--name` assigns the fixed name `AlpineLinux` to our running Docker container. If we don't specify a name, Docker assigns a random one on every run.
Next, explore the Alpine container by doing the following:
- Use the `apk update` command to update the list of available packages.
- If you want to use the `nano` editor, you will find that it is not installed. Use `apk add nano` to install it.
- Change directory to `/tmp`, then edit a `sample.txt` text file and save it.
Unfortunately, when the Alpine container stops, it is removed (because of `--rm`) and we lose everything that was not stored in a mounted volume, such as the packages we installed. We need a way of importing and saving our work in an external work directory that is available from the Docker container.
To save a copy of the file we created inside the Alpine container, we can use the `docker container cp` command, which copies files and folders between a container and the local filesystem.
docker container cp AlpineLinux:/tmp/sample.txt .
This copies the file `sample.txt` from the `/tmp` directory of the running container into the present working directory (`.`) of the terminal you are working in. The copy command works in both directions, to get files into the Docker container or out of it.
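The two directions differ only in which side carries the `CONTAINER:` prefix. As a sketch, the hypothetical helpers below wrap `docker container cp` to make the direction explicit; the file name `notes.txt` in the usage line is also made up.

```shell
# copy_in CONTAINER LOCAL_PATH CONTAINER_PATH   (host -> container)
copy_in() {
  docker container cp "$2" "$1:$3"
}

# copy_out CONTAINER CONTAINER_PATH LOCAL_PATH  (container -> host)
copy_out() {
  docker container cp "$1:$2" "$3"
}
```

For example, `copy_in AlpineLinux ./notes.txt /tmp/notes.txt` would place a local file in the container, and `copy_out AlpineLinux /tmp/sample.txt .` is equivalent to the command shown above.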
Once you finish using this container, enter `docker ps -a` from a terminal to find the CONTAINER_ID, then enter `docker stop CONTAINER_ID`. In general, a container can be stopped, paused/unpaused, or restarted later; a container started with `--rm`, however, is removed as soon as it stops.
Say we want to enhance our Alpine Linux base image by adding an editor and the ability to compile C code. To do so, we add the `nano` editor and an essential C development kit.
Create or edit a file named `Dockerfile` in one of your directories.
# The base image
FROM alpine:latest
LABEL author="your-name"
LABEL email="your@email-address"
LABEL version="v1.0"
LABEL description="This is your first Dockerfile"
LABEL date_created="2022-05-10"
# Install dev environment (editors & gcc compilers)
RUN apk update && \
apk add nano && \
apk add alpine-sdk
Then we can build a new customized Alpine Linux image for software development, using the following command:
docker build -t linux/alpine-sdk:latest .
The `-t` option sets the name and tag, `linux/alpine-sdk:latest`, of the customized Docker image. The final `.` in the above command is the build context, which is also where Docker looks for the `Dockerfile` by default; in this case it is the present working directory.
After running the above command, we can see that now we have a new docker image in our list.
docker image list
Now we can run the new docker image and test it.
docker run -it --rm \
--name alpine-sdk -v myvol:/home/src \
linux/alpine-sdk:latest
In your Docker container change directory to /home/src
.
Edit/copy the usual Hello World! in C (hello.c
), using the nano
editor:
// Simple C program to display "Hello World"
// Header file for input output functions
#include <stdio.h>

// main function -
// where the execution of program begins
int main(void)
{
    // prints hello world
    printf("Hello World! \n");
    return 0;
}
Then compile and run it.
gcc -o hello hello.c
./hello
and see if your program worked.
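The same compile-and-run step can also be done without opening an interactive session. The sketch below assumes that `hello.c` is already stored in `myvol` and that the `linux/alpine-sdk:latest` image was built as described; the function name `compile_in_container` is hypothetical. The `-w` option sets the working directory inside the container.

```shell
# compile_in_container
# Compiles and runs hello.c inside a throwaway container.
# Assumes hello.c already exists in the volume myvol.
compile_in_container() {
  docker run --rm -v myvol:/home/src -w /home/src linux/alpine-sdk:latest \
    sh -c 'gcc -o hello hello.c && ./hello'
}
```

Because the compiled binary is written to `/home/src`, it stays in `myvol` after the container is removed.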
If we use RStudio for data analysis, then the Docker image rocker/rstudio
can be used.
To run it, from a terminal we enter the following command:
docker run -it --rm \
-v "$(pwd)":/home/rstudio -e PASSWORD=rs_rocks \
-p 8787:8787 rocker/rstudio:latest
Where the options we have used are:
- `-it` keeps the process attached to the terminal, which receives the process log.
- `--rm` deletes the container after it stops.
- `-v "$(pwd)":/home/rstudio` links the present working directory of the terminal running the process with the RStudio container directory `/home/rstudio`.
- `-e PASSWORD=rs_rocks` sets a login password for the default user `rstudio`.
- `-p 8787:8787` maps the external (host) port to the internal (container) port used to connect via the browser.
We then connect via browser to the RStudio Docker container landing page: http://localhost:8787
To log in, use the username `rstudio` and the password we set, `rs_rocks`.
Since we mapped our local present working directory to the /home/rstudio
directory, all our R scripts can be accessed from there. Any saved edit will be saved in our local directory.
We are all set.
To stop the RStudio session, save all your work from inside the RStudio container and then quit the session. Then use the command `docker stop CONTAINER_ID` to stop the container.
See more information and R Docker image options in The Rocker Project.
Next, we show how to start a personal Jupyter Lab Notebook server in a local Docker container running the jupyter/datascience-notebook
Docker image.
docker run -it --rm \
-v "${PWD}":/home/jovyan/work \
-p 8888:8888 jupyter/datascience-notebook
Where the following options are used:
- `-it` keeps the process attached to the terminal, which receives the process log.
- `--rm` deletes the container after it stops.
- `-v "${PWD}":/home/jovyan/work` links the present working directory of the terminal running the process with the container directory `/home/jovyan/work`, which will appear in the Files section of the Jupyter Notebook.
- `-p 8888:8888` is the numeric port mapping, `-p External:Internal`. The external port number is used to connect to the local machine running the container via `http://127.0.0.1:8888/lab?token=TOKEN_ID`.
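To make the `External:Internal` order concrete, here is a small sketch that splits a mapping into its two parts. It only handles the simple `HOST:CONTAINER` form used in this tutorial, and the function name `parse_port_mapping` is hypothetical.

```shell
# parse_port_mapping HOST:CONTAINER
# Splits a simple -p value into the external (host) and internal (container) port.
parse_port_mapping() {
  host_port="${1%%:*}"        # text before the colon
  container_port="${1##*:}"   # text after the colon
  echo "host=$host_port container=$container_port"
}

parse_port_mapping 9999:8888   # prints: host=9999 container=8888
```

With a mapping such as `-p 9999:8888`, the server would still listen on 8888 inside the container, while the browser on the local machine would connect to port 9999.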
When you run the above command, the terminal receives all messages from the running container. Look for lines similar to the following, which give instructions on how to access your container.
To access the server, open this file in a browser:
file:///home/jovyan/.local/share/jupyter/runtime/jpserver-7-open.html
Or copy and paste one of these URLs:
http://43ec85338263:8888/lab?token=e53fd4dfeaca90cc78df58d1bde9bfb941f94041052eb5f6
or http://127.0.0.1:8888/lab?token=e53fd4dfeaca90cc78df58d1bde9bfb941f94041052eb5f6
Copy the last URL, paste it into a web browser tab, and you are ready to start working.
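If the startup messages have scrolled away, the URL can be recovered from the container log. The following sketch filters log text for the `127.0.0.1` address; the function name `extract_lab_url` is hypothetical, and the pattern assumes the log format shown above.

```shell
# extract_lab_url
# Reads container log text on stdin and prints the first
# http://127.0.0.1 URL that carries a token.
extract_lab_url() {
  grep -o 'http://127\.0\.0\.1:[0-9]*/[^ ]*token=[0-9a-f]*' | head -n 1
}
```

A possible use is `docker logs CONTAINER_ID 2>&1 | extract_lab_url`, where `2>&1` is included because the server messages may be written to standard error.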
You can copy a Jupyter Notebook into the working directory where your terminal is running, and it will show inside the Jupyter container, since the local directory is mapped to the working directory in the Jupyter Notebook. All changes made will be saved to the local machine.
To stop the Jupyter Notebook container, you can exit the usual way by pressing `Ctrl-C` twice to shut down the server, or you can use the standard `docker stop CONTAINER_ID`.
You can read more about how to use this Docker image at Jupyter Docker Stacks.
Official Docs
- Docker Desktop
- Docker Hub - Container Images Library
- Docker overview
- Docker Educational Resources
- Docker Documentation
- Docker command line reference
- Docker Community
Supplementary