Linux CGroups Information - pc2ccs/pc2v9 GitHub Wiki

This Wiki page documents some things that the author learned about Linux CGroups while investigating their potential use as a Linux sandboxing mechanism for PC2.

Overview

Control Groups ("CGroups") are used to manage restrictions on resource usage -- CPU, Memory, IO, Network, etc. Processes are organized in hierarchies, and restrictions can be applied (by the kernel) to various points (process trees) in the hierarchy. Restrictions are managed in the kernel by CGroup "Resource Controllers", or simply "controllers". There is one controller for each type of resource (CPU, Memory, etc.). A "cgroup" is essentially a set of processes that are bound to a set of limits (controllers).

History and Status

CGroups Version 1 (V1) was added to the Linux kernel in 2008. After several years, deficiencies in the V1 organization were identified.

CGroups V2 was added to the kernel in 2016. It introduced a simplified process tree hierarchy and a reorganized set of controllers.

Ubuntu has supported both cgroups v1 and v2 since at least 18.04 (possibly earlier). However, prior to 21.10, cgroups v1 was enabled by default; starting with 21.10, v2 is the default. (Note that this means v1 is the default in 20.04, which is what the ICPC images as of 2022 are based on. Note also that 20.04 DOES have v2 installed; it's just not "activated" by default.)

From the command line, CGroups can be managed using a package called cgroup-tools. Ubuntu 21.10 and earlier (including 20.04) contain cgroup-tools v0.41, which works only with CGroups V1 and is no longer supported. Ubuntu 22.04 will have cgroup-tools v2.0, which supports CGroups V2.

According to https://askubuntu.com/questions/1376093/is-cgroup-tools-using-cgroup-v1-or-v2, it should be possible to download/install cgroup-tools V2.0 and use it with 21.04 (and perhaps even with 20.04).

  • cgroup-tools v2.0 can be downloaded from https://packages.ubuntu.com/jammy/cgroup-tools.
  • cgroup-tools v2.0 requires upgrading libc6 to >=2.34 (20.04 has 2.31) and upgrading libcgroup1 to >=2.0 (20.04 has 0.41). (Followup: I have successfully installed cgroup-tools v2.0 from Jammy (22.04) and used it on the ICPC 20.04 image. The detailed steps taken to do this are listed below.)

Current Known Limitations

CGroups (both V1 and V2) were designed to support management and sharing of Linux resource allocation via the kernel. They were NOT designed to enforce specific restrictions such as those enforced by "ulimit" (e.g., a CPU usage limit). Rather, CGroups allow specification of relative resource usage; for example, a given cgroup (set of processes) can be limited to, say, 50% of the available CPU time in a given period. There does not seem to be any way (that I have found) to enforce an absolute limit on processes -- for example, to say that a given process cannot exceed 10 seconds of CPU time under any circumstances and should be automatically killed if it does. There MAY be a way to enforce such limits using CGroups, but I haven't found it yet.
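As an illustration of a relative limit, the v2 cpu controller's cpu.max interface file holds a quota and a period, both in microseconds. The following sketch assumes the pc2 cgroup layout described later on this page:

```shell
# Sketch: limit the pc2 cgroup to 50% of one CPU.
# cpu.max holds two values, "<quota> <period>", both in microseconds:
# processes in the cgroup may together consume at most <quota> usec of
# CPU time in each <period> usec window (they are throttled, not killed).
echo "50000 100000" | sudo tee /sys/fs/cgroup/pc2/cpu.max
```

Note that exceeding the quota causes throttling (reflected in the nr_throttled field of cpu.stat), not termination -- which is exactly the limitation described above.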

There is a way to obtain specific processor-usage information for any given cgroup. CGroups maintain a file called cpu.stat which records CPU statistics for a given cgroup. The contents of this file are similar to the output of the time command (/usr/bin/time, not the shell built-in). For example, the following cpu.stat file might exist in the /sys/fs/cgroup/pc2/pc2sandbox folder (see below for an explanation of this folder):

usage_usec 2504
user_usec 2504
system_usec 10
nr_periods 25
nr_throttled 0
throttled_usec 0

Time durations are in microseconds ("usec"). From this information it is easy to ascertain how much CPU time a sandboxed process took -- once the process terminates. It's just not clear how to force it to terminate after a given length of time. (One possible solution is to set a timer prior to invoking a user submission within a CGroup sandbox, and when the timer goes off, to read the cpu.stat file for the process and use it to determine the actual amount of CPU time taken. However, I have not had time to investigate this.)
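The timer-based idea just mentioned might be sketched roughly as follows. This is a hypothetical outline only; the TIMELIMIT value and the cgroup path are assumptions based on the layout described later on this page:

```shell
#!/bin/bash
# Hypothetical sketch: run a submission under a wall-clock timeout,
# then read cpu.stat to determine how much CPU time it actually used.
CGROUP=/sys/fs/cgroup/pc2/pc2sandbox   # assumed cgroup path (see below)
TIMELIMIT=10                           # assumed wall-clock budget, in seconds

echo $$ > $CGROUP/cgroup.procs   # put this shell (and children) in the cgroup
timeout $TIMELIMIT "$@"          # run the submission; kill it after TIMELIMIT
STATUS=$?                        # 124 means timeout(1) killed the command

# Extract the total CPU usage (in microseconds) from cpu.stat
CPU_USEC=$(awk '/^usage_usec/ {print $2}' $CGROUP/cpu.stat)
echo "CPU time used: ${CPU_USEC} usec (exit status: $STATUS)"
```

A wall-clock timeout is a coarse substitute for a true CPU-time limit, but combined with the cpu.stat reading it at least allows reporting the actual CPU time consumed.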

Note that the above limitation does not apply to the CGroup memory controller (see below); it appears there is a way to force the memory controller to terminate a process when it exceeds its memory limit. However, this alone isn't sufficient for use in a PC2 sandbox; there also needs to be a way to (simultaneously) enforce a CPU time limit.
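The memory-limit mechanism referred to above can be sketched as follows. This sketch uses the v2 memory controller's memory.max and memory.oom.group interface files (neither is described elsewhere on this page, so treat the details as assumptions to verify against the kernel cgroup-v2 documentation); the cgroup path assumes the pc2sandbox layout described later:

```shell
# Sketch: impose a hard 256 MiB memory limit on the pc2sandbox cgroup.
CGROUP=/sys/fs/cgroup/pc2/pc2sandbox   # assumed cgroup path (see below)

# memory.max is a hard limit; allocations beyond it trigger the OOM killer.
echo $((256 * 1024 * 1024)) | sudo tee $CGROUP/memory.max

# Optionally treat the cgroup as a single unit for OOM-killing, so the
# submission's whole process tree dies together rather than one process
# at a time.
echo 1 | sudo tee $CGROUP/memory.oom.group
```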

Technical Details

CGroups Organization

Using CGroups starts by mounting a "cgroup pseudo file system". By default in Ubuntu (and most Linux distributions) this file system is mounted at /sys/fs/cgroup (for v1), usually by systemd(1).

On many modern systems (including Ubuntu 20.04), systemd(1) automatically also mounts the cgroup v2 filesystem, at /sys/fs/cgroup/unified, during the boot process. (Again, however, it's not "activated".)

CGroups v1 and v2 can co-exist in a running system, subject to some constraints, including that a controller cannot be enabled simultaneously in the v1 and v2 file systems. To use a v1-enabled controller under v2 it is necessary to first disable it in v1. (This can be done at boot time by specifying the cgroup_no_v1=list option on the kernel boot command line, where "list" is a comma-separated list of the names of the controllers to disable, or the word "all" to disable all v1 controllers.)

In v1, each user-defined cgroup can be defined at the root of the CGroup filesystem (thus, each cgroup becomes its own "hierarchy"), and controllers can be put into each hierarchy separately. Alternatively, controllers can be put at the root of the filesystem and cgroups can be defined beneath each controller. (In part it is this complexity which led to the creation of v2.)

In v2, there is only a single cgroup hierarchy; each user-defined cgroup exists as a directory beneath the "cgroup" mount point (which for v2 is /sys/fs/cgroup/unified by default in Ubuntu 20.04). cgroup directories may contain sub-directories for child cgroups, but all user-defined cgroups reside under the single "unified" hierarchy (hence the name).

In creating the CGroup V2 installation on ICPC 20.04, I decided to work exclusively with V2. Therefore, I first REMOVED cgV1, then I arranged for cgV2 to be mounted at /sys/fs/cgroup at boot time. This was accomplished by editing /etc/default/grub to contain

GRUB_CMDLINE_LINUX="systemd.unified_cgroup_hierarchy=1"

then running sudo update-grub and then rebooting.
This results in an Ubuntu system with cgV2 mounted (directly at /sys/fs/cgroup) and activated. (Note however that there are no actual CGroups defined at this point; see below.)

Creating CGroups

User-defined v2 cgroups are created by doing a mkdir either in the cgroup root folder or in an already-existing child cgroup directory. In either case the mkdir command causes the filesystem to create, in the folder named in the command, a set of files and folders defining the cgroup characteristics. (Note this unusual characteristic: a mkdir command automatically also creates files under the new folder.)

So for example, mkdir /sys/fs/cgroup/pc2 will create a "pc2 cgroup" containing a set of V2-defined files and folders describing the attributes of the "pc2" cgroup.

Management of CGroups normally requires root. However, CGroup subtree management can be delegated to a non-privileged user by following certain conventions. In cgV2, TWO LEVELS of cgroups are needed to support "delegation". The first is a cgroup created directly beneath the root (e.g. mkdir /sys/fs/cgroup/pc2); certain files in this folder are then chown'd to the user (e.g. pc2).
(Note however that NOT ALL FILES CAN BE CHOWN'd this way; certain files must remain unaltered - see below for further details.)

The second level is then created beneath this level: e.g., mkdir /sys/fs/cgroup/pc2/pc2sandbox, which creates a "child cgroup" named "pc2sandbox" under the "pc2" cgroup, and populates /sys/fs/cgroup/pc2/pc2sandbox with the v2-defined set of files and folders. ALL of these files and folders are then chown'd to the user: chown -R pc2:pc2 /sys/fs/cgroup/pc2/pc2sandbox.

A cgroup which doesn't have any children or live processes can be destroyed by removing its directory with rmdir. (Note that rmdir works here even though the directory still contains interface files; the kernel removes those automatically. If rmdir fails, it usually means the cgroup still has child cgroup directories, which must themselves be removed first, innermost first. Using rm -rf doesn't work, because the interface files in a cgroup directory cannot be unlinked; this is a consequence of CGroups being a "pseudo-filesystem" rather than a real file system.)

CGroup Interface Files

The following "interface files" defining the characteristics of each cgroup are created by mkdir (i.e., exist in each cgroup):

  • cgroup.controllers: defines the list of controllers which are available to that cgroup.

  • cgroup.subtree_control: defines the list of controllers which are active (enabled).

  • cgroup.procs: contains a list of PIDs which belong to the cgroup (and hence are subject to the limits of its active controllers).

(Note: there are numerous other interface files, not listed above; see https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html#interface-files.)

CGroup Controllers

The following controllers are defined by CGroups v2:

  • cpu (a combination of the v1 "cpu" and "cpuacct" controllers) - supports application of both relative and absolute CPU limits to processes.
  • cpuset - supports binding a set of processes to a set of CPUs and NUMA nodes.
  • freezer - allows suspending ("freezing") and resuming a set of processes.
  • io - supports application of constraints on process I/O.
  • memory - allows putting constraints on the memory used by a process.
  • pids - allows putting limits on the number of child processes which a process can create.

(Note: there are numerous other controllers, not listed above; see https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html#controllers.)

By default there are no controllers enabled (activated). Controllers can be enabled and disabled for a given cgroup (or child cgroup) by writing a string to the cgroup.subtree_control file. These strings contain space-delimited controller names, each preceded by + (to enable a controller) or - (to disable a controller), as in the following example:

echo '+pids -memory' > x/y/cgroup.subtree_control

The above example enables the pids controller and disables the memory controller.

Only controllers which are listed in cgroup.controllers can be enabled. Enabling a controller creates the controller’s interface files in the child cgroups.
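Putting the two files together, a defensive way to enable controllers is to check availability first. The following is a sketch; the /sys/fs/cgroup/pc2 path is an assumption based on the layout described below:

```shell
# Sketch: enable the cpu and memory controllers in a cgroup, but only
# after confirming each is listed as available in cgroup.controllers.
CG=/sys/fs/cgroup/pc2   # assumed cgroup path
for ctrl in cpu memory; do
    if grep -qw "$ctrl" "$CG/cgroup.controllers"; then
        echo "+$ctrl" > "$CG/cgroup.subtree_control"
    else
        echo "controller '$ctrl' not available in $CG" >&2
    fi
done
```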

Managing CGroup Processes

Each v2 cgroup has a read-writeable file named cgroup.procs. Processes are placed into a v2 cgroup by writing the process's PID into the cgroup's cgroup.procs file.
For example:

echo $$ > /sys/fs/cgroup/pc2/cgroup.procs

The existence of a PID in a cgroup.procs file means that process is limited by the constraints of the active controllers of that CGroup.

The PIDs listed in cgroup.procs are not guaranteed to be in any particular order and may contain duplicates.

Only one PID may be written per write operation.

Writing the value 0 to a cgroup.procs file causes the writing process itself to be moved to the corresponding cgroup.
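A process other than the current shell can also be moved into a cgroup by writing its PID. For example (a sketch, assuming the pc2 cgroup described below already exists):

```shell
# Sketch: start a command in the background, then move it into the
# pc2 cgroup by PID (remember: only one PID per write).
sleep 60 &
CHILD=$!                     # PID of the background job
echo "$CHILD" | sudo tee /sys/fs/cgroup/pc2/cgroup.procs
```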

CGroup Delegation

A cgroup can be delegated in two ways. First, to a less privileged user, by granting that user write access to the cgroup's directory and its cgroup.procs, cgroup.threads, and cgroup.subtree_control files. Second, if the nsdelegate mount option is set, automatically to a cgroup namespace on namespace creation (the nsdelegate option IS set on cgroup2 in Ubuntu 20.04).

The recommended procedure for using CGroups is to couple it with use of Linux "Namespaces", which allow encapsulating processes within their own "process namespace". I have not investigated doing that.

Additional CGroups Details

CGroups can be managed by systemd(1). See https://www.redhat.com/sysadmin/cgroups-part-four for some details.

Steps taken to Upgrade Ubuntu 20.04 to use CGroups V2

Based in part on info from: https://sleeplessbeastie.eu/2021/09/10/how-to-enable-control-group-v2/ and from: https://rootlesscontaine.rs/getting-started/common/cgroup2/

  • update Grub:
sudo sed -i -e 's/^GRUB_CMDLINE_LINUX=""/GRUB_CMDLINE_LINUX="systemd.unified_cgroup_hierarchy=1"/' /etc/default/grub
sudo update-grub
  • reboot: sudo reboot

  • Inspect cgroup filesystem: stat -c %T -f /sys/fs/cgroup

    • expect: cgroup2fs
  • Inspect cgroup.controllers file: cat /sys/fs/cgroup/cgroup.controllers

    • expect: cpuset cpu io memory hugetlb pids rdma misc
  • Allow delegation of all controllers (by default only "memory" and "pids" are delegated to child subgroups):

    • Verify that all controllers are available in the root cgroup: more /sys/fs/cgroup/cgroup.controllers

      • expect: cpuset cpu io memory hugetlb pids rdma
    • Verify that only memory and pids are enabled in the root cgroup: more /sys/fs/cgroup/cgroup.subtree_control

      • expect: memory pids
    • Update systemd to enable all needed controllers:

      $ sudo mkdir -p /etc/systemd/system/user@.service.d
      $ cat <<EOF | sudo tee /etc/systemd/system/user@.service.d/delegate.conf
      [Service]
      Delegate=cpu cpuset io memory pids
      EOF
      $ sudo systemctl daemon-reload
      $ sudo reboot
      
    • Verify that needed controllers are now enabled in the root cgroup: more /sys/fs/cgroup/cgroup.subtree_control

      • expect: cpuset cpu io memory pids
  • Install cgroup-tools v2.0 (20.04 has v0.41); this also requires installing libc6 >= 2.34 (20.04 has 2.31) and libcgroup1 >= 2.0 (20.04 has 0.41). See https://askubuntu.com/questions/1376093/is-cgroup-tools-using-cgroup-v1-or-v2 and https://packages.ubuntu.com/jammy/cgroup-tools.

    • Verify nothing except cgroup-tools v1 (0.41) depends on libcgroup1:

      • run: apt rdepends --installed libcgroup1
        • expect: nothing listed other than cgroup-tools
    • Update apt to pull the required libraries from Ubuntu 22.04 (Jammy):

      • Add the following line to /etc/apt/sources.list:
        deb http://us.archive.ubuntu.com/ubuntu jammy restricted main multiverse universe
        
      • run: sudo apt update
    • Verify appropriate packages are now upgradable:

      • run: apt list --upgradable | grep cgroup
        • expect:
          cgroup-tools/jammy 2.0-2 amd64 [upgradable from: 0.41-10]
          libcgroup1/jammy 2.0-2 amd64 [upgradable from: 0.41-10]
          
      • run: apt list --upgradable | grep libc6
        • expect:
          libc6-dbg/jammy 2.35-0ubuntu3 amd64 [upgradable from: 2.31-0ubuntu9.2]
          libc6-dev/jammy 2.35-0ubuntu3 amd64 [upgradable from: 2.31-0ubuntu9.2]
          libc6/jammy 2.35-0ubuntu3 amd64 [upgradable from: 2.31-0ubuntu9.2]
          
      • run: apt list | grep cgroup
        • expect:
          cgroup-lite/jammy,focal 1.15 all
          cgroup-tools/jammy 2.0-2 amd64 [upgradable from: 0.41-10]
          cgroupfs-mount/jammy,focal 1.4 all
          golang-github-containerd-cgroups-dev/jammy 1.0.3-1 all
          libcgroup-dev/jammy 2.0-2 amd64
          libcgroup1/jammy 2.0-2 amd64 [upgradable from: 0.41-10]
          libpam-cgroup/jammy 2.0-2 amd64
          
      • run: apt list --installed | grep cgroup
        • expect:
          cgroup-tools/focal,now 0.41-10 amd64 [installed,upgradable to: 2.0-2]
          libcgroup1/focal,now 0.41-10 amd64 [installed,upgradable to: 2.0-2]
          
    • Remove cgroup-tools v0.41 from the system:

      • run: sudo apt purge cgroup-tools
    • Remove libcgroup1 v0.41 from the system (it was a dependency of cgroup-tools v0.41 and is no longer needed):

      • run: sudo apt autoremove
    • Install cgroup-tools v2.0 and its dependencies:

      • run: sudo apt install cgroup-tools
        • Answer "yes" to restarting services.
    • Verify cgroup-tools v2 and its dependencies are installed:

      • run: apt list --installed | grep cgroup
        • expect:
          cgroup-tools/jammy,now 2.0-2 amd64 [installed]
          libcgroup1/jammy,now 2.0-2 amd64 [installed,automatic]
          
      • run: apt list --installed | grep libc6
        • expect:
          libc6-dbg/jammy,now 2.35-0ubuntu3 amd64 [installed,automatic]
          libc6-dev/jammy,now 2.35-0ubuntu3 amd64 [installed,automatic]
          libc6/jammy,now 2.35-0ubuntu3 amd64 [installed,automatic]
          

Steps to create a pc2sandbox CGroup

These steps must be done ONCE, prior to executing any submissions in the sandbox. These steps are encapsulated in a script named pc2installsandbox in my GitHub fork, on branch i_295_develop_work_cgroups_sandbox.

  • Create a pc2 cgroup to be managed by the root cgroup:
sudo mkdir /sys/fs/cgroup/pc2
sudo chown pc2:pc2 /sys/fs/cgroup/pc2
  • Allow the pc2 user to manage the pc2 cgroup files and subgroups. Note that we do NOT change ownership of the "resource control" interface files, such as cpu.max; those must remain under control of the PARENT (root) cgroup:
sudo chown pc2:pc2 /sys/fs/cgroup/pc2/cgroup.procs
sudo chown pc2:pc2 /sys/fs/cgroup/pc2/cgroup.subtree_control
  • Enable the cpu and memory controllers for the pc2 cgroup. Note that this must be done BEFORE creating the pc2sandbox subgroup, so that the subgroup inherits the controllers:
echo "+cpu +memory" > /sys/fs/cgroup/pc2/cgroup.subtree_control
  • Create a "pc2sandbox" sub-cgroup that can be 100% managed by the pc2 cgroup and inherits the pc2 cgroup attributes:
mkdir /sys/fs/cgroup/pc2/pc2sandbox
sudo chown -R pc2:pc2 /sys/fs/cgroup/pc2/pc2sandbox
  • Enable the cpu and memory controllers for the pc2sandbox cgroup:
echo "+cpu +memory" > /sys/fs/cgroup/pc2/pc2sandbox/cgroup.subtree_control

Steps to add a currently-running process to the pc2sandbox cgroup

This step would be done inside a "pc2sandbox" script just prior to invoking the submission command. It is encapsulated inside a script named pc2sandbox.sh in my GitHub fork, on branch i_295_develop_work_cgroups_sandbox.

echo $$ > /sys/fs/cgroup/pc2/pc2sandbox/cgroup.procs
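A minimal pc2sandbox.sh wrapper might therefore look like this (a sketch only; the real script on the branch above may differ):

```shell
#!/bin/bash
# Hypothetical sketch of a sandbox wrapper: join the pc2sandbox cgroup,
# then replace this shell with the submission command so the submission
# runs under the cgroup's active controller limits.
echo $$ > /sys/fs/cgroup/pc2/pc2sandbox/cgroup.procs
exec "$@"   # e.g. pc2sandbox.sh java Solution
```

Because exec replaces the shell rather than forking, the submission inherits the shell's cgroup membership without leaving an extra process behind.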