docker ephemeral commandlines - benclifford/text GitHub Wiki

'c' command

Want to run this on a multi-user system where the users are not particularly trusted to run stuff as root.

NFS Workstation cluster style

Influences: NFS workstation clusters in the late 1990s / early 2000s, but s/workstation/container/. Stack.

Home directory is shared. No other file systems are. (Or in the multi-user case, all home directories are shared.)

User names/user id numbers for "regular users" are common between all containers. Only things writing into that shared space should be using those user ids. Individual containers can have system-local users, but those aren't allowed to write into shared space.
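Roughly, a wrapper in this style might look like the following sketch (the function body, flags and image handling are illustrative assumptions, not the actual 'c' script):

```shell
#!/bin/sh
# Hypothetical 'c'-style wrapper: ephemeral container, only $HOME shared,
# running as the invoking user's uid/gid so anything written into the
# shared home stays owned by that user.
c() {
    image="$1"; shift
    docker run --rm -it \
        -u "$(id -u):$(id -g)" \
        -v "$HOME:$HOME" \
        -w "$HOME" \
        "$image" "$@"
}

# e.g.: c debian ls
```

Note that nothing outside $HOME is mounted, so the container's own filesystem (and any system-local users in it) is throwaway.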

In the case of persistent services, shared space is not just shared with other kinds of container, but also with past/future selves of this particular container, and so shared user ids must be used for all persistent files - eg for a jenkins server, we need a system-wide CI user, rather than the local 'jenkins' user created during apt-get install jenkins.

Like workstations, but with the added property that messing around as root plays out differently - you're impacting less rather than more by running as root, because you can have a new container whenever you want. (c.f. having root on your own workstation that no one else uses but that sits in a workstation cluster)

Startup cost encourages a style of putting lots of commands separated by &&, especially in the degenerate case where my postgres image starts up a DB even if you only want it for client mode. (Maybe that should be separated/parameterised?)
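In that style, a whole build might go through one container invocation with the steps chained by && (this helper is a hypothetical sketch, not part of the real 'c' tooling; the image name is illustrative):

```shell
#!/bin/sh
# Hypothetical: amortise the per-container startup cost over a whole
# build by chaining every step with && inside one ephemeral container.
build_in_container() {
    image="$1"; shift
    docker run --rm -v "$HOME:$HOME" -w "$PWD" "$image" \
        sh -c "$*"
}

# usage: build_in_container buildimage './configure && make && make check'
```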

How to deal with long-running containers / containers that host services that need all users? In one deployment I have users supplied through an LDAP server, although some kind of munging of the host-side /etc/passwd and mounting all of /home would also work without needing the cost of an nscd on each container. LDAP also lets user details change over the lifetime of the container (not so much an issue for an ephemeral single-user container) and works in non-docker environments (such as VMs or multiple physical hosts).
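The /etc/passwd-munging alternative might be wired up something like this (a sketch under assumptions - the simplest case of bind-mounting the host files read-only, without any actual munging, and not the LDAP deployment described above):

```shell
#!/bin/sh
# Hypothetical: share the host user database (read-only) and all home
# directories into a service container, instead of running LDAP + nscd.
# Unlike LDAP, a bind-mounted passwd file reflects host-side user
# changes only as file content, not through any lookup service.
run_with_host_users() {
    image="$1"; shift
    docker run --rm \
        -v /etc/passwd:/etc/passwd:ro \
        -v /etc/group:/etc/group:ro \
        -v /home:/home \
        "$image" "$@"
}
```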

Even though you start as user X, and inside the container end up running as user X (albeit with root access inside your container), you still need root on the host to interact with docker. Perhaps the c command could be setuid/setgid and enforce restrictions? That still allows execution of arbitrary docker images, but as I'm happy for users to run arbitrary commands as root inside any container, I'm not that fussed (?). There's probably a range of attacks there.

As I was originally expecting containers in this style to be truly ephemeral, I scripted things so that "system installation" tasks (eg creating a user, or a database) and "system boot" tasks (eg running a database) are conflated. Running a container set up this way a second time can cause trouble, because it tries to re-install things that are already installed. That can be separated out into two distinct phases.
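One way to separate the two phases is a marker file in the persistent shared space, so install-time work runs only once (the function names and marker path below are assumptions, standing in for e.g. initdb vs actually running the database):

```shell
#!/bin/sh
# Hypothetical entrypoint split into install-time vs boot-time phases.
# The marker lives in the shared (persistent) area, so a re-created
# container skips the one-time installation work.
MARKER="$HOME/.service-installed"

main() {
    if [ ! -e "$MARKER" ]; then
        install_service      # one-time: create users, init database, ...
        touch "$MARKER"
    fi
    start_service            # every boot: actually run the service
}
```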

A different approach was to have a 'userenv' persistent container with ssh running on a different port - so you can run a screen session inside that container. The approach above has more containers (almost one per process) rather than being VM-like.

Managing what gets shared into the container beyond ~ overlaps a bit with using autofs on NFS to provide a common list of mount points that are available everywhere.

Re-entrant in the sense that (if docker is shared appropriately) we can call c inside a c-container and that will still work (although with docker-style semantics for volume mounts, which are resolved against the host rather than the calling container - but mounting paths at the same place inside and outside, and keeping the same user ids, makes this work mostly ok).

What about trying to run this in an environment where you can't trust your users not to abuse root? Make 'c' setuid so it can start containers without the user needing root access. Restrict what options can go to docker: for example, if you're going to let the user have root access within the container, you can't let them mount arbitrary volumes. (Even mounting their home directory might be dodgy in circumstances where there are files in there owned by someone else - which is unusual but legal, and I've definitely done it myself in the past.)

Facilitates a style where "complex stuff" doesn't happen in container command lines: you run one program per container and control it all from the outside, on the assumption that (e.g.) bash is sufficiently consistent across containers for this.

Some tools (eg pip, stack) want to cache stuff in your home directory. So in this setup, the cache crosses between containers. This is a blessing and a curse: you get stuff cached without it needing to be baked into the container, and can benefit from fancy application-aware caching; but you have to trust that the caching can cope with a multi-host environment.

In practice, you need to be careful about different versions of tools putting stuff into the shared ~ - in the same way as if you had a SunOS version of a tool and an AIX version of a tool that wrote incompatible stuff into a shared home directory. (So the downsides are inherited too...) But we do have the ability to put specific overriding mounts into a container if desired (which I've done in practice, because two versions of a compiler couldn't be built in the same home directory even on the same platform (!)).

Startup time in 'cue' as of mid-May 2017 is way slower than I'd like - ok for multi-minute build jobs but not for quick commands. The start-and-end round trip is about 3s. Ugh.