Kubernetes Node Execute Access

In a pure cloud environment, Nodes are often designed to be managed by a completely different persona than Kubernetes admins. In a hybrid cloud environment, it's far more common that the same DevOps cluster admin wants access to the Nodes, for example when troubleshooting issues.

Goal: Gate access to a cluster node's operating system based on Kubernetes access. Lots of hybrid cloud upgrade/configuration scripts and tools want access to a node without inventing yet another form of authentication and access control.

Notable failure modes; an ideal solution solves all of these:

  • Cluster related: the kubelet cannot reach the API master, or certificates have expired; commands must be executed on unhealthy members of the cluster.
  • Docker or container related: the Docker daemon fails to start or cannot run a container, disk space or security problems.
  • Upgrade failures: the node used to work, but fails after an OS upgrade and reboot.
  • Misconfiguration failures: the node used to work, but breaks after a configuration change.

Brainstormed ideas follow.

1. SSH public keys stored in Kubernetes Node object

Use SSH keys to continue to access Kubernetes nodes, but use an AuthorizedKeysCommand to look up, in the Kubernetes API Node object, which SSH keys are authorized to access the node (a sketch of such a command follows the list below).

  • Public keys would be stored in annotations in a similar format to authorized_keys contents.
  • Public keys would have an optional expiry date.
  • The AuthorizedKeysCommand option calls an executable that:
    • Uses the kubelet's kubeconfig to talk to the API server and retrieve the authorized keys.
    • Caches the data locally for the case where the API server becomes unavailable or the kubelet's kubeconfig becomes invalid.
    • Removes keys that have expired.
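
A minimal sketch of such an executable, in Go with client-go. The annotation name, the expiry= line prefix, and the kubeconfig and cache paths are all assumptions invented for illustration; sshd would run this via AuthorizedKeysCommand (with a matching AuthorizedKeysCommandUser) and treat its stdout as the authorized_keys contents.

```go
// Hypothetical AuthorizedKeysCommand helper: prints authorized_keys entries
// stored on this node's Node object. Assumes a made-up
// "alpha.cockpit/authorized-keys" annotation holding one authorized_keys
// line per key, optionally prefixed with "expiry=YYYY-MM-DD ".
package main

import (
	"context"
	"fmt"
	"os"
	"strings"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

const cacheFile = "/var/cache/kube-authorized-keys" // fallback when the API server is down

func main() {
	// Sketch only: assumes the hostname matches the Node object's name.
	nodeName, _ := os.Hostname()

	keys, err := fetchKeys(nodeName)
	if err != nil {
		// API server unreachable or kubeconfig invalid: fall back to the cache.
		cached, cerr := os.ReadFile(cacheFile)
		if cerr != nil {
			os.Exit(1)
		}
		keys = string(cached)
	} else {
		_ = os.WriteFile(cacheFile, []byte(keys), 0600)
	}

	now := time.Now()
	for _, line := range strings.Split(keys, "\n") {
		// Drop keys whose optional expiry date has passed.
		if rest, ok := strings.CutPrefix(line, "expiry="); ok {
			fields := strings.SplitN(rest, " ", 2)
			if len(fields) != 2 {
				continue
			}
			t, err := time.Parse("2006-01-02", fields[0])
			if err != nil || now.After(t) {
				continue
			}
			line = fields[1]
		}
		fmt.Println(line)
	}
}

func fetchKeys(nodeName string) (string, error) {
	// Reuse the kubelet's kubeconfig (kubeadm path assumed here).
	config, err := clientcmd.BuildConfigFromFlags("", "/etc/kubernetes/kubelet.conf")
	if err != nil {
		return "", err
	}
	client, err := kubernetes.NewForConfig(config)
	if err != nil {
		return "", err
	}
	node, err := client.CoreV1().Nodes().Get(context.TODO(), nodeName, metav1.GetOptions{})
	if err != nil {
		return "", err
	}
	return node.Annotations["alpha.cockpit/authorized-keys"], nil
}
```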

Characteristics:

  • Covers all of the above failure modes.
  • Whoever has write access to a Kubernetes Node object has access to the nodes that have opted in.
  • Opt-in happens at the operating system or deployment level, by setting or clearing AuthorizedKeysCommand in the sshd configuration.
  • Access does not disappear if the kubelet dies.
  • Compatible with Ansible, Commissaire, Cockpit, systemd tools, and lots of other tools ... anything that uses SSH for remote access.

Cons:

  • Requires that callers use SSH, which some may regard as extra complexity.

2. Extend /api/v1/proxy/node to execute commands

The Kubernetes API has a facility for connecting to TCP sockets on a node via /api/v1/proxy/nodes/nodename:port/.... Using local TCP sockets doesn't solve the authentication side of the above goal (any local user can connect to those ports, so performing privileged tasks would require additional authentication headers). In addition, the proxy supports neither CONNECT nor WebSockets.

  • Extend the /api/v1/proxy/nodes API to be able to execute commands, similar to how commands are executed in a container or pod (a hypothetical request is sketched below).
  • Add WebSocket and HTTP/2 stream support to /api/v1/proxy/nodes.
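
Purely illustrative: what a caller might send if such an endpoint existed. The /exec suffix and repeated command parameters are modeled on the existing pod exec API but are hypothetical for nodes, as are the host name and token handling (a real caller would also need the cluster CA configured).

```go
// Hypothetical request against a proposed /api/v1/proxy/nodes/.../exec
// endpoint, shaped after the existing pod exec API. None of this exists
// today; it only shows what the proposal could look like on the wire.
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
)

func main() {
	// Repeated command= parameters, as in the pod exec API.
	url := "https://master.example.com:6443/api/v1/proxy/nodes/node-1/exec" +
		"?command=journalctl&command=-u&command=kubelet"

	req, err := http.NewRequest("POST", url, nil)
	if err != nil {
		panic(err)
	}
	// A bearer token, checked against a new verb/resource pair, gates access.
	req.Header.Set("Authorization", "Bearer "+os.Getenv("KUBE_TOKEN"))

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	fmt.Println(resp.Status, string(body))
}
```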

Characteristics:

  • Solves local privilege escalation (compared to a local TCP port). The process is spawned as the user running the kubelet.
  • Access control is via a new verb / resource.

Cons:

  • Not resilient to cluster failure. If the kubelet dies or doesn't come up after a reboot, access to the node is gone.
  • Opt-in needs to be at the kubelet argument level.
  • New API needs retrofitting into all the callers, or wrappers built.
    • For example, Ansible doesn't work out of the box.

3. Extend /api/v1/proxy/node to connect to unix socket

Very similar to the above, but instead of executing a command, connect to a unix socket.

Characteristics:

  • Same as above.
  • Solves local privilege escalation over unix sockets, as the unix socket listener can verify that the connection was established by the kubelet running as root (a sketch of such a check follows this list).
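
A minimal sketch of that check, assuming a Linux host and an invented socket path: the node-local listener reads SO_PEERCRED on each accepted connection and rejects peers that are not running as root.

```go
// Sketch: a node-local Unix socket service that verifies, via SO_PEERCRED,
// that the connecting peer (presumably the kubelet) runs as root (UID 0).
// Linux-only; the socket path is hypothetical.
package main

import (
	"log"
	"net"

	"golang.org/x/sys/unix"
)

func main() {
	l, err := net.Listen("unix", "/run/node-helper.sock")
	if err != nil {
		log.Fatal(err)
	}
	for {
		conn, err := l.Accept()
		if err != nil {
			log.Fatal(err)
		}
		raw, _ := conn.(*net.UnixConn).SyscallConn()
		var cred *unix.Ucred
		raw.Control(func(fd uintptr) {
			// Ask the kernel who is on the other end of the socket.
			cred, _ = unix.GetsockoptUcred(int(fd), unix.SOL_SOCKET, unix.SO_PEERCRED)
		})
		if cred == nil || cred.Uid != 0 {
			conn.Close() // reject non-root peers
			continue
		}
		// ... perform the privileged work on behalf of the proxied caller ...
		conn.Close()
	}
}
```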

Cons:

  • Same as above.

4. PAM module to validate credentials against API server

Install a PAM module on the nodes that validates credentials against the API server. Either token or user/password authentication should be supported.

Characteristics:

  • A new PAM module is installed in the sshd PAM config file.
  • The PAM module gets the API server address from the kubelet's kubeconfig.
  • The PAM module authenticates and connects to the API server using the (SSH) login credentials.
  • The PAM module checks access against the API server (the round trip is sketched after this list).
  • Compatible with tools that use SSH access, but not compatible with SSH keys.
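
The API round trip such a module would perform, sketched as a standalone Go helper for clarity; a real PAM module would be written in C against libpam. The TokenReview API is real, but reading the token from an environment variable and the kubeconfig path are simplifications invented here.

```go
// Sketch of the credential check a PAM module would perform: ask the API
// server to validate a bearer token via the TokenReview API.
package main

import (
	"context"
	"fmt"
	"os"

	authv1 "k8s.io/api/authentication/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// In a real module the token would come from the PAM conversation,
	// not an environment variable.
	token := os.Getenv("PAM_AUTHTOK")

	// Reuse the kubelet's kubeconfig to locate the API server (path assumed).
	config, err := clientcmd.BuildConfigFromFlags("", "/etc/kubernetes/kubelet.conf")
	if err != nil {
		os.Exit(1)
	}
	client, err := kubernetes.NewForConfig(config)
	if err != nil {
		os.Exit(1)
	}

	review, err := client.AuthenticationV1().TokenReviews().Create(context.TODO(),
		&authv1.TokenReview{Spec: authv1.TokenReviewSpec{Token: token}},
		metav1.CreateOptions{})
	if err != nil || !review.Status.Authenticated {
		os.Exit(1) // would map to PAM_AUTH_ERR
	}
	fmt.Println("authenticated as", review.Status.User.Username)
}
```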

Cons:

  • Not resilient to kube master failure. Without access to the master, the login credentials cannot be validated.
  • Access control is challenging here, since it is performed during login by the PAM module against the API server.
  • Due to the above, probably not compatible with stock Kubernetes, just Origin/OpenShift/Atomic.
  • Requires adaptation of calling tools to be able to use credentials or tokens ... less often implemented in SSH-based tools.

5. Execute in a privileged container

Run a command as a privileged pod on the given node.

Characteristics:

  • Run a privileged pod on the given node (a client-go sketch follows this list).
  • Use either a nodeSelector or spec.host (now spec.nodeName) to run the pod on a specific node.
  • Use the /exec API to run a specific command inside the pod.
  • The --allow-privileged kubelet option needs to be driven by access control (certain users can start privileged containers) rather than a kubelet flag.
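
A sketch of the pod creation step using client-go and modern field names (spec.nodeName); the node name, namespace, and tools image are invented. The command itself would then run through the pod exec subresource, e.g. kubectl exec.

```go
// Sketch: create a privileged pod pinned to one node, then exec into it.
package main

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	privileged := true
	pod := &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "node-debug", Namespace: "default"},
		Spec: corev1.PodSpec{
			NodeName:      "node-1", // pin to the node being debugged
			HostPID:       true,     // see the host's processes
			RestartPolicy: corev1.RestartPolicyNever,
			Containers: []corev1.Container{{
				Name:    "shell",
				Image:   "registry.example.com/admin-tools", // hypothetical image
				Command: []string{"sleep", "infinity"},
				SecurityContext: &corev1.SecurityContext{Privileged: &privileged},
			}},
		},
	}

	// Commands are then run inside it via the pod exec API, e.g.
	// "kubectl exec node-debug -- nsenter -t 1 -m -- journalctl -u kubelet".
	if _, err := client.CoreV1().Pods("default").Create(
		context.TODO(), pod, metav1.CreateOptions{}); err != nil {
		panic(err)
	}
}
```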

Cons:

  • Not resilient to Kubernetes or container failures.
  • Opt-in needs to be at the kubelet argument level.
  • New API needs retrofitting into all the callers, or wrappers built.
    • Requires adaptation of calling tools to be able to use the /exec API.
    • For example, Ansible doesn't work out of the box.