Kubernetes - casangi/RADPS GitHub Wiki
There are lots of resources for learning about Kubernetes, but it's a large and complicated ecosystem. The purpose of this page is to capture some of the essential knowledge required to use and interact with k8s, as well as to be a home for internal documentation about our experiments with on-premises deployment.
Key concepts
This section contains useful high level references to review so that subsequent internal documentation is placed in the appropriate context.
Containers
Kubernetes is a container orchestration tool and therefore it is necessary to understand containerization, first as a concept distinct from virtualization; next by understanding specific terminology used to describe and discuss containers; and then in terms of two prominent container frameworks, Docker and Podman:
- https://www.atlassian.com/microservices/cloud-computing/containers-vs-vms
- https://developers.redhat.com/blog/2018/02/22/container-terminology-practical-introduction#background
- https://docs.docker.com/get-started/docker-concepts/the-basics/what-is-a-container/
- https://docs.podman.io/en/latest/Introduction.html
Kubernetes Architecture
Once the format underlying the software that will go into a deployment is understood, it is possible to proceed to the concepts of Kubernetes itself. A good place to learn that is from the big picture models in the official documentation. First, the specific terminology used to define the components of Kubernetes clusters at the highest level; next stepping one level down to nodes (the component enabling the capability of orchestrating computing resources); next going one abstraction level lower to pods (the units of computing which abstract containers); and finally the concept of services (mapping across these domains to create access to workloads running in the cluster):
- https://kubernetes.io/docs/concepts/architecture/
- https://kubernetes.io/docs/concepts/architecture/nodes/
- https://kubernetes.io/docs/concepts/workloads/pods/
- https://kubernetes.io/docs/concepts/services-networking/service/
Interfacing with Kubernetes clusters
Other sections of the official documentation will eventually be essential to have reviewed once we're using Kubernetes in any serious context. Until then we can skip ahead to the documentation of tools used to make it easier to configure, deploy, and manage clusters themselves. First the cluster management tool kubectl; next the structure of a localized, lightweight deployment based on k3s; then the k3s deployment "wrapper" k3d (which runs a k3s deployment inside a docker container); and finally helm (the "package manager" for Kubernetes applications):
- https://kubernetes.io/docs/reference/kubectl/
- https://docs.k3s.io/architecture
- https://k3d.io/stable/
- https://helm.sh/docs/intro/quickstart/
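After installing the tools above, a quick sanity check confirms each binary is on the PATH and reports its version (this is a sketch; exact output formats vary by release):

```shell
# verify the client-side tools are installed and working
kubectl version --client
helm version --short
k3d version
```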
Outline of a basic deployment
There are multiple abstraction layers, but each of them serves a purpose. Here is how they are arranged in the first test deployment on developer workstations, from RADPS#27:
{NRAO-ER}
└── workstation [eventually spare R620s and then dedicated cluster hardware]
└── k3d
└── docker [or podman]
└── k3s
└── helm
└── k8s cluster namespaces
└── nodes
└── pods
└── containerized applications (Prefect, Airflow, etc.)
Note that it is possible to track every layer of this tree using configuration as code (playbooks, manifests, image definitions, etc.) by keeping the source files in version control (e.g., casangi/RADPS, casangi/cloudviper, gitlab, bitbucket, nexus, dockerhub, artifacthub, ...)
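The workstation layers of this tree can be stood up with a few commands. This is a sketch, assuming Docker is running; the cluster name radps-dev is an arbitrary example, not a project convention:

```shell
# the cluster name is arbitrary; "radps-dev" is just an example
CLUSTER=radps-dev

# create a k3s cluster inside Docker (the k3d layer): one server, one agent
k3d cluster create "$CLUSTER" --servers 1 --agents 1

# k3d merges credentials into the default kubeconfig; confirm the node layer
kubectl get nodes

# deploy a containerized application (the helm layer)
helm repo add prefect https://prefecthq.github.io/prefect-helm
helm install prefect-server prefect/prefect-server
```

Tearing the whole stack down is equally simple (`k3d cluster delete radps-dev`), which is what makes this arrangement convenient for experimentation.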
Diagrams
These diagrams use the C4 model of visualizing system architecture and were generated using draw.io.
System Context
Containers
Workflow manager
Resource manager
Deployment
Related terms and tools
In the context of RADPS experiments with deploying Kubernetes on-premises, there are a number of other concepts, tools, and acronyms which are useful to learn about. Here are a few of them:
- Ansible - configuration management tool from Red Hat that enables automated provisioning of remote computing resources using infrastructure as code ("playbooks")
- ArgoCD - declarative continuous delivery tool
- CNCF - Cloud Native Computing Foundation, from which Kubernetes (and some of the other projects listed below) matured
- CRDs - custom resource definitions
- Grafana - dashboarding tool that provides a front-end UI to monitoring databases; commonly-used component in LMA (logging, monitoring, alerting) stacks
- k9s - terminal UI (TUI) for managing Kubernetes clusters
- Lens - IDE-like tool to communicate with Kubernetes and visualize the configuration and state of running clusters
- Longhorn - distributed block storage for Kubernetes, akin to the EBS service from AWS
- Prometheus - monitoring and alerting for Kubernetes
- Traefik - Ingress controller for Kubernetes, also used as a load balancer or proxy service
Outline of a deployment on shared infrastructure
The development workflow we are working towards for Kubernetes begins on the workstation/laptop and uses some basic developer tools with which to experiment and orient ourselves before deploying onto more substantial hardware for increasingly comprehensive testing.
First we need some basic tools installed: kubectl ("kube control"), the fundamental CLI for interacting with Kubernetes, and helm, the tool used to deploy applications to Kubernetes.
We are also using ArgoCD internally to manage our Helm deployments, but since it is primarily used for infrastructure (e.g., kube-prometheus-stack, prefect-server) it won't be covered in this wiki unless it becomes part of development deployments.
Configuring a deployment on shared infrastructure
tl;dr
mkdir ~/.kube
cp ~/Downloads/k3s.yaml ~/.kube/radps-k3s.yaml
export KUBECONFIG=~/.kube/radps-k3s.yaml
# verify you can connect
kubectl get nodes
# create a dev namespace
kubectl create namespace jdoe-prefect-workflow
# isolate activities to this namespace
kubectl config set-context --current --namespace jdoe-prefect-workflow
# using helm, deploy Prefect
helm repo add prefect https://prefecthq.github.io/prefect-helm
# modify the values found in charts/worker-manifest.yaml as necessary
cp charts/worker-manifest.yaml jdoe-prefect-workflow-values.yaml
vi jdoe-prefect-workflow-values.yaml
helm install prefect-server prefect/prefect-server -f jdoe-prefect-workflow-values.yaml
# configure Traefik to route HTTP traffic to our Prefect server service
# see template below
kubectl apply -f jdoe-prefect-workflow-ingress.yaml
# use the IP from kube config for hosts entry
# e.g. 10.15.233.10 prefect.local
sudo vi /etc/hosts
# modify your Prefect configuration to add a context for k3s
# see example below
vi ~/.prefect/profiles.toml
NB: At no time is it necessary to SSH to any node of the k3s cluster for any reason.
Shared Infrastructure Deployment Details (Prefect specific)
Once we have the Kubernetes configuration file, we can interact with the cluster using kubectl in the same way as the containerized k3d deployment as documented in the RADPS repository readme. All it takes is having the environment variable KUBECONFIG set to point at the location of a valid cluster spec, for example:
# assumptions:
# - kubectl/helm/etc. are installed
# - no other k3s/k8s configuration has been done
mkdir ~/.kube
# retrieve k3s configuration from an administrator, copy to this directory
cp k3s.yaml ~/.kube/radps-k3s.yaml
# set environment for kubectl/helm/k9s/etc.
export KUBECONFIG=~/.kube/radps-k3s.yaml
Now it is possible to use standard commands to interface with the existing cluster, modify configurations, and examine running services and pods:
kubectl get nodes
kubectl get svc
kubectl get pods --all-namespaces
There is a lot going on there (storage controllers, a monitoring stack, network services) and most of it will NOT be relevant to an application developer. Rather, our focus as developers will center on a standard Kubernetes idiom used to group resources and facilitate multi-tenancy: the namespace. For example:
kubectl create namespace jdoe-prefect-workflow
We may now proceed by using the namespace flag --namespace=jdoe-prefect-workflow passed as an argument to subsequent kubectl commands to limit the context of operations, configuration changes, and listed output.
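For example (the namespace name follows the jdoe placeholder used throughout this page):

```shell
# scope queries to the development namespace
NS=jdoe-prefect-workflow

kubectl get pods --namespace="$NS"
kubectl get svc --namespace="$NS"

# the flag works with any kubectl verb and overrides the current context
kubectl get events --namespace="$NS"
```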
Alternatively, it's also possible to use kubectl to set up a context:
# create a context
# the values for cluster and user may need adjustment
kubectl config set-context jdoe-workflow-test \
--cluster=default \
--user=default \
--namespace=jdoe-workflow-test
# and then use the context to isolate activities:
kubectl config use-context jdoe-workflow-test
Now when we run commands to interact with Kubernetes, our view of resources is focused only on those in the namespace we've specified. This is especially handy in a busy cluster which may have many hundreds or thousands of resources. It is still possible to see resources outside the current context, but use of the --namespace flag is necessary to do so.
Assuming we have reached the point in our development where we want to move to an environment with more substantial resources, we're now set up to deploy onto the cluster.
Helm Deploy Prefect Server
Using the Helm chart provided by Prefect, set up some overrides for important values and deploy Prefect server to our namespace. There are many configuration options, but the basics are available in the charts/worker-manifest.yaml file.
helm repo add prefect https://prefecthq.github.io/prefect-helm
# some of the values can be found in charts/worker-manifest.yaml
cp charts/worker-manifest.yaml jdoe-prefect-workflow-values.yaml
vi jdoe-prefect-workflow-values.yaml
# and then install the Prefect server
helm install prefect-server prefect/prefect-server -f jdoe-prefect-workflow-values.yaml
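After the install completes, it's worth confirming that the release deployed and its pods are healthy. A sketch, assuming the release was installed into the jdoe example namespace:

```shell
NS=jdoe-prefect-workflow

# confirm the helm release is deployed in our namespace
helm list --namespace "$NS"

# the server pod should reach status Running
kubectl get pods --namespace "$NS"

# note the service name here: the ingress manifest references it
kubectl get svc --namespace "$NS"
```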
Configure Prefect for Local and Remote Development
For Prefect, it will be necessary to create or modify our profile to support both local development and connecting remotely to k3s. Additionally, we'll need to configure the k3s HTTP proxy (Traefik) to route network traffic to our deployed workflow.
First we configure our Prefect profile to create a context for both local and remote work. An example of a basic Prefect profile would be:
$ cat ~/.prefect/profiles.toml
active = "radps-k3s"
[profiles.local]
PREFECT_SERVER_ALLOW_EPHEMERAL_MODE = "true"
PREFECT_API_URL = "http://127.0.0.1:4200/api"
PREFECT_UI_API_URL = "http://127.0.0.1:4200/api"
[profiles.radps-k3s]
PREFECT_API_URL = "http://prefect.local/api"
PREFECT_UI_API_URL = "http://prefect.local/api"
Note that we have a hostname of prefect.local in the configuration, so the assumption is that we have Traefik configured to route HTTP traffic to our deployed service (NB: we don't need to include the default port number 4200 as we do with local development; that is handled in the ingress manifest).
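With the profile in place, switching between local and cluster work is a single command using the standard Prefect CLI (profile names here match the profiles.toml example above):

```shell
# point the Prefect CLI (and client code) at the k3s deployment
prefect profile use radps-k3s

# confirm the active profile and effective PREFECT_API_URL
prefect config view

# switch back for local development
prefect profile use local
```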
In order to configure Traefik for this scenario we need to create an ingress which will be used to route HTTP traffic to the Prefect service running in our namespace. An example ingress would look as follows:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: prefect-server-ingress
namespace: jdoe-workflow-test
annotations:
traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
ingressClassName: traefik
rules:
- host: prefect.local
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: prefect-server
port:
number: 4200
The important parameters here are namespace, host, and service.name. The namespace needs to be the namespace of the developer's deployment of Prefect. The host is a hostname we will add to our local /etc/hosts file to help Traefik route HTTP traffic to our Prefect service. And service.name is the name of the Prefect server service as it appears in kubectl get svc --namespace jdoe-workflow-test.
Using this file as a template, modify it to suit and then apply it:
kubectl apply -f jdoe-workflow-test-ingress.yaml
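Once applied, we can confirm that Traefik picked up the route; the ingress name below matches the metadata.name in the manifest above:

```shell
NS=jdoe-workflow-test

# confirm the ingress exists and shows the expected host (prefect.local)
kubectl get ingress --namespace "$NS"

# inspect routing details if traffic isn't reaching the service
kubectl describe ingress prefect-server-ingress --namespace "$NS"
```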
The final setup step is to add an entry to our hosts file:
$ sudo vi /etc/hosts
# insert an entry of IP address Hostname
10.4.29.100 prefect.local
In our environment, choose the IP from the k3s configuration file. Note that since this k3s cluster is not considered 'production', IP addresses may change as worker hosts come and go, so you may need to update this entry in the future.
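The address can be read directly out of the kubeconfig, and the full route can then be checked end to end. A sketch; the /api/health endpoint is an assumption about the deployed Prefect server version, so verify it against your deployment:

```shell
# pull the cluster IP out of the kubeconfig server URL
grep 'server:' ~/.kube/radps-k3s.yaml | grep -oE '[0-9]+(\.[0-9]+){3}'

# after adding the /etc/hosts entry, check routing end to end
curl -s http://prefect.local/api/health
```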