oneke_ops - OpenNebula/one-apps GitHub Wiki
The leader VNF node runs an HAProxy instance that by default exposes Kubernetes API port 6443
on the public VIP address over the HTTPS protocol, secured with two-way SSL/TLS certificates.
This HAProxy instance can be used in two ways:
- As a stable Control Plane endpoint for the whole Kubernetes cluster.
- As an external Kubernetes API endpoint that can be reached from outside the internal VNET.
graph LR;
internet --- vnf;
vnf --- master & worker & storage;
internet((Internet));
style vnf text-align:left
style master text-align:left
style worker text-align:left
style storage text-align:left
vnf[["vnf (NAT 🔀)"<br>haproxy - *:6443<br><hr>eth0:10.2.11.86<br><hr>eth1:172.20.0.86]];
master[master<br>kube-apiserver - *:6443<br><hr>eth0:172.20.0.101<br><hr>GW:172.20.0.86<br>DNS:1.1.1.1];
worker[worker<br><hr>eth0:172.20.0.102<br><hr>GW:172.20.0.86<br>DNS:1.1.1.1];
storage[storage<br><hr>eth0:172.20.0.103<br><hr>GW:172.20.0.86<br>DNS:1.1.1.1];
To access the Kubernetes API you'll need a kubeconfig file. In the case of RKE2, you can copy the /etc/rancher/rke2/rke2.yaml file, which is present on every master node. For example:
$ install -d ~/.kube/
$ scp -J root@10.2.11.86 root@172.20.0.101:/etc/rancher/rke2/rke2.yaml ~/.kube/config
Warning: Permanently added '10.2.11.86' (ED25519) to the list of known hosts.
Warning: Permanently added '172.20.0.101' (ED25519) to the list of known hosts.
rke2.yaml
Additionally you must adjust the Control Plane endpoint in the file to point to the public VIP:
$ gawk -i inplace -f- ~/.kube/config <<'EOF'
/^    server: / { $0 = "    server: https://10.2.11.86:6443" }
{ print }
EOF
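If gawk is not available, the same rewrite can be done with any awk. The following is a minimal sketch, not part of OneKE: the helper name set_kubeconfig_server and the sample file are hypothetical, and it assumes that every server: entry in the kubeconfig should point to the same endpoint. It keeps whatever indentation the line already had.

```shell
#!/usr/bin/env bash
set -o errexit -o nounset

# Hypothetical helper: point every "server:" entry of a kubeconfig at a new URL,
# preserving the indentation of the original line.
set_kubeconfig_server() {
    local file=$1 url=$2 tmp
    tmp=$(mktemp)
    awk -v url="$url" '
        /^[ \t]*server:[ \t]/ {
            match($0, /^[ \t]*/)                  # RLENGTH = width of the indentation
            $0 = substr($0, 1, RLENGTH) "server: " url
        }
        { print }
    ' "$file" > "$tmp"
    # Write back through redirection so the file keeps its permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Demonstration on a throwaway file shaped like rke2.yaml (contents are illustrative).
demo=$(mktemp)
cat > "$demo" <<'EOF'
apiVersion: v1
clusters:
- cluster:
    server: https://127.0.0.1:6443
  name: default
EOF

set_kubeconfig_server "$demo" "https://10.2.11.86:6443"
grep 'server:' "$demo"
```

In real use you would pass ~/.kube/config instead of the demo file.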
Since OneKE 1.29, it's also possible to extract the kubeconfig file from the user template of any VM in the master role. For example:
onevm show 'master_0_(service_1)' --json | jq -r '.VM.USER_TEMPLATE.ONEKE_KUBECONFIG|@base64d' | install -m u=rw,go= -D /dev/fd/0 ~/.kube/config
After that, your local kubectl command should work just fine:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
oneke-ip-172-20-0-101 Ready control-plane,etcd,master 33m v1.29.4+rke2r1
oneke-ip-172-20-0-102 Ready <none> 28m v1.29.4+rke2r1
oneke-ip-172-20-0-103 Ready <none> 28m v1.29.4+rke2r1
oneke-ip-172-20-0-104 Ready control-plane,etcd,master 12m v1.29.4+rke2r1
oneke-ip-172-20-0-105 Ready control-plane,etcd,master 10m v1.29.4+rke2r1
Important
If you'd like to use a custom domain name for the Control Plane endpoint instead of the direct public VIP address, you need to add the domain to the ONEAPP_K8S_EXTRA_SANS context parameter (for example localhost,127.0.0.1,k8s.yourdomain.it) and set the same domain inside the ~/.kube/config file as well. You can set up your domain in a public or private DNS server, or in your local /etc/hosts file.
By default, the Kubernetes API server's extra SANs are set to localhost,127.0.0.1, which allows you to access the Kubernetes API via SSH tunnels.
Note
We recommend using the ProxyCommand SSH feature.
Download the /etc/rancher/rke2/rke2.yaml kubeconfig file:
$ install -d ~/.kube/
$ scp -o ProxyCommand='ssh -A root@10.2.11.86 -W %h:%p' root@172.20.0.101:/etc/rancher/rke2/rke2.yaml ~/.kube/config
Note
Here 10.2.11.86 is the public VIP address and 172.20.0.101 is the private address of a master node inside the private VNET.
Create an SSH tunnel that forwards TCP port 6443:
$ ssh -o ProxyCommand='ssh -A root@10.2.11.86 -W %h:%p' -L 6443:localhost:6443 root@172.20.0.101
Then run kubectl in another terminal:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
oneke-ip-172-20-0-101 Ready control-plane,etcd,master 58m v1.29.4+rke2r1
oneke-ip-172-20-0-102 Ready <none> 52m v1.29.4+rke2r1
oneke-ip-172-20-0-103 Ready <none> 52m v1.29.4+rke2r1
oneke-ip-172-20-0-104 Ready control-plane,etcd,master 31m v1.29.4+rke2r1
oneke-ip-172-20-0-105 Ready control-plane,etcd,master 29m v1.29.4+rke2r1
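If you open the tunnel often, the jump host and the port forward can be kept in an SSH client config file. This is only a sketch: the oneke-master alias and the oneke-ssh-config file name are made up for the example, and the root user is an assumption that you should adjust to your setup. It uses ProxyJump, which plays the same role as the ProxyCommand shown above.

```shell
#!/usr/bin/env bash
set -o errexit -o nounset

# Hypothetical file name; a separate file avoids touching ~/.ssh/config.
config=${SSH_CONFIG_FILE:-./oneke-ssh-config}

cat > "$config" <<'EOF'
Host oneke-master
    HostName 172.20.0.101
    User root
    # Hop through the public VIP on the VNF node.
    ProxyJump root@10.2.11.86
    # Forward local port 6443 to kube-apiserver on the master node.
    LocalForward 6443 localhost:6443
EOF

# Print the command that would open the tunnel; nothing connects here.
echo "ssh -F $config -N oneke-master"
```

The -N flag keeps the session open for forwarding only, without starting a remote shell.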
To create a 4 GiB persistent volume, apply the following manifest using kubectl:
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nginx
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 4Gi
  storageClassName: longhorn-retain
$ kubectl apply -f nginx-pvc.yaml
persistentvolumeclaim/nginx created
$ kubectl get pvc,pv
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/nginx Bound pvc-5b0f9618-b840-4544-bccc-6479c83b49d3 4Gi RWO longhorn-retain 78s
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
persistentvolume/pvc-5b0f9618-b840-4544-bccc-6479c83b49d3 4Gi RWO Retain Bound default/nginx longhorn-retain 76s
Important
The Retain reclaim policy may protect your persistent data from accidental removal. Always back up your data!
To deploy an NGINX instance using the PVC created previously, apply the following manifest using kubectl:
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: http
        image: nginx:alpine
        imagePullPolicy: IfNotPresent
        ports:
        - name: http
          containerPort: 80
        volumeMounts:
        - mountPath: "/persistent/"
          name: nginx
      volumes:
      - name: nginx
        persistentVolumeClaim:
          claimName: nginx
$ kubectl apply -f nginx-deployment.yaml
deployment.apps/nginx created
$ kubectl get deployments,pods
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/nginx 1/1 1 1 32s
NAME READY STATUS RESTARTS AGE
pod/nginx-6b5d47679b-sjd9p 1/1 Running 0 32s
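When these steps are scripted, it helps to wait for the PVC and the Deployment instead of polling by hand. The sketch below is a suggestion rather than part of OneKE: the helper name wait_for_nginx and the 120s timeout are assumptions, and the jsonpath form of kubectl wait needs kubectl 1.23 or newer. The call is skipped when kubectl or the cluster is not reachable.

```shell
#!/usr/bin/env bash
set -o errexit -o nounset

# Hypothetical helper: block until the "nginx" PVC is bound and the "nginx"
# Deployment reports the Available condition.
wait_for_nginx() {
    local timeout=${1:-120s}
    kubectl wait --for=jsonpath='{.status.phase}'=Bound pvc/nginx --timeout="$timeout" &&
        kubectl wait --for=condition=Available deployment/nginx --timeout="$timeout"
}

# Only run against a reachable cluster; otherwise report and move on.
if command -v kubectl >/dev/null 2>&1 &&
   kubectl --request-timeout=5s version >/dev/null 2>&1; then
    wait_for_nginx 120s || echo "nginx is not ready yet"
else
    echo "kubectl or the cluster is not reachable, skipping the wait"
fi
```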
To expose the running NGINX instance over HTTP on port 80 of the public VNF VIP address, apply the following manifest using kubectl:
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  selector:
    app: nginx
  type: ClusterIP
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 80
---
# In Traefik < 3.0.0 it used to be "apiVersion: traefik.containo.us/v1alpha1".
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: nginx
spec:
  entryPoints: [web]
  routes:
    - kind: Rule
      match: Path(`/`)
      services:
        - kind: Service
          name: nginx
          port: 80
          scheme: http
$ kubectl apply -f nginx-svc-ingressroute.yaml
service/nginx created
ingressroute.traefik.containo.us/nginx created
$ kubectl get svc,ingressroute
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kubernetes ClusterIP 10.43.0.1 <none> 443/TCP 3h18m
service/nginx ClusterIP 10.43.99.36 <none> 80/TCP 63s
NAME AGE
ingressroute.traefik.containo.us/nginx 63s
Verify that the new IngressRoute CRD (Custom Resource Definition) object is operational:
$ curl -fsSL http://10.2.11.86/ | grep title
<title>Welcome to nginx!</title>
To expose the running NGINX instance over HTTP on port 80 using a private LoadBalancer service provided by MetalLB, apply the following manifest using kubectl:
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-lb
spec:
  selector:
    app: nginx
  type: LoadBalancer
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 80
$ kubectl apply -f nginx-loadbalancer.yaml
service/nginx-lb created
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.43.0.1 <none> 443/TCP 3h25m
nginx ClusterIP 10.43.99.36 <none> 80/TCP 8m50s
nginx-lb LoadBalancer 10.43.222.235 172.20.0.87 80:30050/TCP 73s
Verify that the new LoadBalancer service is operational:
$ curl -fsSL http://172.20.0.87/ | grep title
<title>Welcome to nginx!</title>
Important
When upgrading, enabling access to the public internet is recommended, since RKE2 will need to download various Docker images to complete the upgrade.
K8s clusters can be upgraded with the System Upgrade Controller provided by RKE2. Here's a handy bash snippet to illustrate the procedure:
#!/usr/bin/env bash
: "${SUC_VERSION:=0.13.4}"
: "${RKE2_VERSION:=v1.29.4+rke2r1}"
set -o errexit -o nounset
# Deploy CRDs.
kubectl apply -f "https://github.com/rancher/system-upgrade-controller/releases/download/v${SUC_VERSION}/crd.yaml"
# Deploy the System Upgrade Controller.
kubectl apply -f "https://github.com/rancher/system-upgrade-controller/releases/download/v${SUC_VERSION}/system-upgrade-controller.yaml"
# Wait for required Custom Resource Definitions to appear.
for RETRY in 9 8 7 6 5 4 3 2 1 0; do
    if kubectl get crd/plans.upgrade.cattle.io --no-headers; then break; fi
    sleep 5
done && [[ "$RETRY" -gt 0 ]]
# Plan the upgrade.
kubectl apply -f- <<EOF
---
# Server plan
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: server-plan
  namespace: system-upgrade
  labels:
    rke2-upgrade: server
spec:
  concurrency: 1
  nodeSelector:
    matchExpressions:
      - {key: rke2-upgrade, operator: Exists}
      - {key: rke2-upgrade, operator: NotIn, values: ["disabled", "false"]}
      # When using k8s version 1.19 or older, swap control-plane with master
      - {key: node-role.kubernetes.io/control-plane, operator: In, values: ["true"]}
  serviceAccountName: system-upgrade
  tolerations:
    - key: CriticalAddonsOnly
      operator: Exists
  cordon: true
#  drain:
#    force: true
  upgrade:
    image: rancher/rke2-upgrade
  version: "$RKE2_VERSION"
---
# Agent plan
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: agent-plan
  namespace: system-upgrade
  labels:
    rke2-upgrade: agent
spec:
  concurrency: 1
  nodeSelector:
    matchExpressions:
      - {key: rke2-upgrade, operator: Exists}
      - {key: rke2-upgrade, operator: NotIn, values: ["disabled", "false"]}
      # When using k8s version 1.19 or older, swap control-plane with master
      - {key: node-role.kubernetes.io/control-plane, operator: NotIn, values: ["true"]}
  prepare:
    args:
      - prepare
      - server-plan
    image: rancher/rke2-upgrade
  serviceAccountName: system-upgrade
  tolerations:
    - key: node.longhorn.io/create-default-disk
      value: "true"
      operator: Equal
      effect: NoSchedule
  cordon: true
  drain:
    force: true
    ignoreDaemonSets: true
    timeout: 0
  upgrade:
    image: rancher/rke2-upgrade
  version: "$RKE2_VERSION"
EOF
# Enable/Start the upgrade process on all cluster nodes.
kubectl label nodes --all rke2-upgrade=true
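The countdown loop that waits for the CRD treats a success on its very last iteration (RETRY equal to 0) as a failure. If that matters to you, the wait can be factored into a small helper. This is a minimal sketch under our own naming: retry is a hypothetical function, not something shipped with RKE2 or OneKE.

```shell
#!/usr/bin/env bash
set -o errexit -o nounset

# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between attempts; returns 0 as soon as the command succeeds.
retry() {
    local attempts=$1 delay=$2 i
    shift 2
    for (( i = 1; i <= attempts; i++ )); do
        if "$@"; then return 0; fi
        if (( i < attempts )); then sleep "$delay"; fi
    done
    return 1
}

# Intended use in the upgrade script (not executed here, it needs a cluster):
#   retry 10 5 kubectl get crd/plans.upgrade.cattle.io --no-headers

# Self-contained demonstration with a command that succeeds on its third call.
calls=0
flaky() { calls=$(( calls + 1 )); (( calls >= 3 )); }
retry 5 0 flaky
echo "succeeded after $calls calls"
```

Because the helper returns non-zero once all attempts are used up, it still stops a script that runs with errexit.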
By default OneKE deploys Longhorn, Traefik, and MetalLB during cluster bootstrap. All these apps are deployed as Addons using RKE2's Helm Integration and official Helm charts. To illustrate the process, let's upgrade the Traefik Helm chart from version 23.0.0 to 28.0.0, following four basic steps.
Important
When upgrading, enabling access to the public internet is recommended, since RKE2 will need to download various Docker images to complete the upgrade.
- To avoid downtime, ensure that the number of worker nodes is at least 2, so 2 (anti-affined) Traefik replicas are running.
$ oneflow scale 'Service OneKE 1.29' worker 2
$ oneflow show 'Service OneKE 1.29'
...
LOG MESSAGES
05/13/24 13:32 [I] New state: DEPLOYING_NETS
05/13/24 13:32 [I] New state: DEPLOYING
05/13/24 13:39 [I] New state: RUNNING
05/13/24 13:54 [I] Role worker scaling up from 1 to 2 nodes
05/13/24 13:54 [I] New state: SCALING
05/13/24 13:56 [I] New state: COOLDOWN
05/13/24 13:01 [I] New state: RUNNING
$ kubectl -n traefik-system get pods
NAME READY STATUS RESTARTS AGE
one-traefik-6768f7bdf4-cvqn2 1/1 Running 0 23m
one-traefik-6768f7bdf4-qqfcl 1/1 Running 0 23m
$ kubectl -n traefik-system get pods -o jsonpath='{range .items[*]}{.spec.containers[0].image}{"\n"}{end}'
traefik:2.7.1
traefik:2.7.1
- To enable downloading Traefik Helm charts, update the Helm repositories.
$ helm repo add traefik https://helm.traefik.io/traefik
"traefik" has been added to your repositories
$ helm repo update
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "traefik" chart repository
Update Complete. ⎈Happy Helming!⎈
- Patch the HelmChart/one-traefik CRD object.
#!/usr/bin/env bash
set -eo pipefail
# Download the new chart archive (traefik-28.0.0.tgz) into the current directory.
helm pull traefik/traefik --version '28.0.0'
# Keep a one-time backup of the original addon manifest; it is the patch base.
if ! test -f /opt/one-appliance/addons/one-traefik-backup.yaml; then
    cat /opt/one-appliance/addons/one-traefik.yaml | tee /opt/one-appliance/addons/one-traefik-backup.yaml
fi
# Generate a kustomization that replaces the embedded chart content.
install -m u=rw,go= -D /dev/fd/0 /opt/one-appliance/addons/kustomization.yaml <<EOF
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- one-traefik-backup.yaml
patches:
- target:
    kind: HelmChart
    group: helm.cattle.io
    version: v1
    name: one-traefik
  patch: |
    - op: replace
      path: /spec/chartContent
      value: >-
        $(base64 -w0 < ./traefik-28.0.0.tgz)
EOF
# Render the patched manifest over the live addon file.
kubectl kustomize /opt/one-appliance/addons/ | tee /opt/one-appliance/addons/one-traefik.yaml
- Verify that Traefik's pods have been recreated.
$ kubectl -n traefik-system get pods
NAME READY STATUS RESTARTS AGE
one-traefik-7c5875d657-9v5h2 1/1 Running 0 88s
one-traefik-7c5875d657-bsp4v 1/1 Running 0 88s
$ kubectl -n traefik-system get pods -o jsonpath='{range .items[*]}{.spec.containers[0].image}{"\n"}{end}'
docker.io/traefik:v3.0.0
docker.io/traefik:v3.0.0
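If the pods are not recreated, one thing worth checking is that the base64 string placed in spec.chartContent really decodes back to the chart archive you pulled. The sketch below shows the round trip on a stand-in file. The helper name chart_content_matches is hypothetical, and it assumes GNU coreutils (base64 -w0 and base64 -d), like the patch script itself.

```shell
#!/usr/bin/env bash
set -o errexit -o nounset

# Hypothetical helper: succeed if the base64 string $1 decodes to file $2.
chart_content_matches() {
    local encoded=$1 archive=$2
    printf '%s' "$encoded" | base64 -d | cmp -s - "$archive"
}

# Stand-in for the chart archive; in practice this would be traefik-28.0.0.tgz.
archive=$(mktemp)
printf 'not a real chart, just sample bytes\n' > "$archive"

# Same encoding call as in the patch script.
encoded=$(base64 -w0 < "$archive")

if chart_content_matches "$encoded" "$archive"; then
    echo "chartContent matches the archive"
else
    echo "chartContent does not match the archive"
fi
```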
Warning
Since Traefik 3.0.0 the apiVersion: traefik.containo.us/v1alpha1 field in CRD objects must be replaced with apiVersion: traefik.io/v1alpha1. Please update/patch all your Traefik-specific CRD objects.
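For manifests that you keep as files, the apiVersion change can be applied with sed. This is a rough sketch with a hypothetical helper name, migrate_traefik_api_version, and it assumes GNU sed (for the -i flag). Objects that already exist in the cluster are not touched by it; re-apply the updated files afterwards.

```shell
#!/usr/bin/env bash
set -o errexit -o nounset

# Hypothetical helper: rewrite the pre-3.0 Traefik API group in manifest files,
# leaving the indentation of each apiVersion line untouched.
migrate_traefik_api_version() {
    local file
    for file in "$@"; do
        sed -i 's|^\([[:space:]]*apiVersion:[[:space:]]*\)traefik\.containo\.us/v1alpha1|\1traefik.io/v1alpha1|' "$file"
    done
}

# Demonstration on a throwaway manifest.
demo=$(mktemp)
cat > "$demo" <<'EOF'
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: nginx
EOF

migrate_traefik_api_version "$demo"
grep apiVersion "$demo"
```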
Important
This example was a very simple and quick Helm chart upgrade, but in general config changes in the spec.valuesContent field may also be required. Please plan your upgrades ahead!