# Fedora k8s
Single node k8s on Fedora 41.
Generally I followed the official guide at https://docs.fedoraproject.org/en-US/quick-docs/using-kubernetes-kubeadm/ with a few exceptions. Please read the detailed setup guide at: https://github.com/hpaluch/k8s-wordpress/blob/master/README.md#setup-k8s-with-kubeadm-on-fedora-41
> [!WARNING]
> I recommend using a "short" hostname on the system - without the domain! Red Hat in the past used the FQDN as hostname, which sometimes causes problems - some components are smart enough to strip everything after the `.` in the hostname, but some are not. This typically breaks anything that has to match the hostname.
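For reference, a minimal sketch of checking and setting a short hostname with `hostnamectl` (the name `fed-k8s` is only an example - use your own, and do this before `kubeadm init`, because the node name is baked into the cluster at init time):

```bash
# show the current hostname - ideally it should not contain any dot
hostnamectl status | grep -i hostname
# set a short hostname (example name - adjust for your system)
sudo hostnamectl set-hostname fed-k8s
```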
Added the metrics server. Variant 1 - using the workaround from https://github.com/kubernetes-sigs/metrics-server/issues/1221:
```bash
# tested on k8s v1.32
sudo dnf install helm
helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/
helm install metrics-server metrics-server/metrics-server --set args="{--kubelet-insecure-tls}" -n kube-system
```
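To check that the chart deployed cleanly, you can wait for the rollout and then query the metrics API directly (a sketch - it assumes the deployment is named `metrics-server`, which matches the Helm release name used above):

```bash
# wait until the metrics-server deployment is fully rolled out
kubectl -n kube-system rollout status deployment/metrics-server
# the metrics API should answer once the first scrape has happened
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes | jq
```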
Variant 2 - troublesome: added the metrics server from https://devopscube.com/setup-kubernetes-cluster-kubeadm/ using:
```bash
curl -fLO https://raw.githubusercontent.com/techiescamp/kubeadm-scripts/main/manifests/metrics-server.yaml
kubectl apply -f metrics-server.yaml
```
Added a simple test nginx from the same page using `nginx.yaml`:
```yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  type: NodePort
  ports:
    - port: 80
      targetPort: 80
      nodePort: 32000
```
And applied it with `kubectl apply -f nginx.yaml`.
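A quick smoke test of the NodePort service (a sketch - it assumes `hostname -i` returns the node's reachable IP, as used elsewhere on this page; the port 32000 comes from the `nodePort` in the Service above):

```bash
# nginx should answer on NodePort 32000 of the node
curl -fsS http://$(hostname -i):32000/ | head
```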
Tip: see also my project https://github.com/hpaluch/k8s-wordpress for a WordPress deployment on a single-node k8s cluster.
## Problem: pod(s) stuck in ContainerCreating state
But there was a problem - both metrics-server and nginx pods were stuck in the `ContainerCreating` state:
```
$ kubectl get po -A
NAMESPACE      NAME                                          READY   STATUS              RESTARTS   AGE
default        nginx-deployment-7c79c4bf97-2h4bs             0/1     ContainerCreating   0          9m49s
default        nginx-deployment-7c79c4bf97-z8pdp             0/1     ContainerCreating   0          9m49s
kube-flannel   kube-flannel-ds-p97kl                         1/1     Running             0          12m
kube-system    coredns-76f75df574-6gzgp                      1/1     Running             0          21m
kube-system    coredns-76f75df574-kw699                      1/1     Running             0          21m
kube-system    etcd-fed-k8s.example.com                      1/1     Running             0          21m
kube-system    kube-apiserver-fed-k8s.example.com            1/1     Running             0          21m
kube-system    kube-controller-manager-fed-k8s.example.com   1/1     Running             0          21m
kube-system    kube-proxy-k6g8s                              1/1     Running             0          21m
kube-system    kube-scheduler-fed-k8s.example.com            1/1     Running             0          21m
kube-system    metrics-server-d4dc9c4f-97d4f                 0/1     ContainerCreating   0          10m
```
When I tried `kubectl describe`:
```
$ kubectl describe po nginx-deployment-7c79c4bf97-2h4bs
...
Warning FailedCreatePodSandBox 13m kubelet \
  Failed to create pod sandbox: rpc error: code = Unknown \
  desc = failed to create pod network sandbox k8s_nginx-deployment-7c79c4bf97-\
  2h4bs_default_925a62ee-bf5e-4a5c-bb67-7836e179a41e_0(d70b4c9cd349833b21ce1c\
  83bc74b1949a3fbe730fd31b7629982298a09bc9b8): error adding pod \
  default_nginx-deployment-7c79c4bf97-2h4bs to CNI network "cbr0": \
  plugin type="flannel" failed (add): failed to set bridge addr: "cni0" \
  already has an IP address different from 10.244.0.1/24
```
Googling revealed this page: https://devops.stackexchange.com/questions/14891/cni0-already-has-an-ip-address.
There is a tip to use `sudo ip link delete cni0 type bridge`, but I rather recommend rebooting the system
(deleting the bridge will cause other issues when creating containers)...
And that helped - hoping that it was only a transient bug.
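Before rebooting you can confirm the mismatch yourself - the bridge address should sit inside flannel's pod network (a diagnostic sketch, assuming flannel has already written its subnet file):

```bash
# address currently assigned to the cni0 bridge
ip -4 addr show cni0
# what flannel thinks this node's pod subnet is
cat /run/flannel/subnet.env
```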
## Metrics server forbidden
Now the metrics server is running but not working properly. Found:
```
$ kubectl logs -n kube-system metrics-server-d4dc9c4f-97d4f
I1225 14:21:01.208774 1 server.go:191] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"
E1225 14:21:08.871515 1 scraper.go:149] "Failed to scrape node" \
  err="request failed, status: \"403 Forbidden\"" node="fed-k8s.example.com"
```
## Auditing API Server (did not help)
It was "red hering" - because 403 Forbidden was returned by Kubelet instead of API Server(!). Please skip to next section for fix...
> [!CAUTION]
> You can easily crash the API server and make your whole k8s cluster totally unusable! Use at YOUR OWN RISK!
I thought that the metrics 403 error was reported by the API server, but it turned out not to be true. I followed https://kubernetes.io/docs/tasks/debug/debug-cluster/_print/#audit-policy
- created a new file `/etc/kubernetes/audit-policy.yaml` with contents:
```yaml
# Log all requests at the Metadata level.
# https://kubernetes.io/docs/tasks/debug/debug-cluster/_print/
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
- level: Metadata
```
- copied the manifest: `cp /etc/kubernetes/manifests/kube-apiserver.yaml /home/USER/`
- applied the following diff:
```diff
--- kube-apiserver.yaml.orig	2024-12-25 19:52:11.475000000 +0100
+++ kube-apiserver.yaml	2024-12-25 19:55:18.808000000 +0100
@@ -40,6 +40,8 @@
     - --service-cluster-ip-range=10.96.0.0/12
     - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
     - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
+    - --audit-policy-file=/etc/kubernetes/audit-policy.yaml
+    - --audit-log-path=/var/log/kubernetes/audit/audit.log
     image: registry.k8s.io/kube-apiserver:v1.29.12
     imagePullPolicy: IfNotPresent
     livenessProbe:
@@ -85,6 +87,12 @@
     - mountPath: /etc/kubernetes/pki
       name: k8s-certs
       readOnly: true
+    - mountPath: /etc/kubernetes/audit-policy.yaml
+      name: audit
+      readOnly: true
+    - mountPath: /var/log/kubernetes/audit/
+      name: audit-log
+      readOnly: false
   hostNetwork: true
   priority: 2000001000
   priorityClassName: system-node-critical
@@ -104,4 +112,12 @@
       path: /etc/kubernetes/pki
       type: DirectoryOrCreate
     name: k8s-certs
+  - name: audit
+    hostPath:
+      path: /etc/kubernetes/audit-policy.yaml
+      type: File
+  - name: audit-log
+    hostPath:
+      path: /var/log/kubernetes/audit/
+      type: DirectoryOrCreate
 status: {}
```
- now prepare the log directory:

```bash
sudo mkdir -p /var/log/kubernetes/audit/
sudo chmod a+rwxt /var/log/kubernetes/audit/
```
- WARNING! Now the extremely dangerous step!!!
- copy the modified `kube-apiserver.yaml` back to `/etc/kubernetes/manifests/`
- K8s will detect the manifest change and redeploy the API server pod - K8s will be completely unavailable for some time.
- if there was no error you should see a quickly growing file `/var/log/kubernetes/audit/audit.log`
- to make it easier to read I passed it through `jq`:

```bash
jq < /var/log/kubernetes/audit/audit.log | tee /home/USER/audit-log.json
```
- but I quickly found that the API server is not sending any 403:

```
$ fgrep code audit-log.json | sort -u
    "code": 200
    "code": 201
    "code": 404
    "code": 500
```
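If you want to narrow the audit log down to just the metrics-server traffic, a `jq` filter like this can help (a sketch - it assumes the default service account name `system:serviceaccount:kube-system:metrics-server`):

```bash
# show request URI and response code of requests made by the metrics-server service account
jq -c 'select(.user.username == "system:serviceaccount:kube-system:metrics-server")
       | {uri: .requestURI, code: .responseStatus.code}' /var/log/kubernetes/audit/audit.log
```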
So I decided to look into the source:
- in my `metrics-server.yaml` I found `image: registry.k8s.io/metrics-server/metrics-server:v0.7.1`
- so it should correspond to this tag: https://github.com/kubernetes-sigs/metrics-server/releases/tag/v0.7.1
- `Failed to scrape node` is in `pkg/scraper/scraper.go`; `collectNode()` contains `c.kubeletClient.GetMetrics(ctx, node)`
- looking for the magic `/metrics/resource` path I found something in `KNOWN_ISSUES`:
```bash
# returns prometheus metrics
kubectl get --raw /api/v1/nodes/`hostname`/proxy/metrics/resource
# returns json
kubectl get --raw /api/v1/nodes/`hostname`/proxy/stats/summary | jq
```
Finally got something:
```
$ curl -fsSkv https://`hostname -i`:10250/metrics/resource
< HTTP/2 403
< content-type: text/plain; charset=utf-8
< content-length: 80
< date: Wed, 25 Dec 2024 19:51:55 GMT
* The requested URL returned error: 403
```
And:
```
$ netstat -anp | fgrep 10250
tcp     0   0 10.244.0.1:54188       10.244.0.212:10250   TIME_WAIT   -
tcp6    0   0 :::10250               :::*                 LISTEN      824/kubelet
tcp6    0   0 192.168.122.92:10250   10.244.0.212:54492   ESTABLISHED 824/kubelet
```
## Fix for metrics 403 Forbidden
Kubelet is a standalone process that manages containers:
```
/usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --fail-swap-on=false --pod-manifest-path=/etc/kubernetes/manifests --cluster-dns=10.96.0.10 --cluster-domain=cluster.local --authorization-mode=Webhook --client-ca-file=/etc/kubernetes/pki/ca.crt --cgroup-driver=systemd --container-runtime-endpoint=unix:///var/run/crio/crio.sock --pod-infra-container-image=registry.k8s.io/pause:3.9
```
And that's exactly the difference between Ubuntu (where it works) and Fedora - the `--authorization-mode=Webhook` argument (Ubuntu simply omits this argument, falling back to the default `AlwaysAllow`)... On https://serverfault.com/a/1166711 there is a recommendation to replace `--authorization-mode=Webhook` with `--authorization-mode=AlwaysAllow`, but it is strongly discouraged for production (not my case :-)
The resulting change in `/etc/systemd/system/kubelet.service.d/override.conf`:
```ini
[Service]
Environment="KUBELET_AUTHZ_ARGS=--authorization-mode=AlwaysAllow --client-ca-file=/etc/kubernetes/pki/ca.crt"
```
And then `sudo systemctl daemon-reload && sudo systemctl restart kubelet` - or rather restart the whole system...
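After the restart you can repeat the earlier `curl` test - with `AlwaysAllow` the kubelet should now answer 200 instead of 403 (same caveat as above about `hostname -i` returning the node's reachable IP):

```bash
# should now return prometheus-style metrics instead of HTTP 403
curl -fsSk https://$(hostname -i):10250/metrics/resource | head
```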
Once metrics work you can use, for example, these commands (it may take several minutes before the first metrics data comes in):
```
$ kubectl top nodes
NAME       CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
fed-k8s2   137m         6%     1087Mi          27%

$ kubectl top pods -A
NAMESPACE      NAME                       CPU(cores)   MEMORY(bytes)
kube-flannel   kube-flannel-ds-j9ksd      8m           43Mi
kube-system    coredns-76f75df574-k4tbs   2m           12Mi
kube-system    coredns-76f75df574-zsdwh   2m           56Mi
kube-system    etcd-fed-k8s2              15m          65Mi
...
```
## Installing web dashboard
K8s includes a "Dashboard UI" - we will more or less follow https://kubernetes.io/docs/tasks/access-application-cluster/web-ui-dashboard/ to install it:
```bash
sudo dnf install helm
helm repo add kubernetes-dashboard https://kubernetes.github.io/dashboard/
helm upgrade --install kubernetes-dashboard \
  kubernetes-dashboard/kubernetes-dashboard \
  --create-namespace --namespace kubernetes-dashboard
# poll the command below until all pods are "Running"
kubectl get pod -n kubernetes-dashboard
```
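Instead of polling by hand, you can let kubectl block until the dashboard pods are ready (a sketch - adjust the timeout to taste; if the pods have not been created yet the command may need a retry):

```bash
# wait up to 5 minutes for all dashboard pods to become Ready
kubectl -n kubernetes-dashboard wait --for=condition=Ready pod --all --timeout=300s
```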
To access the Dashboard you have to create a temporary proxy using:
```bash
kubectl -n kubernetes-dashboard port-forward svc/kubernetes-dashboard-kong-proxy 8443:443
```
However it will forward only from `localhost`. You have two choices to connect:
- create an SSH tunnel from your Workstation to the K8s server - add to the appropriate entry in `~/.ssh/config` the line `LocalForward 127.0.0.1:8443 127.0.0.1:8443`, then log in to the K8s node and run the command above (a one-off `ssh` alternative is sketched after this list)
- or install on your Workstation a `kubectl` that has proxy capability
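For the SSH tunnel option, a one-off command works just as well as the `~/.ssh/config` entry (a sketch - `user@k8s-server` is a placeholder for your own login and host):

```bash
# forward local port 8443 to the port-forward running on the k8s node
ssh -L 127.0.0.1:8443:127.0.0.1:8443 user@k8s-server
```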
To test the latter option I did:
- query the `kubectl` version on your k8s server:

```
# run on k8s server:
$ kubectl version
Client Version: v1.29.11
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.12
```
- now install `kubectl` on your Workstation following https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/#install-kubectl-binary-with-curl-on-linux
- but remember to replace `v1.29.0` in the URL below with the version matching your k8s deployment:

```
# run on Workstation
mkdir -p ~/bin
curl -fL -o ~/bin/kubectl https://dl.k8s.io/release/v1.29.0/bin/linux/amd64/kubectl
~/bin/kubectl version
Client Version: v1.29.0
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
The connection to the server localhost:8080 was refused - did you specify the right host or port?
```
- to be able to access our K8s server we need to copy `~/.kube/config` from the k8s server to the Workstation using:

```
# run on Workstation
mkdir -p ~/.kube
scp IP_OF_YOUR_K8S_SERVER:.kube/config ~/.kube/
```
- when you run `kubectl version` again on your Workstation it should also report the server version, for example:

```
# run on Workstation
$ kubectl version
Client Version: v1.29.0
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.12
```
- now run the tunnel command on your Workstation:

```
# run on Workstation
kubectl -n kubernetes-dashboard port-forward svc/kubernetes-dashboard-kong-proxy 8443:443
```
- and open a browser on your Workstation with the URL https://127.0.0.1:8443/
- you should get advice on how to generate a token:

```
# run on any node: Workstation or k8s server:
kubectl -n NAMESPACE create token SERVICE_ACCOUNT
```
- we know the namespace but not the `SERVICE_ACCOUNT`
- we have to follow https://github.com/kubernetes/dashboard/blob/master/docs/user/access-control/creating-sample-user.md to create an `admin-user` with the `cluster-admin` role:
Create `ui-admin-user.yaml` with contents:
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
```
And apply it with: `kubectl apply -f ui-admin-user.yaml`
Now verify that a ClusterRole named `cluster-admin` exists:
```
$ kubectl get clusterrole cluster-admin
NAME            CREATED AT
cluster-admin   2024-12-26T07:30:51Z
```
Create the file `ui-admin-binding.yaml` with contents:
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kubernetes-dashboard
```
And apply it with `kubectl apply -f ui-admin-binding.yaml`
Finally we can generate the token:

```bash
kubectl -n kubernetes-dashboard create token admin-user
```
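By default the token is short-lived; if you get logged out quickly you can request a longer validity (a sketch - the `--duration` flag is supported by recent kubectl versions, and the API server may cap the requested duration):

```bash
# issue a token valid for 24 hours instead of the default
kubectl -n kubernetes-dashboard create token admin-user --duration=24h
```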
In your browser paste this very looong token into the textbox labelled `Bearer token *` (it is easy to overlook - it does not look like a textbox, but like a horizontal line!).
Now you will be greeted by the Web UI - but please note that you have to select the proper namespace in the top-left corner, or `All namespaces`, to list objects in all namespaces.
Now you should see a view similar to the one at the top of this page.
And that's it!
## Monitoring with Prometheus + Grafana
I just followed the guide at https://medium.com/@muppedaanvesh/a-hands-on-guide-to-kubernetes-monitoring-using-prometheus-grafana-%EF%B8%8F-b0e00b1ae039