k8s_storage - henk52/knowledgesharing GitHub Wiki
Kubernetes storage
Introduction
Purpose
References
Longhorn needs the iSCSI admin tools and an NFS client library on every node (see "Pre-requirements for installation").
Vocabulary
Longhorn
open issues for Longhorn
- how is data replicated
- when is data replicated
- what happens when a pod is started on a node without a replica
- how to add prometheus scraping of longhorn metrics
- which engine version are we using? v1 or v2?
- how does a container/pod talk to a volume in general?
Longhorn overview
- Engine
  - The Longhorn Engine always runs on the same node as the Pod that uses the Longhorn volume.
  - The engine synchronously replicates the volume across multiple replicas stored on multiple nodes.
  - The v2 engine is based on the Storage Performance Development Kit (SPDK).
  - Each engine manages one volume.
- The Longhorn CSI driver takes the block device, formats it, and mounts it on the node. Then the kubelet bind-mounts the device inside a Kubernetes Pod. This allows the Pod to access the Longhorn volume.
- A Longhorn volume itself cannot shrink in size if you’ve removed content from your volume.
  - For example, if you create a volume of 20 GB, use 10 GB, then remove 9 GB of content, the actual size on disk is still 10 GB instead of 1 GB.
- It seems the engine is paused while a new replica is created; see "How New Replicas are Added" in the Longhorn documentation.
Installing longhorn
Pre-requirements for installation
See the Longhorn documentation: Requirements, Quick Installation.
On all nodes:
- open-iscsi
- iscsid daemon
- NFSv4 client
- disk filesystem: XFS or ext4
- bash, curl, findmnt, grep, awk, blkid, lsblk
- Mount propagation must be enabled.
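The command-line tools in the list above can be checked on a node before running the playbook. The following is a minimal sketch, not part of Longhorn; the function name `missing_tools` is hypothetical, and it only checks that each command is on PATH (it does not cover the iSCSI daemon, NFS support, or mount propagation).

```shell
# Print every required tool that is not on PATH, one per line.
# Returns 0 when all tools are present, 1 otherwise.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}

# Tool names taken from the requirements list above.
missing_tools bash curl findmnt grep awk blkid lsblk \
    || echo "some required tools are missing, see the list printed above" >&2
```

Running this on each node (for example through an Ansible ad-hoc command) gives a quick answer before the longhornctl preflight check is available.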
ansible-playbook playbooks/longhorn_requirements.yml -u root -b -v -i kubespray/inventory/test/hosts.yml --private-key=~/.ssh/test_ops
---
- name: Configure longhorn requirements
  hosts: all
  become: yes
  tasks:
    - name: Install iscsi-initiator-utils without scripts
      ansible.builtin.shell:
        cmd: dnf --setopt=tsflags=noscripts install -y iscsi-initiator-utils
      args:
        creates: /usr/sbin/iscsiadm
    - name: Generate iSCSI initiator name
      ansible.builtin.command:
        cmd: /sbin/iscsi-iname
      register: iscsi_initiator_name
      changed_when: false
    - name: Configure initiator name
      ansible.builtin.copy:
        content: "InitiatorName={{ iscsi_initiator_name.stdout }}\n"
        dest: /etc/iscsi/initiatorname.iscsi
        owner: root
        group: root
        mode: '0600'
    - name: Enable and start iscsid service
      ansible.builtin.systemd:
        name: iscsid
        enabled: yes
        state: started
    - name: Load iscsi_tcp kernel module
      community.general.modprobe:
        name: iscsi_tcp
        state: present
    - name: Install nfs-utils
      ansible.builtin.dnf:
        name: nfs-utils
        state: present
    - name: Load nfs kernel module
      community.general.modprobe:
        name: nfs
        state: present
    - name: Install Cryptsetup
      ansible.builtin.dnf:
        name: cryptsetup
        state: present
    - name: Install Device Mapper Userspace Tool
      ansible.builtin.dnf:
        name: device-mapper
        state: present
    - name: Load dm_crypt kernel module
      community.general.modprobe:
        name: dm_crypt
        state: present
- curl -sSfL -o longhornctl https://github.com/longhorn/cli/releases/download/v1.11.0/longhornctl-linux-amd64
- chmod +x longhornctl
- export KUBECONFIG=$HOME/.kube/config
- kubectl create namespace longhorn-system
- ~/longhornctl check preflight
~/longhornctl check preflight
INFO[2026-01-30T20:20:45+08:00] Initializing preflight checker
INFO[2026-01-30T20:20:45+08:00] Cleaning up preflight checker
INFO[2026-01-30T20:20:45+08:00] Running preflight checker
WARN[2026-01-30T20:21:19+08:00] Failed to get pod container log container=output-longhornctl error="Get \"https://10.26.101.186:10250/containerLogs/longhorn-system/longhorn-preflight-checker-2wjrv/output-longhornctl\": dial tcp 10.26.101.186:10250: i/o timeout" kind=DaemonSet name=longhorn-preflight-checker namespace=longhorn-system pod=longhorn-preflight-checker-2wjrv
WARN[2026-01-30T20:21:49+08:00] Failed to get pod container log container=output-longhornctl error="Get \"https://10.26.101.71:10250/containerLogs/longhorn-system/longhorn-preflight-checker-6f7xl/output-longhornctl\": dial tcp 10.26.101.71:10250: i/o timeout" kind=DaemonSet name=longhorn-preflight-checker namespace=longhorn-system pod=longhorn-preflight-checker-6f7xl
WARN[2026-01-30T20:22:19+08:00] Failed to get pod container log container=output-longhornctl error="Get \"https://10.26.101.174:10250/containerLogs/longhorn-system/longhorn-preflight-checker-d66tt/output-longhornctl\": dial tcp 10.26.101.174:10250: i/o timeout" kind=DaemonSet name=longhorn-preflight-checker namespace=longhorn-system pod=longhorn-preflight-checker-d66tt
INFO[2026-01-30T20:22:20+08:00] Retrieved preflight checker result:
node1:
  info:
  - '[KubeDNS] Kube DNS "coredns" is set with 2 replicas and 2 ready replicas'
  - '[IscsidService] Service iscsid is running'
  - '[MultipathService] multipathd.service is not found (exit code: 4)'
  - '[MultipathService] multipathd.socket is not found (exit code: 4)'
  - '[NFSv4] NFS4 is supported'
  - '[Packages] nfs-utils is installed'
  - '[Packages] iscsi-initiator-utils is installed'
  - '[Packages] cryptsetup is installed'
  - '[Packages] device-mapper is installed'
  - '[KernelModules] nfs is loaded'
  - '[KernelModules] iscsi_tcp is loaded'
  - '[KernelModules] dm_crypt is loaded'
INFO[2026-01-30T20:22:20+08:00] Cleaning up preflight checker
INFO[2026-01-30T20:22:20+08:00] Completed preflight checker
- TODO: figure out why the initiator name in /etc/iscsi/initiatorname.iscsi changes on every run, and what impact that might have on a reboot.
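One likely cause, assuming the playbook above is what writes the file: the "Generate iSCSI initiator name" task calls /sbin/iscsi-iname on every run, iscsi-iname prints a new random name each time, and the copy task then overwrites the file with it. A minimal sketch of the "generate only if missing" idea follows; the function name `ensure_initiator_name` and its arguments are hypothetical, not part of open-iscsi or Longhorn.

```shell
# Write an initiator name only when the file is missing or empty,
# so repeated runs keep the name that is already configured.
#   $1      target file, e.g. /etc/iscsi/initiatorname.iscsi
#   $2...   command that prints a new name, e.g. /sbin/iscsi-iname
ensure_initiator_name() {
    file=$1
    shift
    if [ -s "$file" ]; then
        return 0    # a name already exists, leave it untouched
    fi
    name=$("$@") || return 1
    printf 'InitiatorName=%s\n' "$name" > "$file"
    chmod 600 "$file"
}
```

On a node this would be called as root with `ensure_initiator_name /etc/iscsi/initiatorname.iscsi /sbin/iscsi-iname`. Inside the playbook, setting `force: false` on the copy task might give a similar effect, since copy then only writes the file when the destination does not exist; this has not been tested here.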
Longhorn configurations
- Settings
  - Allow Collecting Longhorn Usage Metrics
Installing longhorn via helm
- helm repo add longhorn https://charts.longhorn.io
- helm repo update
- export KUBECONFIG=~/.kube/longhorn_test
- helm install longhorn longhorn/longhorn --namespace longhorn-system --create-namespace --version 1.11.0
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /home/heko/.kube/longhorn_test
E0206 17:02:27.717677 2319823 memcache.go:287] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
...
E0206 17:02:43.740110 2319823 memcache.go:287] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
E0206 17:02:43.811645 2319823 memcache.go:121] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request
NAME: longhorn
LAST DEPLOYED: Fri Feb 6 17:02:27 2026
NAMESPACE: longhorn-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Longhorn is now installed on the cluster!
Please wait a few minutes for other Longhorn components such as CSI deployments, Engine Images, and Instance Managers to be initialized.
Visit our documentation at https://longhorn.io/docs/
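The first line of the helm output above warns that the kubeconfig file is group-readable. Removing group and other permissions from the file should silence that warning. A minimal sketch; the function name `restrict_kubeconfig` is hypothetical, and the path in the usage comment is the one exported as KUBECONFIG earlier on this page.

```shell
# Remove all group and other permissions from a kubeconfig file,
# leaving the owner's permissions as they are.
restrict_kubeconfig() {
    chmod go-rwx "$1"
}

# Usage (path as exported in KUBECONFIG above):
# restrict_kubeconfig ~/.kube/longhorn_test
```

The repeated metrics.k8s.io errors in the same output are a separate matter and are not affected by this change.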
Troubleshooting longhorn
Multi-Attach error for volume
This is shown for about a minute, then it comes back.
Multi-Attach error for volume "pvc-ee166920-785f-4312-b1c8-1a9965457c8a" Volume is already exclusively attached to one node and can't be attached to another. (40s)
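While the error is showing, it can help to see which node the volume is still attached to. The sketch below filters the output of `kubectl get volumeattachments --no-headers` for one PV. It assumes the default column order (NAME ATTACHER PV NODE ATTACHED AGE); the function name `attachments_for_pv` is hypothetical.

```shell
# Print "NODE ATTACHED" for every VolumeAttachment that references the given PV.
# Reads `kubectl get volumeattachments --no-headers` output on stdin,
# assuming the default columns: NAME ATTACHER PV NODE ATTACHED AGE.
attachments_for_pv() {
    awk -v pv="$1" '$3 == pv { print $4, $5 }'
}

# Usage:
# kubectl get volumeattachments --no-headers \
#     | attachments_for_pv pvc-ee166920-785f-4312-b1c8-1a9965457c8a
```

If the listed node is not the node the new Pod was scheduled on, the old attachment probably has not been released yet, which would match the error disappearing after about a minute.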