# Why misalignment should not happen

### Objective:

To show why misalignment should not happen in MinIO Pods

### Diagram:

<img width="484" alt="Screenshot 2023-04-14 at 9 06 17 AM" src="https://user-images.githubusercontent.com/6667358/232051975-787efad7-4cbc-4eb2-8adc-9aacf54ca4fb.png">

### Reasoning:

  • Each PVC carries the node specification (the volume.kubernetes.io/selected-node annotation), so a restarted pod always comes back to the node that holds its volumes; a quick check is sketched below.
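A quick way to see this on a live cluster is to list every PVC of the tenant together with the node recorded in its selected-node annotation. A minimal sketch, assuming the namespace (tenant-ns) and the v1.min.io/tenant=minio1 label shown in the PVC metadata further down:

```
# Print each PVC of the tenant and the node it was provisioned for.
$ kubectl get pvc -n tenant-ns -l v1.min.io/tenant=minio1 \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.volume\.kubernetes\.io/selected-node}{"\n"}{end}'
```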

### Observation:

  1. Every time the pods are restarted in this test cluster, the placement stays the same (a way to re-verify this after a restart is sketched after the output below), meaning:
  • minio1-pool-0-0 always goes to kind-worker3 after each restart
  • minio1-pool-0-1 always goes to kind-worker4 after each restart
  • minio1-pool-0-2 always goes to kind-worker after each restart
  • minio1-pool-0-3 always goes to kind-worker2 after each restart
$ k get pods -n tenant-ns -o wide
NAME              READY   STATUS    RESTARTS   AGE   IP            NODE           NOMINATED NODE   READINESS GATES
minio1-pool-0-0   2/2     Running   0          14m   10.244.1.12   kind-worker3   <none>           <none>
minio1-pool-0-1   2/2     Running   0          14m   10.244.3.12   kind-worker4   <none>           <none>
minio1-pool-0-2   2/2     Running   0          14m   10.244.2.11   kind-worker    <none>           <none>
minio1-pool-0-3   2/2     Running   0          14m   10.244.4.11   kind-worker2   <none>           <none>
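To reproduce this observation, delete one of the pods and watch it come back; the recreated pod should land on the same node as before. A minimal sketch using the names from the output above:

```
# Delete one pod and confirm the replacement is scheduled onto the same node.
$ kubectl delete pod minio1-pool-0-0 -n tenant-ns
$ kubectl get pod minio1-pool-0-0 -n tenant-ns -o wide --watch
```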
  2. This is because each mountPath in the MinIO Pod is attached to the same PersistentVolumeClaim, and each PersistentVolumeClaim is annotated with its selected node and bound to the same PV regardless of pod restarts:
  • mountPath in the Pod:
      volumeMounts:
        - name: cfg-vol
          mountPath: /tmp/minio/
        - name: data0
          mountPath: /export0
        - name: data1
          mountPath: /export1
        - name: data2
          mountPath: /export2
        - name: data3
          mountPath: /export3
  • persistentVolumeClaim in the Pod:
spec:
  volumes:
    - name: data0
      persistentVolumeClaim:
        claimName: data0-minio1-pool-0-0
    - name: data1
      persistentVolumeClaim:
        claimName: data1-minio1-pool-0-0
    - name: data2
      persistentVolumeClaim:
        claimName: data2-minio1-pool-0-0
    - name: data3
      persistentVolumeClaim:
        claimName: data3-minio1-pool-0-0
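These claim names are stable because each one combines the volume name (data0..data3) with the pod's own stable name, so minio1-pool-0-0 always asks for data0-minio1-pool-0-0 and never for another pod's claims. Assuming the same tenant as above, the four claims of a single pod can be listed like this (the grep is only illustrative):

```
# All four claims that belong to pod minio1-pool-0-0.
$ kubectl get pvc -n tenant-ns | grep minio1-pool-0-0
```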
  • selected-node annotation in the PVC:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data0-minio1-pool-0-0
  namespace: tenant-ns
  uid: 04a6d9f6-d7fd-4bee-b2a9-4ccd8cee8948
  resourceVersion: '17243'
  creationTimestamp: '2023-04-13T21:28:50Z'
  labels:
    v1.min.io/console: minio1-console
    v1.min.io/pool: pool-0
    v1.min.io/tenant: minio1
  annotations:
    pv.kubernetes.io/bind-completed: 'yes'
    pv.kubernetes.io/bound-by-controller: 'yes'
    volume.beta.kubernetes.io/storage-provisioner: rancher.io/local-path
    volume.kubernetes.io/selected-node: kind-worker3
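That annotation can be read directly from the claim; for the PVC above it returns the node that was selected when the claim was first provisioned, and it does not change on pod restarts:

```
# Node selected for this claim.
$ kubectl get pvc data0-minio1-pool-0-0 -n tenant-ns \
    -o jsonpath='{.metadata.annotations.volume\.kubernetes\.io/selected-node}'
kind-worker3
```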
  • Selected PV in PVC via volumeName:
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  volumeName: pvc-04a6d9f6-d7fd-4bee-b2a9-4ccd8cee8948
  storageClassName: standard
  volumeMode: Filesystem
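The claim-to-PV binding can be confirmed the same way; the claim's volumeName points at the PV whose node affinity is shown next:

```
# PV this claim is bound to.
$ kubectl get pvc data0-minio1-pool-0-0 -n tenant-ns -o jsonpath='{.spec.volumeName}'
pvc-04a6d9f6-d7fd-4bee-b2a9-4ccd8cee8948
```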
  • Selected Node in PV via nodeSelectorTerms:
  storageClassName: standard
  volumeMode: Filesystem
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - kind-worker3
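The PV itself can be queried for that node affinity, which is what ultimately keeps the pod (and its data) on kind-worker3:

```
# Node affinity recorded in the PV.
$ kubectl get pv pvc-04a6d9f6-d7fd-4bee-b2a9-4ccd8cee8948 \
    -o jsonpath='{.spec.nodeAffinity.required.nodeSelectorTerms[0].matchExpressions[0].values[0]}'
kind-worker3
```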

### Possible Errors:

  • If something similar to the error below happens to you, please check the points above and make sure the scheduler is attaching the same PVCs to each pod after every restart (a quick check is sketched after the error), and help us root-cause the issue, since as far as we know it is not common:
Error: Following error has been printed 3 times.. Detected unexpected drive ordering refusing to use the drive: expecting http://omer-tenant-pool-0-2.omer-tenant-hl.omer-tenant.svc.cluster.local:9000/export1, found http://omer-tenant-pool-0-0.omer-tenant-hl.omer-tenant.svc.cluster.local:9000/export1, refusing to use the drive (*fmt.wrapError)
4/13/2023 3:31:48 PM 
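If you hit this, the sketch below compares where each pod is running with where its data0 claim was provisioned; it assumes the example tenant from this page (namespace tenant-ns, pods minio1-pool-0-0 through minio1-pool-0-3), so adjust the names for your own tenant. After a healthy restart the two nodes should match for every pod:

```
# For each pod, print the node it runs on and the node its data0 PVC is pinned to.
for i in 0 1 2 3; do
  pod_node=$(kubectl get pod "minio1-pool-0-$i" -n tenant-ns -o jsonpath='{.spec.nodeName}')
  pvc_node=$(kubectl get pvc "data0-minio1-pool-0-$i" -n tenant-ns \
    -o jsonpath='{.metadata.annotations.volume\.kubernetes\.io/selected-node}')
  echo "minio1-pool-0-$i: pod on $pod_node, data0 PVC pinned to $pvc_node"
done
```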
