Kubernetes data storage - ghdrako/doc_snipets GitHub Wiki

Pods and Volumes

While a pod can contain multiple containers, the best practice is for a pod to contain a single application container, along with optional additional helper containers, as shown in the figure. These helper containers might include init containers that run prior to the main application container in order to perform configuration tasks, or sidecar containers that run alongside the main application container to provide helper services such as observability or management.

In Kubernetes, the term volume is used to represent access to storage within a pod. By using a volume, the container has the ability to persist data that will outlive the container. A volume may be accessed by multiple containers in a pod. Each container has its own volumeMount within the pod that specifies the directory to which it should be mounted, allowing the mount point to differ between containers.

A Volume in Kubernetes represents a directory with data that is accessible across multiple containers in a Pod. Data written to a container's own filesystem is lost when the container crashes or restarts, but data in a volume is preserved, so the restarted container can pick it up in the state it was in before the crash. The volume outlives the containers in a Pod and can be consumed by any number of containers within that Pod.

To use a volume, declare it in the Pod manifest under the spec.volumes field, which holds an array of volume definitions, and then reference it by name from each container's volumeMounts.

Volume Type Categories

  • Ephemeral: These volumes share the lifetime of the Pod. The data survives container restarts but is removed when the Pod is removed. They are fast but not durable, so they should be used for temporary data or for applications that do not require data persistence. The volume types in this category include emptyDir, configMap, and secret.
  • Durable: These volumes outlive the Pod. Their lifetime is independent of the Pod lifecycle, so data is preserved across both container and Pod restarts, including when the Pod crashes or is deleted. The volume types in this category include hostPath, persistentVolumeClaim, awsElasticBlockStore, azureDisk, and gcePersistentDisk (the last three are legacy in-tree cloud volume types that newer Kubernetes releases replace with CSI drivers).

There are multiple cases where you might want to share data between multiple containers in a pod:

  • An init container creates a custom configuration file for the particular environment that the application container mounts in order to obtain configuration values.
  • The application container writes logs, and a sidecar container reads those logs to identify alert conditions that are reported to an external monitoring tool.

You’ll likely want to avoid situations in which multiple containers write to the same volume, because you have to ensure that the writers don’t conflict; Kubernetes does not do that for you.

The following manifest (nginx-pod.yaml) declares a volume and mounts it into a container:

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: my-app
    image: nginx
    volumeMounts:
    - name: web-data     # volume usage
      mountPath: /app/config
  volumes:
  - name: web-data     # volume definition

For the configuration to be valid, every volumeMounts entry must reference, by name, a volume declared in spec.volumes. A volume that no container mounts is accepted but serves no purpose.

kubectl apply -f nginx-pod.yaml
kubectl get pod my-pod -o yaml | grep -A 5 " volumes:"
volumes:
  - emptyDir: {}             # default volume type, used when none is specified
    name: web-data
  - name: default-token-2fp89
    secret:
      defaultMode: 420
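
The log-sharing case described earlier (one container writes, a sidecar reads) can be sketched as follows. This is an illustration under assumptions, not a manifest from the original page: the Pod name, the busybox image, the shell commands, and the paths are all hypothetical stand-ins. It shows one volume mounted at a different path in each container, with the reader mounting it read-only so that there is a single writer.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-log-sidecar     # hypothetical name
spec:
  containers:
  - name: my-app
    image: busybox               # stand-in for a real application image
    # Appends a timestamped line to the shared log file every five seconds.
    command: ["sh", "-c", "while true; do date >> /var/log/app/app.log; sleep 5; done"]
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/app    # the writer's mount point
  - name: log-reader
    image: busybox
    # Follows the log file, retrying until the writer has created it.
    command: ["sh", "-c", "tail -F /logs/app.log"]
    volumeMounts:
    - name: app-logs
      mountPath: /logs           # same volume, different mount point
      readOnly: true             # only my-app writes to the volume
  volumes:
  - name: app-logs
    emptyDir: {}
```

Mounting the volume read-only in every container except one is a simple way to rule out conflicting writers.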

EmptyDir Volume Type

  • https://unofficial-kubernetes.readthedocs.io/en/latest/concepts/storage/volumes/#emptydir

An emptyDir volume is first created, initially empty, when a Pod is assigned to a Node. It exists for as long as the Pod runs on that Node, and its contents survive container crashes and restarts. When the Pod is removed from the Node for any reason, the data in the emptyDir volume is deleted permanently. This type of volume is suitable for temporary data storage.

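emptyDir is an ephemeral volume type, loosely comparable to a Docker tmpfs mount but scoped to the pod. By default it is backed by the node's storage; setting emptyDir.medium to Memory makes it a tmpfs held in RAM.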

Ephemeral volumes can be useful for data infrastructure or other applications that want to create a cache for fast access. Although they do not persist beyond the lifespan of a pod, they can still exhibit some of the typical properties of other volumes for longer-term persistence, such as the ability to snapshot.

apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  containers:
  - name: my-app
    image: nginx
    ports:
    - containerPort: 8080
    imagePullPolicy: Always
    volumeMounts:
    - name: my-volume
      mountPath: /app
  volumes:
  - name: my-volume
    emptyDir: {} 
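
For the cache use case mentioned above, a memory-backed variant of the same volume might look like the sketch below. The Pod name, mount path, and size limit are assumptions chosen for illustration; medium and sizeLimit are the emptyDir fields that select tmpfs and cap its size.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cache-demo               # hypothetical name
spec:
  containers:
  - name: my-app
    image: nginx
    volumeMounts:
    - name: cache
      mountPath: /cache          # assumed cache directory
  volumes:
  - name: cache
    emptyDir:
      medium: Memory             # back the volume with tmpfs (RAM) instead of node disk
      sizeLimit: 128Mi           # assumed cap; tune to the workload
```

Files written to a memory-backed emptyDir count against the container's memory limit, so the size cap is worth setting deliberately.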

HostPath Volume Type

  • https://kubernetes.io/docs/concepts/storage/volumes/#hostpath

hostPath is a durable volume type that mounts a file or directory from the host Node’s filesystem into a Pod. Files in the volume remain intact even if the Pod crashes, is terminated, or is deleted. Because the data lives on a single Node, the Pod must be scheduled onto the Node that holds the directory in order to see it.

The volume definition on its own:

volumes:
- name: hostpath-volume   # the name of the volume
  hostPath:
    path: /data           # directory location on the host

A complete Pod manifest that uses a hostPath volume:

apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  containers:
  - name: my-app
    image: nginx
    ports:
    - containerPort: 8080
    volumeMounts:
    - name: my-volume
      mountPath: /app
  volumes:
  - name: my-volume
    hostPath:
      path: /mnt/vpath  
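
Since a hostPath volume only makes sense on the Node that holds the directory, one way to keep the Pod and its data together is to pin the Pod to that Node. The sketch below assumes a node whose hostname label is worker-1 (a hypothetical value) and adds the optional hostPath type field so the directory is created if it is missing.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: myapp-pinned                     # hypothetical name
spec:
  nodeSelector:
    kubernetes.io/hostname: worker-1     # hypothetical node name
  containers:
  - name: my-app
    image: nginx
    volumeMounts:
    - name: my-volume
      mountPath: /app
  volumes:
  - name: my-volume
    hostPath:
      path: /mnt/vpath
      type: DirectoryOrCreate            # create the directory if it does not exist
```

Without the node selector, a rescheduled Pod could land on a different Node and find an empty directory.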

Configuration volumes

ConfigMap Volumes

A ConfigMap is a Kubernetes resource that is used to store configuration values external to an application as a set of name-value pairs. The resulting configuration data can be mounted into the application as a volume, where it will appear as a directory. Each configuration value is represented as a file, where the filename is the key, and the contents of the file contain the value.
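
As a rough illustration of the key-to-file mapping described above, the following sketch defines a ConfigMap and mounts it into a Pod. The names app-config and config-demo, the keys, and the mount path are all hypothetical.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config               # hypothetical name
data:
  log.level: "info"              # appears as the file /etc/app-config/log.level
  greeting: "hello"              # appears as the file /etc/app-config/greeting
---
apiVersion: v1
kind: Pod
metadata:
  name: config-demo              # hypothetical name
spec:
  containers:
  - name: my-app
    image: nginx
    volumeMounts:
    - name: config
      mountPath: /etc/app-config # directory that holds one file per key
      readOnly: true
  volumes:
  - name: config
    configMap:
      name: app-config           # must match the ConfigMap's metadata.name
```

Inside the container, reading /etc/app-config/log.level would return the value info.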

Secret Volumes

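A Secret is similar to a ConfigMap, but it is intended for sensitive data that requires protection. Configuring and accessing Secrets is similar to using a ConfigMap. Secret values are stored base64-encoded in the API object, and can additionally be encrypted at rest if the cluster is configured for it. When a Secret is mounted as a volume, Kubernetes presents the decoded values to the container as files held in memory-backed storage (tmpfs) rather than written to the node's disk.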
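
A Secret volume can be sketched in much the same way as a ConfigMap volume. The Secret name, keys, placeholder values, and mount path below are hypothetical; stringData is used so the values can be written in plain text and encoded by the API server.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials           # hypothetical name
type: Opaque
stringData:                      # plain-text input, stored base64-encoded under data
  username: app_user             # placeholder value
  password: change-me            # placeholder value
---
apiVersion: v1
kind: Pod
metadata:
  name: secret-demo              # hypothetical name
spec:
  containers:
  - name: my-app
    image: nginx
    volumeMounts:
    - name: credentials
      mountPath: /etc/credentials # one file per key: username, password
      readOnly: true
  volumes:
  - name: credentials
    secret:
      secretName: db-credentials # must match the Secret's metadata.name
      defaultMode: 0400          # octal file mode; shown as 256 in -o yaml output
```

The application reads /etc/credentials/password as an ordinary file and never sees the base64 form.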

Downward API Volumes

The Kubernetes Downward API exposes metadata about pods and containers, either as environment variables or as volumes. This is the same metadata that is used by kubectl and other clients.

The available pod metadata includes the pod’s name, ID, namespace, labels, and annotations. The containerized application might wish to use the pod information for logging and metrics reporting, or to determine database or table names.

The available container metadata includes the requested and maximum amounts of resources such as CPU, memory, and ephemeral storage. The containerized application might wish to use this information in order to throttle its own resource usage.
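
The volume form of the Downward API might be used as in the sketch below, which exposes pod labels, the namespace, and the container's CPU limit as files. The Pod name, label, resource values, and file paths are assumptions for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: downward-demo            # hypothetical name
  labels:
    app: demo                    # hypothetical label
spec:
  containers:
  - name: my-app
    image: busybox
    # Prints the labels file once, then idles so the files can be inspected.
    command: ["sh", "-c", "cat /etc/podinfo/labels; sleep 3600"]
    resources:
      requests:
        cpu: 100m
      limits:
        cpu: 250m
        memory: 64Mi
    volumeMounts:
    - name: podinfo
      mountPath: /etc/podinfo
  volumes:
  - name: podinfo
    downwardAPI:
      items:
      - path: labels             # file containing the pod's labels
        fieldRef:
          fieldPath: metadata.labels
      - path: namespace          # file containing the pod's namespace
        fieldRef:
          fieldPath: metadata.namespace
      - path: cpu_limit          # file containing the CPU limit of my-app
        resourceFieldRef:
          containerName: my-app
          resource: limits.cpu
          divisor: 1m            # report the value in millicores
```

An application could read /etc/podinfo/cpu_limit at startup to size its worker pool, which is one way to apply the self-throttling idea described above.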

Kubernetes PersistentVolumes (PVs) and PersistentVolumeClaims (PVCs)

A PersistentVolume (PV) is a piece of storage in the cluster, and a PersistentVolumeClaim (PVC) is a user's request for storage that is satisfied by a PV. Kubernetes searches for a PV whose capacity and properties (such as access modes and storage class) match what the PVC requests, and binds the two. Each PVC binds to a single PV.
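
A claim and a Pod that consumes it can be sketched as follows. The claim name, Pod name, requested size, and access mode are assumptions; no storageClassName is set, so the sketch relies on a matching PV or a default StorageClass being available in the cluster.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-claim                 # hypothetical name
spec:
  accessModes:
  - ReadWriteOnce                # mountable read-write by a single node
  resources:
    requests:
      storage: 1Gi               # assumed capacity request
---
apiVersion: v1
kind: Pod
metadata:
  name: pvc-demo                 # hypothetical name
spec:
  containers:
  - name: my-app
    image: nginx
    volumeMounts:
    - name: my-volume
      mountPath: /app
  volumes:
  - name: my-volume
    persistentVolumeClaim:
      claimName: my-claim        # must match the PVC's metadata.name
```

The Pod refers only to the claim, never to the PV itself, which keeps the manifest independent of how the storage is actually provisioned.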