Kubernetes taints and tolerations
Some workloads place special constraints on the Kubernetes cluster; for example, some pods may require special hardware, colocation with other specific pods, or isolation from others. There are many options for placing those application containers into separate node groups, one of which is the use of taints and tolerations.
Taints and tolerations are a mechanism that allows you to ensure that pods are not placed on inappropriate nodes. Taints are added to nodes, while tolerations are defined in the pod specification. When you taint a node, it will repel all the pods except those that have a toleration for that taint. A node can have one or many taints associated with it.
For example, most Kubernetes distributions automatically taint the master (control plane) nodes so that only the pods that manage the control plane are scheduled onto them, and not the data plane pods deployed by users. This ensures that the master nodes are dedicated to running control plane pods.
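For instance, kubeadm-based clusters typically taint control plane nodes with node-role.kubernetes.io/control-plane:NoSchedule (older releases used node-role.kubernetes.io/master:NoSchedule); you can check with:
kubectl describe node <control plane node> | grep Taints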
A **taint** can produce three possible effects:
- NoSchedule: the Kubernetes scheduler will only place pods that have a toleration for the taint onto the tainted node.
- PreferNoSchedule: the Kubernetes scheduler will try to avoid placing pods that don't tolerate the taint onto the tainted node.
- NoExecute: Kubernetes will evict already-running pods from the node if they don't tolerate the taint.
If you need to dedicate a group of worker nodes to a set of users, you can add a taint to those nodes. The general syntax is:
kubectl taint nodes <node name> <taint key>=<taint value>:<taint effect>
For example:
kubectl taint nodes nodename dedicated=groupName:NoSchedule
kubectl taint nodes node1 key1=value1:NoSchedule
kubectl taint nodes node1 key1=value1:NoExecute
kubectl taint nodes node1 key2=value1:PreferNoSchedule
Suppose we have a node with access to GPU resources and we only want to schedule pods that can use GPUs on it:
kubectl taint nodes minikube-m02 gpu=true:NoSchedule
To remove a taint, append - to the taint effect:
kubectl taint nodes <node name> <taint key>:<taint effect>-
kubectl taint nodes minikube-m02 gpu:NoSchedule-
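You can verify which taints are currently set on a node, for example:
kubectl describe node minikube-m02 | grep -i taints
kubectl get node minikube-m02 -o jsonpath='{.spec.taints}'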
Add a toleration for the taint to that user group's pods so they can run on those nodes. To further ensure that those pods only get scheduled on that set of tainted nodes, also add a label to the nodes, e.g. dedicated=groupName, and use a nodeSelector in the deployment/pod spec. This binds the user group's pods to the node group so they don't run anywhere else; a sketch follows below.
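A minimal sketch of this pattern (the node name node1 and the dedicated=groupName key/value are only illustrative):
kubectl taint nodes node1 dedicated=groupName:NoSchedule
kubectl label nodes node1 dedicated=groupName
Pod/deployment spec fragment combining the matching toleration with a nodeSelector for the label:
spec:
  nodeSelector:
    dedicated: groupName
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "groupName"
    effect: "NoSchedule"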
Equal Operator
tolerations:
- key: "<taint key>"
  operator: "Equal"
  value: "<taint value>"
  effect: "<taint effect>"
Exists Operator
tolerations:
- key: "<taint key>"
  operator: "Exists"
  effect: "<taint effect>"
The Equal operator requires the taint value and will not match if the value is different. In contrast, the Exists operator matches any value: it only checks whether a taint with the given key is defined.
Only a pod with a matching toleration for the gpu taint key will be allowed on the node:
apiVersion: v1
kind: Pod
metadata:
  name: nginx-test
  labels:
    env: test-env
spec:
  containers:
  - name: nginx
    image: nginx:latest
    imagePullPolicy: IfNotPresent
    resources:
      requests:
        memory: "128Mi"
        cpu: "250m"
      limits:
        memory: "512Mi"
        cpu: "500m"
  tolerations:
  - key: "gpu"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
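Assuming the manifest above is saved as nginx-test.yaml, you can apply it and confirm where the pod landed (the NODE column of the output):
kubectl apply -f nginx-test.yaml
kubectl get pod nginx-test -o wide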
We can match any taint on a node by simply defining the Exists operator without a key, value, or effect:
tolerations:
- operator: "Exists"
We can match any value and effect of a given taint key by defining only the operator and the key, as shown below.
tolerations:
- key: "<taint key>"
  operator: "Exists"
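For example, a toleration that matches the gpu taint from earlier regardless of its value or effect:
tolerations:
- key: "gpu"
  operator: "Exists"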
A node can carry several taints at the same time, for example:
kubectl taint nodes minikube-m02 gpu=true:NoSchedule
kubectl taint nodes minikube-m02 project=system:NoExecute
kubectl taint nodes minikube-m02 type=process:NoSchedule
apiVersion: v1
kind: Pod
metadata:
  name: nginx-test
  labels:
    env: test-env
spec:
  containers:
  - name: nginx
    image: nginx:latest
    imagePullPolicy: IfNotPresent
    resources:
      requests:
        memory: "128Mi"
        cpu: "250m"
      limits:
        memory: "512Mi"
        cpu: "500m"
  tolerations:
  - key: "gpu"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
  - key: "project"
    operator: "Equal"
    value: "system"
    effect: "NoExecute"
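Note that this pod still will not be scheduled onto minikube-m02, because the third taint (type=process:NoSchedule) has no matching toleration. To allow scheduling, the pod would additionally need a toleration such as:
- key: "type"
  operator: "Equal"
  value: "process"
  effect: "NoSchedule"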
The NoExecute effect evicts running pods from a node if they do not tolerate the taint.
For NoExecute, users can define how long a pod stays on the node despite a matching taint by using the optional tolerationSeconds field in the toleration; the pod then remains on the node for the specified period before being evicted. The NoExecute effect is applied according to the following rules:
- Pods without a matching toleration are evicted immediately.
- Pods with a matching toleration that specifies tolerationSeconds remain on the node for the specified time before being evicted. The pod is not removed if the taint is removed before that time expires.
- Pods with a matching toleration and no tolerationSeconds field keep running on the node unless removed manually.
tolerations:
- key: "cpu"
  operator: "Equal"
  value: "true"
  effect: "NoExecute"
  tolerationSeconds: 3600
With this toleration, the pod will stay on the node for at most 3600 seconds after the matching taint is added, and is then evicted.
The NoExecute effect is also used by Kubernetes itself to evict pods from unhealthy nodes. The node controller automatically taints nodes when certain conditions are true, such as the node being unreachable, the network being unavailable, or memory or disk pressure. You can find the full list of these conditions in the Kubernetes documentation. When such a condition occurs, the node controller or kubelet adds the corresponding taint to the node (for conditions such as not-ready and unreachable, with the NoExecute effect).
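Some of the taint keys that the control plane adds automatically include the following (see the Kubernetes documentation for the complete list):
- node.kubernetes.io/not-ready
- node.kubernetes.io/unreachable
- node.kubernetes.io/memory-pressure
- node.kubernetes.io/disk-pressure
- node.kubernetes.io/pid-pressure
- node.kubernetes.io/network-unavailable
- node.kubernetes.io/unschedulable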
Administrators can modify the default behavior by specifying their own tolerations. For example, Kubernetes automatically adds tolerations for node.kubernetes.io/not-ready and node.kubernetes.io/unreachable with tolerationSeconds set to 300 (5 minutes). Administrators can change this behavior by explicitly specifying a toleration for these node conditions in the pod spec.
For example, the configuration below will override the default eviction behavior and allow the pod to remain within the node for an hour before eviction.
tolerations:
- key: "node.kubernetes.io/unreachable"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 3600
While taints and tolerations can repel pods from a node, they cannot ensure that a pod gets scheduled on a specific node. A pod can end up on any node without a taint repelling it.
Administrators can combine node affinity with taints and tolerations to create dedicated nodes that only allow specific pods to run on them.
On the node, a taint forbids pods from running unless they have a toleration for that taint:
taints:
- effect: NoSchedule
  key: harvester
  value: harvester
In the pod, node affinity indicates on which nodes the pod should run:
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: cloud.google.com/gke-nodepool
            operator: In
            values:
            - batch-harvester
In the pod, tolerations for the node's taints:
tolerations:
- effect: NoSchedule
  key: harvester
  operator: Equal
  value: harvester
- effect: NoExecute
  key: node.kubernetes.io/not-ready
  operator: Exists
  tolerationSeconds: 300
- effect: NoExecute
  key: node.kubernetes.io/unreachable
  operator: Exists
  tolerationSeconds: 300
- https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
- https://blog.kubecost.com/blog/kubernetes-taints/
Taints are a Kubernetes node property that enable nodes to repel certain pods. Tolerations are a Kubernetes pod property that allow a pod to be scheduled on a node with a matching taint.
Add a taint to a node:
kubectl taint nodes <node name> <taint key>=<taint value>:<taint effect>
Remove a taint:
kubectl taint nodes <node name> <taint key>:<taint effect>-
The taint effect defines how a tainted node reacts to a pod without an appropriate toleration. It must be one of the following effects:
- NoSchedule: the pod will not get scheduled onto the node without a matching toleration.
- NoExecute: all pods without a matching toleration are immediately evicted from the node.
- PreferNoSchedule: a softer version of NoSchedule; the scheduler will try not to place the pod on the tainted node, but it is not a strict requirement.
For example:
kubectl taint nodes node1 key1=value1:NoSchedule
This places a taint on node node1. The taint has key key1, value value1, and taint effect NoSchedule, which means that no pod will be able to schedule onto node1 unless it has a matching toleration.
To remove the taint:
kubectl taint nodes node1 key1=value1:NoSchedule-
You specify a toleration for a pod in the PodSpec. Both of the following tolerations "match" the taint created by the kubectl taint line above, and thus a pod with either toleration would be able to schedule onto node1:
tolerations:
- key: "key1"
  operator: "Equal"
  value: "value1"
  effect: "NoSchedule"
tolerations:
- key: "key1"
  operator: "Exists"
  effect: "NoSchedule"