kubernetes node affinity, inter-Pod affinity

Pod affinity

Hard rules that pin a Pod to a node by its name or label are often not enough. Instead of leaving a Pod unschedulable when a matching node is not present, affinity provides a more flexible ruleset, so applications can still be scheduled in a cluster environment that is constantly changing.

Affinity enables the Kubernetes scheduler to place a Pod either on a group of nodes or relative to the placement of other Pods. To control Pod placement on a group of nodes, use node affinity rules. In contrast, pod affinity and pod anti-affinity rules control Pod placement relative to other Pods.

Kubernetes Advanced Pod Scheduling Techniques

Technique                         Summary
Taints and Tolerations            Allow a node to control which Pods can run on it and which Pods are repelled (see the sketch below).
NodeSelector                      Assigns a Pod to a specific node using labels.
Node Affinity                     Similar to NodeSelector, but more expressive and flexible, e.g. "required" and "preferred" rules.
Pod Affinity and Anti-Affinity    Co-locates Pods or keeps Pods away from each other based on affinity and anti-affinity rules.
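
As a quick illustration of the first two techniques, here is a minimal sketch. It assumes a node that has been labeled disktype=ssd and tainted dedicated=batch:NoSchedule; both the label and the taint are invented for this example.

# hypothetical node preparation (label and taint values are assumptions):
#   kubectl label nodes node1 disktype=ssd
#   kubectl taint nodes node1 dedicated=batch:NoSchedule
apiVersion: v1
kind: Pod
metadata:
  name: with-node-selector
spec:
  nodeSelector:
    disktype: ssd              # only schedule on nodes carrying this label
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "batch"
    effect: "NoSchedule"       # permits (but does not force) scheduling on the tainted node
  containers:
  - name: main
    image: registry.k8s.io/pause:2.0

Note that a toleration only allows the Pod onto the tainted node; it does not attract it there, which is why tolerations are often combined with nodeSelector or node affinity.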

Node affinity rules use labels on nodes and label selectors in the Pod specification. Nodes have no control over the placement. If the scheduler places a Pod using a node affinity rule and the rule later stops being valid (e.g., because the node's labels changed), the Pod continues to run on that node.

There are two types of Node Affinity rules:

  • Required
  • Preferred

Required rules must always be met for the scheduler to place the Pod. With preferred rules the scheduler tries to satisfy the rule, but does not guarantee it.

Pod Affinity and Pod Anti-Affinity

Pod Affinity and Anti-Affinity enable the creation of rules that control where to place a Pod relative to other Pods. The rules match labels on Pods that are already running, using label selectors in the Pod specification, together with a topologyKey (a node label) that defines the domain within which the rule applies.

Pod Affinity/Anti-Affinity allows a pod to specify an affinity (or anti-affinity) towards a set of pods. As with Node Affinity, the node does not have control over the placement of the pod.

Affinity rules work on labels. With an affinity rule, the scheduler can place the new Pod on the same node as other Pods when the label selector in the new Pod's rule matches the labels on those Pods.

An anti-affinity rule tells the scheduler not to place the new Pod on the same node as Pods whose labels match the rule's selector. Anti-affinity lets you keep Pods away from each other; it is useful, for example, to avoid placing a Pod on the same node as an existing Pod whose performance it would interfere with.
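
A minimal sketch of such an anti-affinity rule (the label app: web is invented for this example): each Pod carrying app=web refuses to share a node with other Pods labeled app=web, which is a common way to spread replicas of one application across nodes.

apiVersion: v1
kind: Pod
metadata:
  name: web-replica
  labels:
    app: web
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - web
        topologyKey: kubernetes.io/hostname   # "the same node" is the domain the rule applies to
  containers:
  - name: web
    image: registry.k8s.io/pause:2.0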

Node affinity is one of the mechanisms Kubernetes provides to define where Kubernetes should schedule a pod.

Node affinity is conceptually similar to nodeSelector, allowing you to constrain which nodes your Pod can be scheduled on based on node labels. There are two types of node affinity:

  • requiredDuringSchedulingIgnoredDuringExecution: The scheduler can't schedule the Pod unless the rule is met. This functions like nodeSelector, but with a more expressive syntax.
  • preferredDuringSchedulingIgnoredDuringExecution: The scheduler tries to find a node that meets the rule. If a matching node is not available, the scheduler still schedules the Pod.

IgnoredDuringExecution means that if the node labels change after Kubernetes schedules the Pod, the Pod continues to run.

You can specify node affinities using the .spec.affinity.nodeAffinity field in your Pod spec.

apiVersion: v1
kind: Pod
metadata:
  name: with-node-affinity
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values:
            - antarctica-east1
            - antarctica-west1
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: another-node-label-key
            operator: In
            values:
            - another-node-label-value
  containers:
  - name: with-node-affinity
    image: registry.k8s.io/pause:2.0

(image: a nodeAffinity example with a required rule for a Linux node and a preferred rule for an ssd node label)

In other words: the Pod in that example has to run on a Linux node, and it would be nice if the node also had an SSD, but that is not required.

The preferred attribute is optional for the scheduler: if no node matches the preference, the Pod is still scheduled. You can also specify a weight between 1 and 100 for each preferred rule; for every node that satisfies a rule, the scheduler adds that rule's weight to the node's score from its other priority functions, and nodes with the highest total score are preferred.
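
A minimal sketch of how the weights combine (the label keys disktype and region, and the weights, are invented for this illustration): a node matching both preferences gains 80 + 20 points on top of its other priority scores, a node matching only the ssd preference gains 80, and a node matching neither can still be chosen.

apiVersion: v1
kind: Pod
metadata:
  name: weighted-preferences
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 80                 # strong preference
        preference:
          matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd
      - weight: 20                 # weak preference
        preference:
          matchExpressions:
          - key: region
            operator: In
            values:
            - eu-west
  containers:
  - name: main
    image: registry.k8s.io/pause:2.0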

inter-Pod affinity

In the same vein as node affinity is inter-Pod affinity, which expresses a preference for a node based on Pods that are already running on it. In simpler terms: if a particular Pod already exists on a node (e.g., an nginx Pod), then schedule the new Pod onto that same node. A minimal manifest for this is sketched below.
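
A minimal sketch of such a Pod spec, assuming the existing nginx Pod carries the label app: nginx (the label key and value are assumptions, not stated above):

apiVersion: v1
kind: Pod
metadata:
  name: with-pod-affinity
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - nginx              # co-locate with Pods labeled app=nginx (assumed label)
        topologyKey: kubernetes.io/hostname
  containers:
  - name: with-pod-affinity
    image: registry.k8s.io/pause:2.0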


Just after you schedule the Pod, you should see that it has been placed on the same node as the nginx Pod.

topologyKey

The topologyKey defines the domain used to determine the placement of the Pod being scheduled relative to the Pods identified by the ...labelSelector.matchExpressions section.

With podAffinity, a Pod will be scheduled in the same domain as the Pods that match the expression.

Two common label options are topology.kubernetes.io/zone and kubernetes.io/hostname. Others can be found in the Kubernetes Well-Known Labels, Annotations and Taints documentation.

  • topology.kubernetes.io/zone: Pods will be scheduled in the same zone as a Pod that matches the expression.
  • kubernetes.io/hostname: Pods will be scheduled on the same hostname as a Pod that matches the expression.

For podAntiAffinity, the opposite is true: Pods will not be scheduled in the same domain as the Pods that match the expression.
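
For instance, a zone-level anti-affinity rule could look like the sketch below (the label app: web is assumed for illustration); matching replicas are then kept in different zones rather than merely on different nodes.

apiVersion: v1
kind: Pod
metadata:
  name: web-zone-spread
  labels:
    app: web
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: web
        topologyKey: topology.kubernetes.io/zone   # the domain is a whole zone, not a single node
  containers:
  - name: web
    image: registry.k8s.io/pause:2.0

A softer way to spread replicas across zones and hosts is topologySpreadConstraints, as in the Deployment snippet that follows.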

apiVersion: apps/v1
kind: Deployment
...
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone   # spreading Pods across zones was definitely missing
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app: "{{ metadata_name }}"
          matchLabelKeys:
            - version
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app: "{{ metadata_name }}"
          matchLabelKeys:
            - version

---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: "{{ metadata_name }}"
  namespace: "{{ metadata_namespace }}"
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: "{{ metadata_name }}"
  minReplicas: 2
  maxReplicas: 10
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15         # I don't know the specifics of this application, but isn't allowing a scale-down straight to the minimum a bit too brutal?
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
      - type: Pods
        value: 2                  # with the proposed values the Pods policy is IMO redundant: 100% of minReplicas is already 2 at the start, and since selectPolicy is Max, 100% of the current replica count is always the one chosen
        periodSeconds: 15
      selectPolicy: Max
  metrics:
  - type: Resource                # I would consider using ContainerResource to measure only the application container, without istio
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
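
Following up on the comment above: a ContainerResource metric targets a single named container instead of the whole Pod, so the istio sidecar's CPU does not skew the average. A minimal sketch of what the metrics section above might become, assuming the application container is named app (the container name is an assumption):

  metrics:
  - type: ContainerResource
    containerResource:
      name: cpu
      container: app              # assumed name of the application container
      target:
        type: Utilization
        averageUtilization: 80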
---
apiVersion: policy/v1
kind: PodDisruptionBudget   # we definitely need a PDB
metadata:
  name: "{{ metadata_name }}"
  namespace: "{{ metadata_namespace }}"
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: "{{ metadata_name }}"