Best practices - DeekshithSN/kubernetes GitHub Wiki

✅ Use Declarative YAML (not kubectl run)

Define everything as code (Deployment, Service, Ingress, PVC, etc.).

Use Git to version-control your YAML files (GitOps style).

✅ Prefer Deployments over Pods

Deployments manage rolling updates and restarts automatically.

Bare Pods are not recreated if they are deleted or their node fails, and they cannot be scaled.
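A minimal sketch of a Deployment with a rolling-update strategy. The name `web`, the label `app: web`, the image `example.com/web:v1.0.2`, and the port are hypothetical placeholders, not values from this wiki.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                     # hypothetical workload name
  labels:
    app: web
spec:
  replicas: 3                   # the Deployment keeps three Pods running
  selector:
    matchLabels:
      app: web                  # must match the template labels below
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1         # at most one Pod down during an update
      maxSurge: 1               # at most one extra Pod during an update
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: example.com/web:v1.0.2   # hypothetical image, pinned to a version tag
        ports:
        - containerPort: 8080
```

If a Pod from this Deployment is deleted or its node goes away, the controller creates a replacement to get back to `replicas`.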

✅ Use Liveness and Readiness Probes

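Readiness: Tells Kubernetes when the app can accept traffic. A Pod that fails it is removed from Service endpoints until it passes again.

Liveness: Tells Kubernetes when the app needs a restart (e.g. it is stuck). A container that fails it is restarted.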

livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5
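The liveness probe above has a readiness counterpart with the same shape. A sketch, assuming the app exposes a separate `/ready` endpoint on the same port (the path and the threshold are illustrative assumptions):

```yaml
# Fragment of a container spec, placed next to livenessProbe
readinessProbe:
  httpGet:
    path: /ready            # hypothetical endpoint; use whatever your app exposes
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 3       # taken out of Service endpoints after 3 consecutive failures
```

Keeping the two endpoints separate lets the app report "alive but not ready" (for example while warming a cache) without being restarted.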

✅ Use ConfigMaps and Secrets

ConfigMaps: Non-sensitive config like feature flags, environment variables.

Secrets: API keys, passwords. Values are only base64-encoded, not encrypted, so ideally use an external secret manager too.

✅ Don’t bake secrets into images or hardcode configs.
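A sketch of how the two objects might look and how a container can load them as environment variables. The names `app-config` and `app-secrets`, the keys, and the values are hypothetical.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config          # hypothetical name
data:
  FEATURE_NEW_UI: "true"    # non-sensitive settings only
  LOG_LEVEL: "info"
---
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets         # hypothetical name
type: Opaque
stringData:                 # plain text here; the API server stores it base64-encoded
  API_KEY: "replace-me"     # placeholder, never commit real values to Git
```

The container then references both by name instead of carrying the values in the image:

```yaml
# Fragment of a pod template (spec.template.spec in a Deployment)
containers:
- name: web
  image: example.com/web:v1.0.2   # hypothetical image
  envFrom:
  - configMapRef:
      name: app-config
  - secretRef:
      name: app-secrets
```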

✅ Run containers as non-root

securityContext:
  runAsUser: 1000
  runAsNonRoot: true

✅ Use RBAC

Give minimum permissions needed (principle of least privilege).

Use roles per namespace and service accounts per workload.
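A sketch of least privilege for a single workload: a dedicated service account that can only read ConfigMaps in its own namespace. All names (`web-sa`, `configmap-reader`, `prod`) are hypothetical.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: web-sa
  namespace: prod
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: configmap-reader
  namespace: prod             # a Role only grants access inside this namespace
rules:
- apiGroups: [""]             # "" is the core API group
  resources: ["configmaps"]
  verbs: ["get", "list", "watch"]   # read-only
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: web-configmap-reader
  namespace: prod
subjects:
- kind: ServiceAccount
  name: web-sa
  namespace: prod
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: configmap-reader
```

The workload opts in by setting `serviceAccountName: web-sa` in its pod spec.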

✅ Network Policies

Control which pods can talk to which, like firewalls for pods.

podSelector:
  matchLabels:
    role: frontend
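The selector above is only one piece of a policy. A sketch of a complete NetworkPolicy that lets `role: frontend` pods reach `role: backend` pods on one port and blocks other ingress to the backend. The `backend` label, the policy name, the namespace, and the port are assumptions.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-allow-frontend   # hypothetical name
  namespace: prod                # hypothetical namespace
spec:
  podSelector:
    matchLabels:
      role: backend              # the pods this policy protects
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          role: frontend         # only these pods may connect
    ports:
    - protocol: TCP
      port: 8080
```

Policies are only enforced when the cluster's network plugin supports NetworkPolicy; otherwise the object is accepted but has no effect.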

✅ Use Horizontal Pod Autoscaling (HPA)

minReplicas: 2
maxReplicas: 10
metrics:
- type: Resource
  resource:
    name: cpu
    target:
      type: Utilization
      averageUtilization: 70
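The fields above cover only the scaling bounds and the metric. A sketch of a complete object using the `autoscaling/v2` API, assuming a Deployment named `web` (hypothetical) as the scale target:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web                      # hypothetical name
spec:
  scaleTargetRef:                # the workload being scaled
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # percent of the Pods' CPU requests
```

CPU utilization is measured against each container's CPU request, so the target workload needs requests set, and the cluster needs a metrics source such as metrics-server.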

✅ Spread workloads across Availability Zones

Use topologySpreadConstraints or multiple node groups.
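A sketch of a spread constraint that balances replicas across zones. The `app: web` label is a hypothetical placeholder; `topology.kubernetes.io/zone` is the well-known node label for zones.

```yaml
# Fragment of a pod template (spec.template.spec in a Deployment)
topologySpreadConstraints:
- maxSkew: 1                                 # zones may differ by at most one Pod
  topologyKey: topology.kubernetes.io/zone
  whenUnsatisfiable: ScheduleAnyway          # prefer spreading, but do not block scheduling
  labelSelector:
    matchLabels:
      app: web                               # which Pods are counted per zone
```

Switching `whenUnsatisfiable` to `DoNotSchedule` makes the constraint strict, at the cost of Pods staying Pending when a zone has no capacity.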

✅ Use Readiness Gates with external systems (optional)

If your app depends on an external system, a readiness gate keeps the Pod from being marked Ready until that system reports the Pod is usable.
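A sketch of the pod spec field involved. The condition type `example.com/backend-registered` is a hypothetical name; something outside the Pod (typically a controller) has to set that condition to `True` in the Pod's status.

```yaml
# Fragment of a pod spec
readinessGates:
- conditionType: "example.com/backend-registered"   # hypothetical custom condition
```

With a gate declared, the Pod is Ready only when all its containers are ready and every listed condition is `True`. If nothing ever sets the condition, the Pod never becomes Ready.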

✅ Centralize logs (don’t kubectl logs everything)

Use tools like:

Fluentd/Fluent Bit → Elasticsearch + Kibana (EFK)

Loki + Grafana

Cloud provider logging like Amazon CloudWatch or Google Cloud Logging (formerly Stackdriver)

✅ Use Prometheus + Grafana

For metrics, alerts, and dashboards.

✅ Set Resource Requests and Limits

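Without limits, your app can hog a node's CPU/memory; without requests, the scheduler cannot place it sensibly and it is among the first to be evicted when the node runs short.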

resources:
  requests:
    cpu: "250m"
    memory: "512Mi"
  limits:
    cpu: "500m"
    memory: "1Gi"

✅ Use small base images (Alpine, Distroless)

Reduces image size and attack surface.

✅ Tag images uniquely (v1.0.2 not latest)

Makes rollbacks unambiguous and avoids nodes running a stale cached copy of a mutable tag.

✅ Use CI/CD pipelines (e.g., GitHub Actions, ArgoCD, Flux)

Automate testing, building, and deploying.

✅ Use Namespaces to isolate environments (dev, staging, prod).

✅ Use PodDisruptionBudgets (PDBs) to keep a minimum number of pods running during voluntary disruptions such as node drains and upgrades.
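A sketch of a PDB, assuming a workload labelled `app: web` with at least three replicas (the name, the label, and the number are hypothetical):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web                # hypothetical name
spec:
  minAvailable: 2          # evictions are refused if they would leave fewer than 2 Pods
  selector:
    matchLabels:
      app: web             # the Pods this budget covers
```

A PDB only applies to voluntary evictions such as `kubectl drain`; it does not protect against node crashes. Setting `minAvailable` equal to the replica count blocks drains entirely, so leave some headroom.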