Maintenance task pods - juancamilocc/virtual_resources GitHub Wiki
Cleanup Pods Maintenance Task
In this guide, you will learn how to implement a CronJob to clean up evicted, succeeded, and failed pods in a Kubernetes environment. This task can be included in your daily checks and contributes to automated cluster health verification processes. All files can be found in this repository.
Bash executable
The following script will be used in the CronJob. You can adjust the age threshold (300 seconds by default) applied to succeeded pods — I recommend keeping this value low.
```bash
#!/bin/bash
now=$(date +%s)

# Select Failed pods, Evicted pods, and Succeeded pods older than 5 minutes (300 seconds)
kubectl get pods -A -o json | jq -r --argjson now "$now" '
  .items[]
  | select(
      .status.phase == "Failed" or
      .status.reason == "Evicted" or
      (
        .status.phase == "Succeeded" and
        (($now - (.metadata.creationTimestamp | fromdateiso8601)) > 300)
      )
    )
  | "\(.metadata.namespace) \(.metadata.name)"
' | while read -r namespace name; do
  kubectl -n "$namespace" delete pod "$name"
done
```
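Before wiring the script into a CronJob, you can sanity-check the jq selector offline against a hand-crafted pod list — no cluster required. The sample below is hypothetical: one evicted pod and one freshly succeeded pod; only the evicted pod should be selected, since the succeeded pod is newer than the 300-second threshold.

```shell
now=$(date +%s)
recent=$(date -u +%Y-%m-%dT%H:%M:%SZ)

# Pipe a minimal, hand-written pod list through the same jq filter the script uses
selected=$(
cat <<EOF | jq -r --argjson now "$now" '
  .items[]
  | select(
      .status.phase == "Failed" or
      .status.reason == "Evicted" or
      (
        .status.phase == "Succeeded" and
        (($now - (.metadata.creationTimestamp | fromdateiso8601)) > 300)
      )
    )
  | "\(.metadata.namespace) \(.metadata.name)"
'
{
  "items": [
    {
      "metadata": {"namespace": "default", "name": "evicted-pod", "creationTimestamp": "$recent"},
      "status": {"phase": "Failed", "reason": "Evicted"}
    },
    {
      "metadata": {"namespace": "default", "name": "fresh-succeeded-pod", "creationTimestamp": "$recent"},
      "status": {"phase": "Succeeded"}
    }
  ]
}
EOF
)
echo "$selected"
```

Swapping `creationTimestamp` for one older than five minutes should make the succeeded pod appear in the output as well.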
RBAC Permissions
Define the RBAC permissions required for the CronJob to list and delete pod resources.
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cleanup-pods-sa
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cleanup-pods-role
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["list", "get", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cleanup-pods-rolebinding
subjects:
- kind: ServiceAccount
  name: cleanup-pods-sa
  namespace: default
roleRef:
  kind: ClusterRole
  name: cleanup-pods-role
  apiGroup: rbac.authorization.k8s.io
```
Apply the configuration.
```shell
kubectl apply -f rbac.yaml
```
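As a quick sanity check (assuming your current context points at the target cluster), `kubectl auth can-i` can impersonate the service account to confirm the binding actually grants the verbs the script needs:

```shell
# Both commands should print "yes" once the RBAC objects are applied
kubectl auth can-i list pods --all-namespaces \
  --as=system:serviceaccount:default:cleanup-pods-sa
kubectl auth can-i delete pods \
  --as=system:serviceaccount:default:cleanup-pods-sa
```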
CronJob Manifest
```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cleanup-pods
  namespace: default
spec:
  schedule: "30 7 * * *"
  concurrencyPolicy: Allow
  failedJobsHistoryLimit: 0
  successfulJobsHistoryLimit: 0
  suspend: false
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: cleanup-pods-sa
          containers:
          - name: kubectl
            image: bitnami/kubectl:latest
            imagePullPolicy: IfNotPresent
            command:
            - /bin/bash
            - -c
            - |
              now=$(date +%s)
              kubectl get pods -A -o json | jq -r --argjson now "$now" '
                .items[]
                | select(
                    .status.phase == "Failed" or
                    .status.reason == "Evicted" or
                    (
                      .status.phase == "Succeeded" and
                      (($now - (.metadata.creationTimestamp | fromdateiso8601)) > 300)
                    )
                  )
                | "\(.metadata.namespace) \(.metadata.name)"
              ' | while read -r namespace name; do
                kubectl -n "$namespace" delete pod "$name"
              done
              sleep 60s
            resources: {}
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
          dnsPolicy: ClusterFirst
          restartPolicy: OnFailure
          schedulerName: default-scheduler
          securityContext: {}
          terminationGracePeriodSeconds: 30
```
NOTE: Adjust the `schedule` field according to your needs. The configuration above runs every day at 7:30 AM.
Apply it.
```shell
kubectl apply -f cronjob.yaml
```
Testing the CronJob
To test it without waiting for the schedule, trigger a manual Job from the CronJob as follows.
```shell
kubectl create job --from=cronjobs/cleanup-pods cleanup-pods
kubectl get pods
# NAME                 READY   STATUS    RESTARTS   AGE
# cleanup-pods-57b74   1/1     Running   0          4s
```
Inspect the logs.
```shell
kubectl logs cleanup-pods-57b74 -f
# pod "pod-7d6f8b76bc-6tdmk" deleted
# pod "pod-7d6f8b76bc-kpsmw" deleted
# pod "pod-7d6f8b76bc-ms6zx" deleted
```
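Once you have confirmed the behavior, the manual test Job can be removed; the name below matches the one created in the test command above.

```shell
# Deleting the Job also garbage-collects the pod it created
kubectl delete job cleanup-pods
```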
Conclusions
This process enables the automated cleanup of evicted, failed, and old succeeded pods through a Kubernetes CronJob, helping maintain a cleaner and healthier cluster without manual intervention. It relies on a script that identifies and removes pods based on their status and age, supported by the required RBAC permissions that allow the CronJob to list and delete pod resources. Once configured, the CronJob can be scheduled according to operational needs and manually tested to ensure proper functionality before being used in production.