Kubernetes Deployment - antimetal/system-agent GitHub Wiki
Kubernetes Deployment
This guide covers deploying the Antimetal System Agent to Kubernetes clusters using various methods including Helm, Kustomize, and raw manifests.
Prerequisites
- Kubernetes cluster (1.19+)
- kubectl configured with cluster access
- Cluster admin permissions (for RBAC setup)
- Antimetal API key from console.antimetal.com
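The prerequisites above can be checked before installing anything. The following is a minimal sketch, not part of the agent's tooling: the function names are hypothetical, and the version parsing assumes the pretty-printed JSON that `kubectl version -o json` emits.

```shell
#!/bin/sh
# Print "MAJOR MINOR" for a version such as "v1.27.3-eks-a5565ad" or "1.19".
parse_major_minor() {
  pmm_v=${1#v}
  pmm_major=${pmm_v%%.*}
  pmm_rest=${pmm_v#*.}
  pmm_minor=${pmm_rest%%[!0-9]*}
  echo "$pmm_major $pmm_minor"
}

# Succeed when version $1 is at least version $2 (major.minor comparison only).
version_at_least() {
  set -- $(parse_major_minor "$1") $(parse_major_minor "$2")
  [ "$1" -gt "$3" ] || { [ "$1" -eq "$3" ] && [ "$2" -ge "$4" ]; }
}

# Check kubectl, the server version, and permission to create ClusterRoles.
preflight() {
  command -v kubectl >/dev/null 2>&1 ||
    { echo "kubectl not found on PATH" >&2; return 1; }
  # Pull serverVersion.gitVersion out of kubectl's pretty-printed JSON.
  pf_server=$(kubectl version -o json |
    sed -n '/"serverVersion"/,/}/s/.*"gitVersion": *"\([^"]*\)".*/\1/p')
  [ -n "$pf_server" ] ||
    { echo "could not read the server version; is the cluster reachable?" >&2; return 1; }
  version_at_least "$pf_server" 1.19 ||
    { echo "server $pf_server is older than 1.19" >&2; return 1; }
  kubectl auth can-i create clusterroles >/dev/null ||
    { echo "current user cannot create ClusterRoles" >&2; return 1; }
  echo "preflight ok (server $pf_server)"
}
```

Source the file and run `preflight`; a non-zero exit status means one of the prerequisites is not met. The API key itself cannot be checked this way.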
Deployment Methods
Method 1: Helm (Recommended)
The easiest way to deploy the agent:
# Add Antimetal Helm repository
helm repo add antimetal https://charts.antimetal.com
helm repo update
# Install the agent
helm install antimetal-agent antimetal/system-agent \
  --namespace antimetal-system \
  --create-namespace \
  --set intake.apiKey="YOUR_API_KEY"
# Verify installation
kubectl get pods -n antimetal-system
helm status antimetal-agent -n antimetal-system
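To avoid pasting the key into the command by hand, the install can read it from the environment. This is a sketch under assumptions: `ANTIMETAL_API_KEY` is a variable name chosen here, not one defined by the chart, and `helm upgrade --install` is used in place of `helm install` so the command can be re-run safely.

```shell
#!/bin/sh
# Install (or upgrade) the agent, refusing to run with an empty API key.
install_agent() {
  if [ -z "${ANTIMETAL_API_KEY:-}" ]; then
    echo "ANTIMETAL_API_KEY is empty; export it before installing" >&2
    return 1
  fi
  # Same flags as the install command above; the key comes from the environment.
  helm upgrade --install antimetal-agent antimetal/system-agent \
    --namespace antimetal-system \
    --create-namespace \
    --set intake.apiKey="$ANTIMETAL_API_KEY"
}
```

The key is still passed to Helm as a command-line argument, so this guards against an empty value and shell-history leaks rather than hiding the key from the local process list.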
Helm Values
# values.yaml
replicaCount: 1
image:
  repository: antimetal/system-agent
  tag: latest
  pullPolicy: IfNotPresent
intake:
  endpoint: "intake.antimetal.com:443"
  apiKey: "" # Required
  batchSize: 100
  batchInterval: "10s"
performance:
  enabled: true
  interval: "60s"
  collectors:
    - cpu
    - memory
    - network
    - disk
resources:
  requests:
    cpu: 100m
    memory: 256Mi
  limits:
    cpu: 500m
    memory: 512Mi
nodeSelector: {}
tolerations: []
affinity: {}
serviceAccount:
  create: true
  name: antimetal-agent
rbac:
  create: true
leaderElection:
  enabled: true
  namespace: antimetal-system
monitoring:
  serviceMonitor:
    enabled: false # Enable for Prometheus Operator
Custom Installation
# With custom values file
helm install antimetal-agent antimetal/system-agent \
  -n antimetal-system \
  --create-namespace \
  -f values.yaml
# Upgrade existing installation
helm upgrade antimetal-agent antimetal/system-agent \
  -n antimetal-system \
  -f values.yaml
# Rollback if needed
helm rollback antimetal-agent -n antimetal-system
Method 2: Kustomize
For GitOps workflows:
# kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: antimetal-system
resources:
  - https://github.com/antimetal/system-agent/config/default?ref=v1.0.0
# Create secret for API key
secretGenerator:
  - name: antimetal-credentials
    literals:
      - api-key=YOUR_API_KEY
# Patch deployment with secret
patchesStrategicMerge:
  - deployment-patch.yaml
# Custom configuration
configMapGenerator:
  - name: antimetal-config
    files:
      - config.yaml
# deployment-patch.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: antimetal-agent
spec:
  template:
    spec:
      containers:
        - name: agent
          env:
            - name: ANTIMETAL_INTAKE_API_KEY
              valueFrom:
                secretKeyRef:
                  name: antimetal-credentials
                  key: api-key
Deploy with:
kubectl apply -k .
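In a GitOps repository, the `api-key=YOUR_API_KEY` literal would end up committed. One possible alternative, sketched here under assumptions, is to write the key to a local env file and point the `secretGenerator` at it through its `envs` field. The file name `secrets.env`, the function name, and the `ANTIMETAL_API_KEY` variable are all hypothetical.

```shell
#!/bin/sh
# Write the API key to an env file that Kustomize's secretGenerator can read.
write_api_key_env() {
  wake_file=${1:-secrets.env}
  [ -n "${ANTIMETAL_API_KEY:-}" ] ||
    { echo "ANTIMETAL_API_KEY is empty" >&2; return 1; }
  # umask 077 in a subshell: a newly created file is readable by its owner only.
  ( umask 077 && printf 'api-key=%s\n' "$ANTIMETAL_API_KEY" > "$wake_file" )
}
```

With this approach, the `literals:` entry in kustomization.yaml would be replaced by `envs:` listing `secrets.env`, and the file would be added to .gitignore. Run `write_api_key_env` before `kubectl apply -k .`.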
Method 3: Raw Manifests
For maximum control:
# namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: antimetal-system
---
# serviceaccount.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: antimetal-agent
  namespace: antimetal-system
---
# clusterrole.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: antimetal-agent
rules:
  # Core resources
  - apiGroups: [""]
    resources:
      - nodes
      - pods
      - services
      - persistentvolumes
      - persistentvolumeclaims
      - namespaces
      - endpoints
    verbs: ["get", "list", "watch"]
  # Apps resources
  - apiGroups: ["apps"]
    resources:
      - deployments
      - daemonsets
      - statefulsets
      - replicasets
    verbs: ["get", "list", "watch"]
  # Batch resources
  - apiGroups: ["batch"]
    resources:
      - jobs
      - cronjobs
    verbs: ["get", "list", "watch"]
  # Leader election
  - apiGroups: ["coordination.k8s.io"]
    resources:
      - leases
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  # Events (for leader election)
  - apiGroups: [""]
    resources:
      - events
    verbs: ["create", "patch"]
---
# clusterrolebinding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: antimetal-agent
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: antimetal-agent
subjects:
  - kind: ServiceAccount
    name: antimetal-agent
    namespace: antimetal-system
---
# secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: antimetal-credentials
  namespace: antimetal-system
type: Opaque
stringData:
  api-key: "YOUR_API_KEY"
---
# configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: antimetal-config
  namespace: antimetal-system
data:
  config.yaml: |
    intake:
      endpoint: "intake.antimetal.com:443"
      batchSize: 100
      batchInterval: "10s"
    performance:
      enabled: true
      interval: "60s"
---
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: antimetal-agent
  namespace: antimetal-system
  labels:
    app: antimetal-agent
spec:
  replicas: 1
  selector:
    matchLabels:
      app: antimetal-agent
  template:
    metadata:
      labels:
        app: antimetal-agent
    spec:
      serviceAccountName: antimetal-agent
      containers:
        - name: agent
          image: antimetal/system-agent:latest
          imagePullPolicy: IfNotPresent
          args:
            - --config=/etc/antimetal/config.yaml
            - --leader-election=true
            - --leader-election-namespace=antimetal-system
          env:
            - name: ANTIMETAL_INTAKE_API_KEY
              valueFrom:
                secretKeyRef:
                  name: antimetal-credentials
                  key: api-key
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: HOST_PROC
              value: "/host/proc"
            - name: HOST_SYS
              value: "/host/sys"
          ports:
            - name: metrics
              containerPort: 8080
              protocol: TCP
            - name: health
              containerPort: 8081
              protocol: TCP
          livenessProbe:
            httpGet:
              path: /healthz
              port: health
            initialDelaySeconds: 15
            periodSeconds: 20
          readinessProbe:
            httpGet:
              path: /readyz
              port: health
            initialDelaySeconds: 5
            periodSeconds: 10
          resources:
            requests:
              cpu: 100m
              memory: 256Mi
            limits:
              cpu: 500m
              memory: 512Mi
          volumeMounts:
            - name: config
              mountPath: /etc/antimetal
              readOnly: true
            - name: proc
              mountPath: /host/proc
              readOnly: true
            - name: sys
              mountPath: /host/sys
              readOnly: true
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            runAsNonRoot: true
            runAsUser: 65532
            capabilities:
              drop:
                - ALL
      volumes:
        - name: config
          configMap:
            name: antimetal-config
        - name: proc
          hostPath:
            path: /proc
            type: Directory
        - name: sys
          hostPath:
            path: /sys
            type: Directory
---
# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: antimetal-agent-metrics
  namespace: antimetal-system
  labels:
    app: antimetal-agent
spec:
  ports:
    - name: metrics
      port: 8080
      targetPort: metrics
  selector:
    app: antimetal-agent
Deploy with:
kubectl apply -f namespace.yaml
kubectl apply -f .
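With raw manifests, the key in secret.yaml sits on disk in plain text. As a hedged alternative, the Secret can be rendered from an environment variable and piped straight into `kubectl apply`, followed by a wait for the Deployment. The function name and `ANTIMETAL_API_KEY` are assumptions; the namespace must already exist.

```shell
#!/bin/sh
# Apply the credentials Secret from the environment, then wait for the rollout.
apply_secret_and_wait() {
  [ -n "${ANTIMETAL_API_KEY:-}" ] ||
    { echo "ANTIMETAL_API_KEY is empty" >&2; return 1; }
  # --dry-run=client renders the Secret locally; apply then creates or updates it.
  kubectl create secret generic antimetal-credentials \
    --namespace antimetal-system \
    --from-literal=api-key="$ANTIMETAL_API_KEY" \
    --dry-run=client -o yaml | kubectl apply -f - || return 1
  # Block until the Deployment reports ready, or fail after two minutes.
  kubectl rollout status deployment/antimetal-agent \
    --namespace antimetal-system --timeout=120s
}
```

If this route is taken, secret.yaml should be left out of the directory passed to `kubectl apply -f .`, otherwise the placeholder value would overwrite the real key.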
Cloud-Specific Deployments
EKS (AWS)
Additional configuration for EKS:
# eks-values.yaml
cloudProvider: eks
# IAM for Service Accounts (IRSA)
serviceAccount:
  create: true
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/antimetal-agent
# Node affinity for better performance
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        preference:
          matchExpressions:
            - key: node.kubernetes.io/instance-type
              operator: In
              values:
                - m5.large
                - m5.xlarge
GKE (Google Cloud)
# gke-values.yaml
cloudProvider: gke
# Workload Identity
serviceAccount:
  create: true
  annotations:
    iam.gke.io/gcp-service-account: [email protected]
# GKE Autopilot compatible resources
resources:
  requests:
    cpu: 250m
    memory: 512Mi
    ephemeral-storage: 1Gi
  limits:
    cpu: 500m
    memory: 512Mi
    ephemeral-storage: 1Gi
AKS (Azure)
# aks-values.yaml
cloudProvider: aks
# Azure AD Pod Identity
podLabels:
  aadpodidbinding: antimetal-agent
# AKS-specific tolerations
tolerations:
  - key: CriticalAddonsOnly
    operator: Exists
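The provider files above are overlays on top of the base values.yaml. Helm accepts several `-f` flags and gives precedence to the rightmost file, so the overlay can be layered onto the base. The helper below is a small sketch with a hypothetical name; the file names follow the comments in the snippets above.

```shell
#!/bin/sh
# Print the -f arguments for the base values file plus an optional provider overlay.
values_args() {
  case "$1" in
    eks|gke|aks) echo "-f values.yaml -f $1-values.yaml" ;;
    "")          echo "-f values.yaml" ;;
    *)           echo "unknown provider: $1" >&2; return 1 ;;
  esac
}
```

For example, `helm upgrade --install antimetal-agent antimetal/system-agent -n antimetal-system $(values_args eks)` applies values.yaml first and eks-values.yaml on top of it.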
High Availability Deployment
For production environments:
# ha-values.yaml
replicaCount: 3 # Only one will be active (leader election)
# Pod disruption budget
podDisruptionBudget:
  enabled: true
  minAvailable: 1
# Anti-affinity to spread across nodes
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
              - key: app
                operator: In
                values:
                  - antimetal-agent
          topologyKey: kubernetes.io/hostname
# Resource limits for stability
resources:
  requests:
    cpu: 200m
    memory: 512Mi
  limits:
    cpu: 1000m
    memory: 1Gi
# Priority class
priorityClassName: system-cluster-critical
Security Hardening
Network Policies
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: antimetal-agent
  namespace: antimetal-system
spec:
  podSelector:
    matchLabels:
      app: antimetal-agent
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Allow Prometheus scraping
    - from:
        - namespaceSelector:
            matchLabels:
              name: monitoring
      ports:
        - protocol: TCP
          port: 8080
  egress:
    # Allow DNS
    - to:
        - namespaceSelector:
            matchLabels:
              name: kube-system
      ports:
        - protocol: UDP
          port: 53
    # Allow Kubernetes API
    - to:
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              component: kube-apiserver
    # Allow intake service
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
      ports:
        - protocol: TCP
          port: 443
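The namespaceSelector entries in this policy match on a `name` label, which namespaces do not carry by default. A minimal sketch for adding those labels follows; it assumes Prometheus runs in a namespace called `monitoring`, as the ingress rule implies, and the function name is hypothetical.

```shell
#!/bin/sh
# Add the "name" labels that the policy's namespaceSelectors match on.
label_policy_namespaces() {
  for lpn_ns in kube-system monitoring; do
    # --overwrite makes the command safe to repeat.
    kubectl label namespace "$lpn_ns" "name=$lpn_ns" --overwrite || return 1
  done
}
```

Without these labels, the DNS egress rule and the Prometheus ingress rule select no namespaces, and the corresponding traffic is dropped once the policy is in force.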
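Pod Security Policy
PodSecurityPolicy was deprecated in Kubernetes 1.21 and removed in 1.25, so this manifest applies only to clusters running 1.24 or earlier. Newer clusters use Pod Security Admission instead.
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: antimetal-agent
spec:
  privileged: false
  allowPrivilegeEscalation: false
  requiredDropCapabilities:
    - ALL
  volumes:
    - 'configMap'
    - 'secret'
    - 'hostPath'
  hostNetwork: false
  hostIPC: false
  hostPID: false
  runAsUser:
    rule: 'MustRunAsNonRoot'
  seLinux:
    rule: 'RunAsAny'
  # supplementalGroups is a required field of the PodSecurityPolicy spec
  supplementalGroups:
    rule: 'RunAsAny'
  fsGroup:
    rule: 'RunAsAny'
  readOnlyRootFilesystem: true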
Monitoring Integration
Prometheus ServiceMonitor
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: antimetal-agent
  namespace: antimetal-system
spec:
  selector:
    matchLabels:
      app: antimetal-agent
  endpoints:
    - port: metrics
      interval: 30s
      path: /metrics
      scheme: http
Verification
Check Deployment Status
# Check pod status
kubectl get pods -n antimetal-system
# Check logs
kubectl logs -n antimetal-system -l app=antimetal-agent
# Check leader election
kubectl get lease -n antimetal-system
# Verify RBAC
kubectl auth can-i --list --as system:serviceaccount:antimetal-system:antimetal-agent
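`kubectl get lease` lists the Lease objects but not, at a glance, which replica is the active one. The sketch below prints each Lease with its holder using a JSONPath template. The function name is hypothetical, and because the agent's Lease name is not stated in this guide, every Lease in the namespace is listed.

```shell
#!/bin/sh
# Print "<lease name><TAB><holder identity>" for every Lease in the agent namespace.
show_leader() {
  kubectl get lease --namespace antimetal-system \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.holderIdentity}{"\n"}{end}'
}
```

The holder identity usually contains the pod name of the current leader, which can then be passed to `kubectl logs`.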
Health Checks
# Port-forward for local access
kubectl port-forward -n antimetal-system deployment/antimetal-agent 8081:8081
# Check health
curl http://localhost:8081/healthz
curl http://localhost:8081/readyz
# Check metrics
kubectl port-forward -n antimetal-system deployment/antimetal-agent 8080:8080
curl http://localhost:8080/metrics
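The two health probes can be wrapped in one helper that reports each endpoint and returns a failure status if either is unhealthy. This is a sketch with a hypothetical function name; it assumes the port-forward to 8081 shown above is already running in another terminal.

```shell
#!/bin/sh
# Probe /healthz and /readyz; return non-zero if either endpoint fails.
check_health() {
  ch_base=${1:-http://localhost:8081}
  ch_failed=0
  for ch_path in /healthz /readyz; do
    # -f makes curl exit non-zero on HTTP status codes of 400 and above.
    if curl -fsS --max-time 5 "$ch_base$ch_path" >/dev/null; then
      echo "ok   $ch_path"
    else
      echo "FAIL $ch_path"
      ch_failed=1
    fi
  done
  return "$ch_failed"
}
```

Calling `check_health` with no argument targets http://localhost:8081; the exit status makes it usable in scripts and CI steps.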
Troubleshooting Deployment
Common Issues
- CrashLoopBackOff
  # Check logs
  kubectl logs -n antimetal-system -l app=antimetal-agent --previous
  # Common causes:
  # - Invalid API key
  # - Network connectivity issues
  # - Insufficient RBAC permissions
- ImagePullBackOff
  # Check events
  kubectl describe pod -n antimetal-system -l app=antimetal-agent
  # Solutions:
  # - Verify image name and tag
  # - Check image pull secrets if using private registry
- Pending State
  # Check events
  kubectl describe pod -n antimetal-system -l app=antimetal-agent
  # Common causes:
  # - Insufficient resources
  # - Node selector/affinity not matching
  # - Taints not tolerated
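The mapping from pod status to first diagnostic step can be captured in a small lookup. This is an illustrative sketch with a hypothetical function name; the input is the STATUS column of `kubectl get pods`, and the fallback for unlisted statuses is an assumption rather than guidance from this page.

```shell
#!/bin/sh
# Print the first command to run for a given pod status.
suggest_next_step() {
  case "$1" in
    CrashLoopBackOff)
      echo "kubectl logs -n antimetal-system -l app=antimetal-agent --previous" ;;
    ImagePullBackOff|ErrImagePull|Pending)
      echo "kubectl describe pod -n antimetal-system -l app=antimetal-agent" ;;
    Running)
      echo "kubectl logs -n antimetal-system -l app=antimetal-agent" ;;
    *)
      # Unlisted status: fall back to recent namespace events.
      echo "kubectl get events -n antimetal-system --sort-by=.lastTimestamp" ;;
  esac
}
```

For example, `suggest_next_step "$(kubectl get pods -n antimetal-system -l app=antimetal-agent --no-headers | awk 'NR==1 {print $3}')"` prints the command for the first agent pod.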
Debug Mode
Deploy with debug logging:
# debug-values.yaml
logLevel: debug
# Extra verbosity
extraArgs:
  - --log-verbosity=controller:2,intake:3
# Disable leader election for debugging
leaderElection:
  enabled: false
Upgrading
Helm Upgrade
# Check current version
helm list -n antimetal-system
# Check available versions
helm search repo antimetal/system-agent --versions
# Upgrade to specific version
helm upgrade antimetal-agent antimetal/system-agent \
  -n antimetal-system \
  --version 1.2.0 \
  -f values.yaml
# Verify upgrade
kubectl rollout status -n antimetal-system deployment/antimetal-agent
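The upgrade, verification, and rollback commands can be chained so that a failed rollout reverts automatically. The following is a sketch under assumptions: the function name and the 180-second timeout are choices made here, not values from the chart.

```shell
#!/bin/sh
# Upgrade to the given chart version; roll back if the rollout does not complete.
upgrade_or_rollback() {
  uor_version=$1
  helm upgrade antimetal-agent antimetal/system-agent \
    --namespace antimetal-system --version "$uor_version" -f values.yaml || return 1
  if ! kubectl rollout status deployment/antimetal-agent \
      --namespace antimetal-system --timeout=180s; then
    echo "rollout did not complete; rolling back" >&2
    # With no revision argument, helm rollback returns to the previous release.
    helm rollback antimetal-agent --namespace antimetal-system
    return 1
  fi
}
```

Run it as `upgrade_or_rollback 1.2.0`. Helm 3's `--atomic` flag on `helm upgrade` offers similar behavior built in; the explicit version here keeps each step visible.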
Zero-Downtime Upgrade
# Enable rolling updates
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1
    maxUnavailable: 0
Uninstalling
Helm
helm uninstall antimetal-agent -n antimetal-system
kubectl delete namespace antimetal-system
Manual
kubectl delete -f .
kubectl delete namespace antimetal-system
Next Steps
- Configuration Guide - Detailed configuration options
- Security Considerations - Security best practices
- Troubleshooting - Common issues and solutions
For support, contact [email protected] or visit GitHub Issues