Configuration Options - antimetal/system-agent GitHub Wiki

Configuration Options

⚠️ Work in Progress: This documentation is currently being developed and may be incomplete or subject to change.

Overview

This page is a reference for the configuration options of the Antimetal System Agent. Options can be set through command-line flags, environment variables, or a configuration file.

Configuration Methods

Priority Order

Configuration is applied in the following order (highest priority first):

  1. Command-line flags
  2. Environment variables
  3. Configuration file
  4. Default values
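
The priority order above amounts to a layered lookup. The following is a minimal sketch of that idea, not the agent's actual implementation; the function name `merge_sources` and the sample values are hypothetical.

```python
from collections import ChainMap


def merge_sources(flags, env, config_file, defaults):
    """Combine the four sources so that earlier ones shadow later ones.

    ChainMap searches its mappings left to right, which mirrors the
    documented order: flags, environment, file, defaults.
    """
    return ChainMap(flags, env, config_file, defaults)


# Hypothetical inputs for illustration only.
merged = merge_sources(
    flags={"log_level": "debug"},
    env={"log_level": "warn", "cluster_name": "production"},
    config_file={"collection_interval": "30s"},
    defaults={"log_level": "info", "collection_interval": "10s"},
)
print(merged["log_level"])            # value set by the flag
print(merged["collection_interval"])  # value set by the file
```

A lookup falls through to the defaults only when no higher-priority source sets the option.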

Command-Line Flags

antimetal-agent \
  --cluster-name=production \
  --api-endpoint=grpc.antimetal.com:443 \
  --collection-interval=30s \
  --log-level=info

Environment Variables

All flags can be set via environment variables. The variable name is the flag name in upper case, with hyphens replaced by underscores and the prefix ANTIMETAL_:

export ANTIMETAL_CLUSTER_NAME=production
export ANTIMETAL_API_ENDPOINT=grpc.antimetal.com:443
export ANTIMETAL_COLLECTION_INTERVAL=30s
export ANTIMETAL_LOG_LEVEL=info
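
The mapping from flag names to variable names can be sketched as follows. This is inferred from the examples above; the helper `env_var_name` is hypothetical, and how dotted options such as cpu.interval would be mapped is not covered here.

```python
def env_var_name(flag: str, prefix: str = "ANTIMETAL_") -> str:
    """Map a flag such as --cluster-name to ANTIMETAL_CLUSTER_NAME."""
    # Drop leading dashes, swap hyphens for underscores, then upper-case.
    return prefix + flag.lstrip("-").replace("-", "_").upper()


print(env_var_name("--cluster-name"))
print(env_var_name("--collection-interval"))
```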

Configuration File

A YAML configuration file can be specified with the --config flag:

cluster_name: production
api_endpoint: grpc.antimetal.com:443
collection_interval: 30s
log_level: info

Core Configuration

Basic Settings

| Option | Type | Default | Description |
|---|---|---|---|
| cluster_name | string | (required) | Kubernetes cluster identifier |
| api_endpoint | string | grpc.antimetal.com:443 | Antimetal API endpoint |
| api_key | string | (required) | API authentication key |
| disable_tls | bool | false | Disable TLS (development only) |

Logging Configuration

| Option | Type | Default | Description |
|---|---|---|---|
| log_level | string | info | Log level (debug, info, warn, error) |
| log_format | string | json | Log format (json, text) |
| log_output | string | stdout | Log output (stdout, stderr, file) |
| log_file | string | - | Log file path (when log_output is file) |

Collection Configuration

Kubernetes Resources

| Option | Type | Default | Description |
|---|---|---|---|
| kube_config | string | In-cluster | Path to kubeconfig file |
| kube_context | string | Current | Kubernetes context to use |
| namespaces | []string | All | Namespaces to monitor |
| exclude_namespaces | []string | - | Namespaces to exclude |
| resource_types | []string | All | Resource types to collect |

Performance Metrics

| Option | Type | Default | Description |
|---|---|---|---|
| enable_performance | bool | true | Enable performance collectors |
| collection_interval | duration | 10s | Metric collection interval |
| host_proc_path | string | /proc | Path to proc filesystem |
| host_sys_path | string | /sys | Path to sys filesystem |
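
Options of type duration take values such as 10s or 30s. The sketch below shows one way such strings could be parsed, assuming they follow a Go-style syntax with the units ms, s, m and h; the source does not state the accepted grammar, and `parse_duration` is a hypothetical helper.

```python
import re
from datetime import timedelta

# Assumed unit set; the agent may accept more or fewer units.
_UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}
_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|s|m|h)")


def parse_duration(text: str) -> timedelta:
    """Parse strings such as '30s' or '1m30s' into a timedelta."""
    pos = 0
    total = 0.0
    for match in _PART.finditer(text):
        # Reject any unparsed text between consecutive parts.
        if match.start() != pos:
            raise ValueError(f"invalid duration: {text!r}")
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    # Reject empty input and trailing garbage (for example a bare number).
    if pos == 0 or pos != len(text):
        raise ValueError(f"invalid duration: {text!r}")
    return timedelta(seconds=total)


print(parse_duration("30s").total_seconds())
```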

Collector-Specific Options

Dotted option names in the tables below correspond to nested keys in the YAML configuration file; for example, cpu.interval is written as interval under cpu.

CPU Collector

| Option | Type | Default | Description |
|---|---|---|---|
| cpu.enabled | bool | true | Enable CPU metrics |
| cpu.interval | duration | 10s | CPU collection interval |
| cpu.per_core | bool | true | Collect per-core stats |

Memory Collector

| Option | Type | Default | Description |
|---|---|---|---|
| memory.enabled | bool | true | Enable memory metrics |
| memory.interval | duration | 10s | Memory collection interval |
| memory.include_swap | bool | true | Include swap metrics |

Disk Collector

| Option | Type | Default | Description |
|---|---|---|---|
| disk.enabled | bool | true | Enable disk metrics |
| disk.interval | duration | 30s | Disk collection interval |
| disk.exclude_devices | []string | - | Devices to exclude |

Network Collector

| Option | Type | Default | Description |
|---|---|---|---|
| network.enabled | bool | true | Enable network metrics |
| network.interval | duration | 10s | Network collection interval |
| network.exclude_interfaces | []string | ["lo"] | Interfaces to exclude |

Process Collector

| Option | Type | Default | Description |
|---|---|---|---|
| process.enabled | bool | true | Enable process metrics |
| process.interval | duration | 30s | Process collection interval |
| process.top_count | int | 20 | Number of top processes |

Advanced Configuration

Performance Tuning

| Option | Type | Default | Description |
|---|---|---|---|
| worker_threads | int | 4 | Number of worker threads |
| batch_size | int | 1000 | Resource batch size |
| queue_size | int | 10000 | Internal queue size |
| send_timeout | duration | 30s | API send timeout |

Filtering Options

filters:
  # Label-based filtering
  labels:
    include:
      environment: ["production", "staging"]
      monitored: ["true"]
    exclude:
      temporary: ["true"]
  
  # Annotation-based filtering
  annotations:
    exclude:
      - "kubectl.kubernetes.io/last-applied-configuration"
  
  # Resource name patterns
  names:
    exclude:
      - "^test-.*"
      - ".*-debug$"
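
The source does not spell out how these rules combine, so the following is only a sketch under stated assumptions: a matching exclude label rejects a resource, every include key must be present with one of its listed values, and name patterns are regular expressions matched against the resource name. The function `passes_filters` is hypothetical, and the annotations section is left out because its semantics are not described.

```python
import re
from typing import Dict

# Mirrors the labels and names sections of the YAML above.
FILTERS = {
    "labels": {
        "include": {
            "environment": ["production", "staging"],
            "monitored": ["true"],
        },
        "exclude": {"temporary": ["true"]},
    },
    "names": {"exclude": ["^test-.*", ".*-debug$"]},
}


def passes_filters(name: str, labels: Dict[str, str], filters: dict) -> bool:
    """Return True if a resource survives the label and name rules."""
    label_rules = filters.get("labels", {})
    # Assumption: any matching exclude label rejects the resource.
    for key, values in label_rules.get("exclude", {}).items():
        if labels.get(key) in values:
            return False
    # Assumption: every include key must carry one of the listed values.
    for key, values in label_rules.get("include", {}).items():
        if labels.get(key) not in values:
            return False
    # Assumption: name patterns are regular expressions.
    for pattern in filters.get("names", {}).get("exclude", []):
        if re.search(pattern, name):
            return False
    return True


prod = {"environment": "production", "monitored": "true"}
print(passes_filters("api", prod, FILTERS))
print(passes_filters("test-api", prod, FILTERS))
```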

Cloud Provider Settings

AWS/EKS

| Option | Type | Default | Description |
|---|---|---|---|
| cloud.provider | string | Auto-detect | Cloud provider (aws, gcp, azure) |
| aws.region | string | Auto-detect | AWS region |
| aws.assume_role | string | - | IAM role to assume |

GCP/GKE

| Option | Type | Default | Description |
|---|---|---|---|
| gcp.project | string | Auto-detect | GCP project ID |
| gcp.zone | string | Auto-detect | GCP zone |

Azure/AKS

| Option | Type | Default | Description |
|---|---|---|---|
| azure.subscription_id | string | Auto-detect | Azure subscription |
| azure.resource_group | string | Auto-detect | Resource group |

Security Options

| Option | Type | Default | Description |
|---|---|---|---|
| tls.ca_cert | string | System CA | Custom CA certificate |
| tls.client_cert | string | - | Client certificate |
| tls.client_key | string | - | Client key |
| tls.server_name | string | - | TLS server name override |
| tls.insecure_skip_verify | bool | false | Skip TLS verification |

Configuration Examples

Minimal Configuration

cluster_name: my-cluster
api_key: ${ANTIMETAL_API_KEY}
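
The ${ANTIMETAL_API_KEY} value above reads as an environment variable reference. Whether the agent expands such references itself is not stated in the source; the sketch below only illustrates what that expansion could look like, using a hypothetical helper `expand_env`.

```python
import os
from string import Template


def expand_env(value: str, env=None) -> str:
    """Replace ${NAME} references with values from the environment.

    Raises KeyError if a referenced variable is not set, so a missing
    API key fails loudly instead of becoming an empty string.
    """
    env = os.environ if env is None else env
    return Template(value).substitute(env)


# Hypothetical environment for illustration only.
print(expand_env("${ANTIMETAL_API_KEY}", {"ANTIMETAL_API_KEY": "example-key"}))
```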

Production Configuration

cluster_name: production-eks-us-west-2
api_endpoint: grpc.antimetal.com:443
api_key: ${ANTIMETAL_API_KEY}

# Logging
log_level: info
log_format: json

# Collection
collection_interval: 30s
namespaces:
  - production
  - staging

# Performance
cpu:
  interval: 10s
memory:
  interval: 10s
disk:
  interval: 60s
  exclude_devices:
    - "^loop.*"
    - "^dm-.*"

# Filtering
filters:
  labels:
    include:
      monitored: ["true"]

# Namespace exclusion (top-level option, see Kubernetes Resources)
exclude_namespaces:
  - kube-system
  - kube-public

# Performance tuning
worker_threads: 8
batch_size: 2000
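
The example above writes options such as cpu.interval and disk.exclude_devices as nested YAML keys. A sketch of how a parsed configuration could be flattened into the dotted names used in the option tables follows; `flatten` is a hypothetical helper, and the input dictionary stands in for the result of a YAML parser.

```python
def flatten(config: dict, prefix: str = "") -> dict:
    """Turn nested mappings into a flat dict keyed by dotted option names."""
    flat = {}
    for key, value in config.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested sections such as cpu or disk.
            flat.update(flatten(value, dotted))
        else:
            flat[dotted] = value
    return flat


parsed = {
    "cpu": {"interval": "10s"},
    "disk": {"interval": "60s", "exclude_devices": ["^loop.*", "^dm-.*"]},
    "worker_threads": 8,
}
print(flatten(parsed))
```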

Development Configuration

cluster_name: local-development
api_endpoint: localhost:8443
disable_tls: true
api_key: dev-key

log_level: debug
log_format: text

# Reduced collection for development
collection_interval: 60s
process:
  enabled: false

Environment-Specific Settings

Container Environments

When the agent runs in a container, point the proc and sys paths at the locations where the host filesystems are mounted:

host_proc_path: /host/proc
host_sys_path: /host/sys

Kubernetes Deployment

For in-cluster deployment:

# Automatically uses in-cluster config
kube_config: ""  # Empty means in-cluster

# Service account must have appropriate RBAC

Configuration Validation

The agent validates configuration on startup:

  1. Required fields are present
  2. Data types are correct
  3. Paths exist and are accessible
  4. API connectivity is verified

Invalid configuration causes the agent to exit with an error.
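
The first three checks can be sketched as below. This is an illustration under assumptions, not the agent's validation code: the names `REQUIRED`, `EXPECTED_TYPES`, `PATH_OPTIONS` and `validate` are hypothetical, only a few options are covered, and the API connectivity check is omitted because it needs a network call.

```python
import os
from typing import Any, Dict, List

REQUIRED = ("cluster_name", "api_key")
EXPECTED_TYPES = {
    "cluster_name": str,
    "api_key": str,
    "disable_tls": bool,
    "worker_threads": int,
}
PATH_OPTIONS = ("host_proc_path", "host_sys_path")


def validate(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the config passed."""
    errors = []
    # 1. Required fields are present and non-empty.
    for name in REQUIRED:
        if not config.get(name):
            errors.append(f"{name} is required")
    # 2. Data types are correct (exact match, so True is not accepted as int).
    for name, expected in EXPECTED_TYPES.items():
        if name in config and type(config[name]) is not expected:
            errors.append(f"{name} must be of type {expected.__name__}")
    # 3. Configured paths exist and are readable.
    for name in PATH_OPTIONS:
        path = config.get(name)
        if path is not None and not os.access(path, os.R_OK):
            errors.append(f"{name} is not accessible: {path}")
    return errors


print(validate({"cluster_name": "my-cluster"}))
```

A caller would typically print the returned problems and exit with a non-zero status when the list is not empty.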

Dynamic Configuration

Some settings can be changed at runtime:

  • Log level (via signals)
  • Collection intervals (via API)
  • Filter rules (via API)
