Memory Technologies Platform Specific Parca - antimetal/system-agent GitHub Wiki

Parca

Overview

Parca is an open-source continuous profiling platform built by Polar Signals that provides eBPF-based profiling with infrastructure-wide visibility and cost efficiency. It systematically collects profiles (CPU, memory, I/O, and more) from running programs, stores them, and makes them queryable over time.

Key characteristics:

  • eBPF-based continuous profiling platform
  • Always-on profiling with 1-2% overhead
  • Historical data retention and analysis
  • Open-source alternative to commercial APM tools
  • Zero instrumentation required

Performance Characteristics

  • Overhead: typically under 1% with eBPF-based collection (1-2% worst case)
  • Accuracy: Medium to High
  • False Positives: Low
  • Production Ready: Yes
  • Platform: Kubernetes, Linux with eBPF
  • Sampling Rate: 19Hz (19 times per second per logical CPU)

The 19Hz sampling rate is deliberately a prime number: it avoids falling into lockstep with other periodic activity on the machine, unlike round frequencies such as 100Hz, which can coincide with periodic work in user code and bias the samples.
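
The arithmetic behind that choice can be sketched directly (simple math, not measured data):

```python
SAMPLE_HZ = 19  # samples per second per logical CPU

def is_prime(n: int) -> bool:
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

# A prime period shares no common cycle with round-numbered timers
# (100 Hz ticks, 10 ms loops), so samples don't land in lockstep.
period_ms = 1000 / SAMPLE_HZ              # ~52.6 ms between samples per CPU
samples_per_cpu_day = SAMPLE_HZ * 86_400  # 1,641,600 stack traces per CPU per day
```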

Architecture

Core Components

  1. Parca Agent: eBPF-based collection agent
  2. Parca Server: Time-series storage backend with query engine and API
  3. Web UI: Visualization interface for flame graphs and analysis
  4. FrostDB: Columnar storage backend optimized for profiling data

eBPF Implementation

  • BPF CO-RE (Compile Once – Run Everywhere) using libbpf
  • Pre-compiled BPF programs statically embedded in binary
  • No runtime compilation - no Clang/LLVM or kernel headers needed
  • Two main BPF maps:
    • Stack traces map: stack trace ID → memory addresses of executed code
    • Counts map: (PID, user-space stack ID, kernel-space stack ID) → observation count
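
To make the two-map design concrete, here is a toy in-memory model of those maps in Python. This is illustrative only; the real maps live in the kernel, use BPF helpers to capture and intern stack traces, and the PIDs, addresses, and IDs below are invented:

```python
from collections import defaultdict

# Toy model of the agent's two BPF maps (illustrative, not the kernel structures).
_ids = {}
stack_traces = {}          # stack trace ID -> memory addresses of executed code
counts = defaultdict(int)  # (pid, user stack ID, kernel stack ID) -> observation count

def stack_id(stack: tuple) -> int:
    """Stand-in for the kernel-assigned stack trace ID."""
    if stack not in _ids:
        _ids[stack] = len(_ids)
        stack_traces[_ids[stack]] = stack
    return _ids[stack]

def record_sample(pid: int, user_stack: tuple, kernel_stack: tuple) -> None:
    """What the BPF program conceptually does on each sampling tick."""
    counts[(pid, stack_id(user_stack), stack_id(kernel_stack))] += 1

# Two ticks hit the same code path in PID 1234:
record_sample(1234, (0x4005D0, 0x400810), (0xFFFF0001,))
record_sample(1234, (0x4005D0, 0x400810), (0xFFFF0001,))
```

Because identical stacks are interned once and only their counters grow, memory use stays bounded regardless of how long the profiler runs.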

Data Collection Process

  1. Attaches eBPF program to PERF_COUNT_SW_CPU_CLOCK event
  2. Kernel calls BPF program 19 times per second per CPU
  3. Captures user-space and kernel-space stacktraces
  4. Builds pprof-formatted profiles from the extracted data
  5. Stores metadata (function names, line numbers) separately from samples
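
Steps 4-5 can be sketched as follows: aggregated counts are folded into samples, while symbol metadata (function names) lives in a separate table that samples merely reference. All IDs, addresses, names, and the frame ordering below are invented for illustration:

```python
# Inputs as they might come out of the two BPF maps (made-up values).
counts = {(1234, 0, 1): 2}  # (pid, user stack ID, kernel stack ID) -> hits
stack_traces = {0: (0x4005D0, 0x400810), 1: (0xFFFF0001,)}
symbols = {0x4005D0: "main", 0x400810: "compute", 0xFFFF0001: "sys_read"}

def build_samples(counts, stack_traces):
    samples = []
    for (pid, uid, kid), n in counts.items():
        # Kernel frames stacked above user frames (ordering is illustrative).
        samples.append({"pid": pid,
                        "addresses": stack_traces[kid] + stack_traces[uid],
                        "value": n})
    return samples

def symbolize(sample, symbols):
    # Metadata lookup happens separately from sample storage.
    return [symbols.get(addr, hex(addr)) for addr in sample["addresses"]]

profile = build_samples(counts, stack_traces)
```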

System-Agent Implementation Plan

Deployment as DaemonSet

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: parca-agent
  namespace: parca
spec:
  selector:
    matchLabels:
      app: parca-agent
  template:
    metadata:
      labels:
        app: parca-agent
    spec:
      hostPID: true
      hostNetwork: true
      serviceAccountName: parca-agent
      securityContext:
        runAsUser: 0
      containers:
      - name: parca-agent
        image: ghcr.io/parca-dev/parca-agent:latest
        args:
          - --node=$(NODE_NAME)
          - --remote-store-address=parca.parca.svc:7070
          - --remote-store-insecure
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        securityContext:
          privileged: true
        volumeMounts:
        - name: proc
          mountPath: /host/proc
          readOnly: true
        - name: sys
          mountPath: /host/sys
          readOnly: true
      volumes:
      - name: proc
        hostPath:
          path: /proc
      - name: sys
        hostPath:
          path: /sys

Agent Configuration

  • Privileged Access: Requires root user or CAP_SYS_ADMIN for eBPF programs
  • Auto-Discovery: Automatically discovers containers in Kubernetes/systemd
  • Metadata Enrichment: Uses Kubernetes labels and annotations
  • Remote Store: Configure --remote-store-address for centralized collection

Storage Requirements

  • Meta Store: Function names, line numbers, file names (relatively small)
  • Sample Store: Profiling data using FrostDB columnar storage
  • Retention: Configurable based on storage capacity and requirements
  • Compression: Columnar storage optimizes repetitive data
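
Profiling data is highly repetitive (the same pod names, function names, and labels recur across millions of rows), which is why a columnar layout compresses it well. A toy run-length encoding shows the effect; FrostDB's actual encodings differ, and the column values below are invented:

```python
def rle(values):
    """Run-length encode a column: [(value, run_length), ...]."""
    out = []
    for v in values:
        if out and out[-1][0] == v:
            out[-1] = (v, out[-1][1] + 1)
        else:
            out.append((v, 1))
    return out

# A label column from 1,000 profile rows: the same pod names repeat.
pod_column = ["api-7f9c"] * 600 + ["worker-2b1d"] * 400
encoded = rle(pod_column)  # 1,000 cells collapse to 2 (value, count) pairs
```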

Integration Points

  • Kubernetes API: Service discovery and metadata collection
  • Prometheus: Compatible labeling mechanism and relabeling features
  • Grafana: Built-in data source support
  • pprof Endpoints: Ingests any pprof-formatted profiles

Production Deployments

Companies Using Parca

Parca is being adopted by organizations looking for cost-effective continuous profiling. Continuous profiling frequently reveals that 20-30% of an organization's compute is spent on easily optimized code paths.

Scale Considerations

  • Infrastructure-wide Profiling: Single agent deployment covers entire node
  • Language Support: C, C++, Rust, Go, and other compiled languages
  • Kubernetes Integration: Works seamlessly with container orchestration
  • Multi-tenancy: Supports labeling for different teams/applications

Success Stories

  • Cost Optimization: Identifies resource waste and optimization opportunities
  • Performance Improvement: Statistical significance in performance comparisons
  • Incident Analysis: Historical data enables post-incident analysis
  • Development Velocity: No instrumentation overhead accelerates adoption

Performance Impact Studies

eBPF-based profiling typically adds less than 1% overhead in production environments, making it suitable for always-on continuous profiling scenarios.
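
As a rough sanity check of that figure, the arithmetic below estimates the in-kernel sampling cost. The 2 microseconds per stack capture is an assumed, illustrative number, and real-world overhead also includes user-space work such as symbolization and upload:

```python
SAMPLE_HZ = 19            # samples per second per logical CPU
COST_US_PER_SAMPLE = 2.0  # assumed in-kernel cost per stack capture (hypothetical)

# Fraction of each CPU-second spent capturing stacks:
overhead = SAMPLE_HZ * COST_US_PER_SAMPLE / 1_000_000
overhead_pct = overhead * 100  # well under the 1% budget in this toy estimate
```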

Installation

Kubernetes Deployment

Quick Start

kubectl apply -f https://github.com/parca-dev/parca-agent/releases/download/v0.39.3/kubernetes-manifest.yaml

Helm Charts

helm repo add parca https://parca-dev.github.io/helm-charts/
helm repo update parca
helm install my-parca parca/parca

Standalone Installation

# Download and run Parca server
./bin/parca --config-path="parca.yaml"
# Web UI available on http://localhost:7070/

Configuration Options

Basic Configuration (parca.yaml)

object_storage:
  bucket:
    type: "FILESYSTEM"
    config:
      directory: "./data"

debug_info:
  bucket:
    type: "FILESYSTEM"  
    config:
      directory: "./data"

scrape_configs:
  - job_name: "default"
    scrape_interval: "10s"
    static_configs:
      - targets: ["127.0.0.1:7070"]

Agent Configuration

remote_store:
  address: "parca.parca.svc:7070"
  bearer_token: "your-token"
  insecure: false

external_labels:
  region: "us-west-2"
  datacenter: "dc1"

Storage Backends

  • Filesystem: Local storage for development/testing
  • Object Storage: S3-compatible storage for production
  • Cloud Storage: GCS, Azure Blob Storage support
  • Custom Backends: Configurable through object storage interface

Features

CPU Profiling

  • Always-on Sampling: 19Hz sampling rate with minimal overhead
  • Stack Trace Collection: Both user-space and kernel-space
  • Multi-language Support: Works with compiled languages (C, C++, Go, Rust)
  • Automatic Discovery: Zero-instrumentation target discovery

Memory Profiling

  • Heap Profiling: Current memory allocation tracking
  • Memory Leak Detection: Historical analysis for leak identification
  • Allocation Patterns: Understanding memory usage over time
  • OOM Analysis: Integration with OOM kill events

Flame Graphs

  • Icicle Graphs: Upside-down flame graphs for better code execution visualization
  • Interactive Visualization: Zoom, filter, and drill-down capabilities
  • Time Range Selection: View profiles over specific time periods
  • Differential Analysis: Compare profiles between time periods or deployments
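
The core data structure behind a flame (or icicle) graph is a trie of stack frames in which each node's value is the sum of sample counts beneath it. A minimal sketch, with made-up frame names and counts:

```python
from collections import defaultdict

def make_node():
    return {"value": 0, "children": defaultdict(make_node)}

def add_stack(root, frames, count):
    """Fold one root-to-leaf stack into the flame graph trie."""
    node = root
    node["value"] += count
    for frame in frames:
        node = node["children"][frame]
        node["value"] += count

root = make_node()
add_stack(root, ["main", "handler", "parse"], 7)
add_stack(root, ["main", "handler", "render"], 3)

# Each box's width in the rendered graph is its value / root value:
parse = root["children"]["main"]["children"]["handler"]["children"]["parse"]
parse_width = parse["value"] / root["value"]
```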

Differential Analysis

  • Before/After Comparisons: Compare performance between deployments
  • Statistical Significance: Confidence in optimization impact
  • Regional Comparisons: Compare performance across different regions/environments
  • Version Analysis: Track performance changes across application versions

Historical Comparisons

  • Time-series Storage: Long-term retention of profiling data
  • Trend Analysis: Identify performance patterns over time
  • Incident Investigation: Analyze past incidents with historical data
  • Baseline Establishment: Create performance baselines for alerting
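
One simple way to turn historical profile data into an alerting baseline is a mean-plus-sigma band over past sample counts. The history values below are invented, and real deployments would pull them from the query API:

```python
from statistics import mean, stdev

def baseline(history):
    """Mean and sample standard deviation of historical per-interval values."""
    return mean(history), stdev(history)

def is_anomalous(value, history, sigma=3.0):
    """Flag a value more than `sigma` standard deviations from the baseline."""
    m, s = baseline(history)
    return abs(value - m) > sigma * s

# Made-up daily CPU sample totals for one service:
history = [100, 104, 98, 101, 99, 103, 97, 102]
```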

Code Examples

Deployment Manifests

Complete Kubernetes Deployment

# Namespace
apiVersion: v1
kind: Namespace
metadata:
  name: parca
---
# ServiceAccount
apiVersion: v1
kind: ServiceAccount
metadata:
  name: parca-agent
  namespace: parca
---
# ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: parca-agent
rules:
- apiGroups: [""]
  resources: ["nodes", "pods", "services", "endpoints"]
  verbs: ["get", "list", "watch"]
---
# ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: parca-agent
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: parca-agent
subjects:
- kind: ServiceAccount
  name: parca-agent
  namespace: parca
---
# Parca Server Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: parca
  namespace: parca
spec:
  replicas: 1
  selector:
    matchLabels:
      app: parca
  template:
    metadata:
      labels:
        app: parca
    spec:
      containers:
      - name: parca
        image: ghcr.io/parca-dev/parca:latest
        ports:
        - containerPort: 7070
        volumeMounts:
        - name: data
          mountPath: /data
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: parca-data
---
# Service
apiVersion: v1
kind: Service
metadata:
  name: parca
  namespace: parca
spec:
  selector:
    app: parca
  ports:
  - port: 7070
    targetPort: 7070

Query Examples

Basic Profile Query

# Query CPU profiles for the last hour
curl -G "http://localhost:7070/query_range" \
  --data-urlencode 'query=process_cpu:samples:count:cpu:nanoseconds{}' \
  --data-urlencode 'start=2024-01-01T10:00:00Z' \
  --data-urlencode 'end=2024-01-01T11:00:00Z'

Filtered Query with Labels

# Query profiles for specific application
curl -G "http://localhost:7070/query_range" \
  --data-urlencode 'query=process_cpu:samples:count:cpu:nanoseconds{job="myapp"}' \
  --data-urlencode 'start=2024-01-01T10:00:00Z' \
  --data-urlencode 'end=2024-01-01T11:00:00Z'

API Integration

Go Client Example

package main

import (
    "context"
    "fmt"
    "log"
    "time"

    pb "github.com/parca-dev/parca/gen/proto/go/parca/query/v1alpha1"
    "google.golang.org/grpc"
    "google.golang.org/grpc/credentials/insecure"
    "google.golang.org/protobuf/types/known/timestamppb"
)

func main() {
    // Dial the Parca server's gRPC endpoint.
    conn, err := grpc.Dial("localhost:7070",
        grpc.WithTransportCredentials(insecure.NewCredentials()))
    if err != nil {
        log.Fatalf("dial: %v", err)
    }
    defer conn.Close()

    client := pb.NewQueryServiceClient(conn)

    resp, err := client.QueryRange(context.Background(), &pb.QueryRangeRequest{
        Query: "process_cpu:samples:count:cpu:nanoseconds{}",
        Start: timestamppb.New(time.Now().Add(-1 * time.Hour)),
        End:   timestamppb.New(time.Now()),
    })
    if err != nil {
        log.Fatalf("query: %v", err)
    }

    // Process response
    fmt.Printf("Series: %d\n", len(resp.Series))
}

Automated Analysis

Performance Regression Detection

import requests
from datetime import datetime, timedelta

class ParcaAnalyzer:
    def __init__(self, parca_url):
        self.base_url = parca_url
    
    def compare_performance(self, query, baseline_start, baseline_end, 
                          current_start, current_end):
        baseline = self.query_range(query, baseline_start, baseline_end)
        current = self.query_range(query, current_start, current_end)
        
        # Simple comparison logic
        baseline_total = sum(s['value'] for s in baseline['series'])
        current_total = sum(s['value'] for s in current['series'])
        
        regression_pct = ((current_total - baseline_total) / baseline_total) * 100
        
        return {
            'regression_percentage': regression_pct,
            'baseline_total': baseline_total,
            'current_total': current_total,
            'is_regression': regression_pct > 5.0  # 5% threshold
        }
    
    def query_range(self, query, start, end):
        response = requests.get(f"{self.base_url}/query_range", params={
            'query': query,
            'start': start.isoformat(),
            'end': end.isoformat()
        })
        return response.json()

# Usage
analyzer = ParcaAnalyzer("http://localhost:7070")
result = analyzer.compare_performance(
    "process_cpu:samples:count:cpu:nanoseconds{job='myapp'}",
    datetime.now() - timedelta(days=7),  # Baseline: week ago
    datetime.now() - timedelta(days=6),
    datetime.now() - timedelta(hours=1),  # Current: last hour
    datetime.now()
)

print(f"Performance regression: {result['is_regression']}")
print(f"Change: {result['regression_percentage']:.2f}%")

Monitoring & Alerting

Memory Growth Detection

Grafana Alert Rule

alert:
  name: "Memory Growth Detection"
  frequency: 10m
  conditions:
    - query: A
      queryType: parca
      refId: A
      model:
        expr: 'increase(go_memstats_heap_inuse_bytes{}[1h])'
        interval: 1m
    - reducer: last
      type: reduce
    - evaluator:
        params: [50000000]  # 50MB increase threshold
        type: gt
      type: threshold

Baseline Establishment

Automated Baseline Calculation

#!/bin/bash
# Calculate performance baseline

QUERY="process_cpu:samples:count:cpu:nanoseconds{job='production'}"
START_TIME=$(date -d "7 days ago" --iso-8601)
END_TIME=$(date -d "yesterday" --iso-8601)

# Calculate baseline metrics
BASELINE=$(curl -s -G "http://localhost:7070/query_range" \
  --data-urlencode "query=$QUERY" \
  --data-urlencode "start=$START_TIME" \
  --data-urlencode "end=$END_TIME" | \
  jq '.series[] | .samples[] | .value' | \
  awk '{sum+=$1; count++} END {print sum/count}')

echo "Baseline CPU usage: $BASELINE"

# Store baseline for alerting
echo "$BASELINE" > /var/lib/parca/baseline.txt

Alert Configuration

PrometheusRule for Parca Metrics

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: parca-alerts
  namespace: parca
spec:
  groups:
  - name: parca.rules
    rules:
    - alert: HighCPUUsageDetected
      expr: |
        (
          avg_over_time(parca_cpu_profile_samples_total[5m]) 
          / avg_over_time(parca_cpu_profile_samples_total[1h] offset 1d)
        ) > 1.5
      for: 2m
      labels:
        severity: warning
      annotations:
        summary: "High CPU usage detected"
        description: "CPU usage is 50% higher than daily average"
    
    - alert: MemoryLeakDetected
      expr: |
        increase(parca_memory_profile_heap_bytes[1h]) > 100000000
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: "Potential memory leak detected"
        description: "Memory usage increased by more than 100MB in 1 hour"

Integration with Prometheus

ServiceMonitor Configuration

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: parca
  namespace: parca
spec:
  selector:
    matchLabels:
      app: parca
  endpoints:
  - port: http
    interval: 30s
    path: /metrics

Comparison with Alternatives

Parca vs Pixie

| Feature | Parca | Pixie |
| --- | --- | --- |
| Scope | Dedicated continuous profiling | Full observability platform |
| Storage | Long-term retention with FrostDB | Short-term retention (24-48h) |
| Language Support | Compiled languages (C, C++, Go, Rust) | Broader language support |
| Deployment | Lightweight agent | More complex setup |
| Cost | Open-source, self-hosted | CNCF project, more resource intensive |
| Focus | Performance profiling | Comprehensive observability |

Key Differences:

  • Storage Duration: Parca provides long-term storage suitable for historical analysis, while Pixie focuses on real-time observability with shorter retention
  • Complexity: Parca has simpler setup focused on profiling, Pixie offers broader observability with higher complexity
  • Use Cases: Choose Parca for dedicated continuous profiling with historical analysis; choose Pixie for comprehensive Kubernetes observability

Parca vs Pyroscope

| Feature | Parca | Pyroscope (Grafana) |
| --- | --- | --- |
| Technology | eBPF-focused | Multi-technology approach |
| Language Support | Compiled languages | Broader (Go, Python, Ruby, Java, .NET, PHP, Rust) |
| Query Language | PromQL-like | FlameQL |
| Storage | FrostDB columnar | Custom storage engine |
| Integration | Grafana data source | Native Grafana integration |
| Company | Polar Signals | Grafana Labs (acquired) |

Key Differences:

  • Language Coverage: Pyroscope supports more interpreted languages with dedicated agents
  • Query Interface: Pyroscope has FlameQL vs Parca's PromQL-inspired queries
  • Ecosystem: Pyroscope benefits from Grafana Labs ecosystem integration
  • Technology Approach: Parca focuses on eBPF efficiency, Pyroscope uses language-specific agents

Parca vs Commercial APM

| Feature | Parca | Commercial APM |
| --- | --- | --- |
| Cost | Open-source, self-hosted | Subscription-based |
| Data Control | Full control, on-premises | Vendor-hosted |
| Customization | Highly customizable | Limited customization |
| Features | Profiling-focused | Full APM suite |
| Overhead | <1% with eBPF | Varies by vendor |
| Lock-in | No vendor lock-in | Potential vendor lock-in |

Key Advantages of Parca:

  • Cost Effectiveness: No per-agent or data volume costs
  • Data Sovereignty: Keep sensitive profiling data in-house
  • Flexibility: Customize storage, retention, and analysis
  • Open Standards: Uses standard pprof format for interoperability

Repository & Documentation

GitHub Repository

  • Server: https://github.com/parca-dev/parca
  • Agent: https://github.com/parca-dev/parca-agent

Documentation Site

  • https://www.parca.dev

Community Resources

  • GitHub Discussions and the Parca community Discord

Additional Tools

  • OOMProf: https://github.com/parca-dev/oomprof - eBPF OOM Memory Profiler
  • Grafana Plugin: Native integration for visualization
  • CLI Tools: Command-line utilities for profile analysis

Security Considerations

  • Privileged Access: Requires root or CAP_SYS_ADMIN for eBPF
  • Network Security: Supports TLS and bearer token authentication
  • Data Privacy: Self-hosted deployment keeps profiling data internal
  • Resource Limits: Configure appropriate resource limits in Kubernetes
  • RBAC Integration: Works with Kubernetes role-based access control

Future Roadmap

  • Enhanced Language Support: Expanding support for interpreted languages
  • Advanced Analytics: ML-based anomaly detection and optimization suggestions
  • Multi-cluster Support: Enhanced federation and multi-cluster profiling
  • Integration Improvements: Deeper integration with observability ecosystem
  • Performance Optimizations: Continued focus on minimal overhead profiling