Memory Technologies Platform Specific Parca - antimetal/system-agent GitHub Wiki

Parca

Overview

Parca is an open-source continuous profiling platform built by Polar Signals that provides eBPF-based profiling with infrastructure-wide visibility and cost efficiency. It systematically collects profiles (CPU, memory, I/O, and more) from running programs, stores them, and makes them queryable over time.

Key characteristics:

  • eBPF-based continuous profiling platform
  • Always-on profiling with 1-2% overhead
  • Historical data retention and analysis
  • Open-source alternative to commercial APM tools
  • Zero instrumentation required

Performance Characteristics

  • Overhead: typically under 1% with eBPF-based collection (1-2% worst case)
  • Accuracy: Medium to High
  • False Positives: Low
  • Production Ready: Yes
  • Platform: Kubernetes, Linux with eBPF
  • Sampling Rate: 19Hz (19 times per second per logical CPU)

The 19Hz sampling rate is deliberately a prime number: it avoids falling into lockstep with other periodic activity on the machine, unlike round frequencies such as 100Hz, which can coincide with periodic work in user code and bias the samples.
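
The arithmetic behind that choice can be sketched directly (simple math, not measured data):

```python
SAMPLE_HZ = 19  # samples per second per logical CPU

def is_prime(n: int) -> bool:
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

# A prime period shares no common cycle with round-numbered timers
# (100 Hz ticks, 10 ms loops), so samples don't land in lockstep.
period_ms = 1000 / SAMPLE_HZ              # ~52.6 ms between samples per CPU
samples_per_cpu_day = SAMPLE_HZ * 86_400  # 1,641,600 stack traces per CPU per day
```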

Architecture

Core Components

  1. Parca Agent: eBPF-based collection agent
  2. Parca Server: Time-series storage backend with query engine and API
  3. Web UI: Visualization interface for flame graphs and analysis
  4. FrostDB: Columnar storage backend optimized for profiling data

eBPF Implementation

  • BPF CO-RE (Compile Once – Run Everywhere) using libbpf
  • Pre-compiled BPF programs statically embedded in binary
  • No runtime compilation - no Clang/LLVM or kernel headers needed
  • Two main BPF maps:
    • Stack traces map: stack trace ID → memory addresses of executed code
    • Counts map: (PID, user-space stack ID, kernel-space stack ID) → observation count
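
To make the two-map design concrete, here is a toy in-memory model of those maps in Python. This is illustrative only; the real maps live in the kernel, use BPF helpers to capture and intern stack traces, and the PIDs, addresses, and IDs below are invented:

```python
from collections import defaultdict

# Toy model of the agent's two BPF maps (illustrative, not the kernel structures).
_ids = {}
stack_traces = {}          # stack trace ID -> memory addresses of executed code
counts = defaultdict(int)  # (pid, user stack ID, kernel stack ID) -> observation count

def stack_id(stack: tuple) -> int:
    """Stand-in for the kernel-assigned stack trace ID."""
    if stack not in _ids:
        _ids[stack] = len(_ids)
        stack_traces[_ids[stack]] = stack
    return _ids[stack]

def record_sample(pid: int, user_stack: tuple, kernel_stack: tuple) -> None:
    """What the BPF program conceptually does on each sampling tick."""
    counts[(pid, stack_id(user_stack), stack_id(kernel_stack))] += 1

# Two ticks hit the same code path in PID 1234:
record_sample(1234, (0x4005D0, 0x400810), (0xFFFF0001,))
record_sample(1234, (0x4005D0, 0x400810), (0xFFFF0001,))
```

Because identical stacks are interned once and only their counters grow, memory use stays bounded regardless of how long the profiler runs.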

Data Collection Process

  1. Attaches eBPF program to PERF_COUNT_SW_CPU_CLOCK event
  2. Kernel calls BPF program 19 times per second per CPU
  3. Captures user-space and kernel-space stacktraces
  4. Builds pprof-formatted profiles from the extracted data
  5. Stores metadata (function names, line numbers) separately from samples
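
Steps 4-5 can be sketched as follows: aggregated counts are folded into samples, while symbol metadata (function names) lives in a separate table that samples merely reference. All IDs, addresses, names, and the frame ordering below are invented for illustration:

```python
# Inputs as they might come out of the two BPF maps (made-up values).
counts = {(1234, 0, 1): 2}  # (pid, user stack ID, kernel stack ID) -> hits
stack_traces = {0: (0x4005D0, 0x400810), 1: (0xFFFF0001,)}
symbols = {0x4005D0: "main", 0x400810: "compute", 0xFFFF0001: "sys_read"}

def build_samples(counts, stack_traces):
    samples = []
    for (pid, uid, kid), n in counts.items():
        # Kernel frames stacked above user frames (ordering is illustrative).
        samples.append({"pid": pid,
                        "addresses": stack_traces[kid] + stack_traces[uid],
                        "value": n})
    return samples

def symbolize(sample, symbols):
    # Metadata lookup happens separately from sample storage.
    return [symbols.get(addr, hex(addr)) for addr in sample["addresses"]]

profile = build_samples(counts, stack_traces)
```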

System-Agent Implementation Plan

Deployment as DaemonSet

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: parca-agent
  namespace: parca
spec:
  selector:
    matchLabels:
      app: parca-agent
  template:
    metadata:
      labels:
        app: parca-agent
    spec:
      hostPID: true
      hostNetwork: true
      serviceAccountName: parca-agent
      securityContext:
        runAsUser: 0
      containers:
      - name: parca-agent
        image: ghcr.io/parca-dev/parca-agent:latest
        args:
          - --node=$(NODE_NAME)
          - --remote-store-address=parca.parca.svc:7070
          - --remote-store-insecure
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        securityContext:
          privileged: true
        volumeMounts:
        - name: proc
          mountPath: /host/proc
          readOnly: true
        - name: sys
          mountPath: /host/sys
          readOnly: true
      volumes:
      - name: proc
        hostPath:
          path: /proc
      - name: sys
        hostPath:
          path: /sys

Agent Configuration

  • Privileged Access: Requires root user or CAP_SYS_ADMIN for eBPF programs
  • Auto-Discovery: Automatically discovers containers in Kubernetes/systemd
  • Metadata Enrichment: Uses Kubernetes labels and annotations
  • Remote Store: Configure --remote-store-address for centralized collection

Storage Requirements

  • Meta Store: Function names, line numbers, file names (relatively small)
  • Sample Store: Profiling data using FrostDB columnar storage
  • Retention: Configurable based on storage capacity and requirements
  • Compression: Columnar storage optimizes repetitive data
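
Profiling data is highly repetitive (the same pod names, function names, and labels recur across millions of rows), which is why a columnar layout compresses it well. A toy run-length encoding shows the effect; FrostDB's actual encodings differ, and the column values below are invented:

```python
def rle(values):
    """Run-length encode a column: [(value, run_length), ...]."""
    out = []
    for v in values:
        if out and out[-1][0] == v:
            out[-1] = (v, out[-1][1] + 1)
        else:
            out.append((v, 1))
    return out

# A label column from 1,000 profile rows: the same pod names repeat.
pod_column = ["api-7f9c"] * 600 + ["worker-2b1d"] * 400
encoded = rle(pod_column)  # 1,000 cells collapse to 2 (value, count) pairs
```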

Integration Points

  • Kubernetes API: Service discovery and metadata collection
  • Prometheus: Compatible labeling mechanism and relabeling features
  • Grafana: Built-in data source support
  • pprof Endpoints: Ingests any pprof-formatted profiles

Production Deployments

Companies Using Parca

Parca is being adopted by organizations looking for cost-effective continuous profiling. Continuous profiling frequently reveals that 20-30% of an organization's compute is spent on easily optimized code paths.

Scale Considerations

  • Infrastructure-wide Profiling: Single agent deployment covers entire node
  • Language Support: C, C++, Rust, Go, and other compiled languages
  • Kubernetes Integration: Works seamlessly with container orchestration
  • Multi-tenancy: Supports labeling for different teams/applications

Success Stories

  • Cost Optimization: Identifies resource waste and optimization opportunities
  • Performance Improvement: Statistical significance in performance comparisons
  • Incident Analysis: Historical data enables post-incident analysis
  • Development Velocity: No instrumentation overhead accelerates adoption

Performance Impact Studies

eBPF-based profiling typically adds less than 1% overhead in production environments, making it suitable for always-on continuous profiling scenarios.
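
As a rough sanity check of that figure, the arithmetic below estimates the in-kernel sampling cost. The 2 microseconds per stack capture is an assumed, illustrative number, and real-world overhead also includes user-space work such as symbolization and upload:

```python
SAMPLE_HZ = 19            # samples per second per logical CPU
COST_US_PER_SAMPLE = 2.0  # assumed in-kernel cost per stack capture (hypothetical)

# Fraction of each CPU-second spent capturing stacks:
overhead = SAMPLE_HZ * COST_US_PER_SAMPLE / 1_000_000
overhead_pct = overhead * 100  # well under the 1% budget in this toy estimate
```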

Installation

Kubernetes Deployment

Quick Start

kubectl apply -f https://github.com/parca-dev/parca-agent/releases/download/v0.39.3/kubernetes-manifest.yaml

Helm Charts

helm repo add parca https://parca-dev.github.io/helm-charts/
helm repo update parca
helm install my-parca parca/parca

Standalone Installation

# Download and run Parca server
./bin/parca --config-path="parca.yaml"
# Web UI available on http://localhost:7070/

Configuration Options

Basic Configuration (parca.yaml)

object_storage:
  bucket:
    type: "FILESYSTEM"
    config:
      directory: "./data"

debug_info:
  bucket:
    type: "FILESYSTEM"  
    config:
      directory: "./data"

scrape_configs:
  - job_name: "default"
    scrape_interval: "10s"
    static_configs:
      - targets: ["127.0.0.1:7070"]

Agent Configuration

remote_store:
  address: "parca.parca.svc:7070"
  bearer_token: "your-token"
  insecure: false

external_labels:
  region: "us-west-2"
  datacenter: "dc1"

Storage Backends

  • Filesystem: Local storage for development/testing
  • Object Storage: S3-compatible storage for production
  • Cloud Storage: GCS, Azure Blob Storage support
  • Custom Backends: Configurable through object storage interface

Features

CPU Profiling

  • Always-on Sampling: 19Hz sampling rate with minimal overhead
  • Stack Trace Collection: Both user-space and kernel-space
  • Multi-language Support: Works with compiled languages (C, C++, Go, Rust)
  • Automatic Discovery: Zero-instrumentation target discovery

Memory Profiling

  • Heap Profiling: Current memory allocation tracking
  • Memory Leak Detection: Historical analysis for leak identification
  • Allocation Patterns: Understanding memory usage over time
  • OOM Analysis: Integration with OOM kill events

Flame Graphs

  • Icicle Graphs: Upside-down flame graphs for better code execution visualization
  • Interactive Visualization: Zoom, filter, and drill-down capabilities
  • Time Range Selection: View profiles over specific time periods
  • Differential Analysis: Compare profiles between time periods or deployments
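
The core data structure behind a flame (or icicle) graph is a trie of stack frames in which each node's value is the sum of sample counts beneath it. A minimal sketch, with made-up frame names and counts:

```python
from collections import defaultdict

def make_node():
    return {"value": 0, "children": defaultdict(make_node)}

def add_stack(root, frames, count):
    """Fold one root-to-leaf stack into the flame graph trie."""
    node = root
    node["value"] += count
    for frame in frames:
        node = node["children"][frame]
        node["value"] += count

root = make_node()
add_stack(root, ["main", "handler", "parse"], 7)
add_stack(root, ["main", "handler", "render"], 3)

# Each box's width in the rendered graph is its value / root value:
parse = root["children"]["main"]["children"]["handler"]["children"]["parse"]
parse_width = parse["value"] / root["value"]
```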

Differential Analysis

  • Before/After Comparisons: Compare performance between deployments
  • Statistical Significance: Confidence in optimization impact
  • Regional Comparisons: Compare performance across different regions/environments
  • Version Analysis: Track performance changes across application versions

Historical Comparisons

  • Time-series Storage: Long-term retention of profiling data
  • Trend Analysis: Identify performance patterns over time
  • Incident Investigation: Analyze past incidents with historical data
  • Baseline Establishment: Create performance baselines for alerting
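
One simple way to turn historical profile data into an alerting baseline is a mean-plus-sigma band over past sample counts. The history values below are invented, and real deployments would pull them from the query API:

```python
from statistics import mean, stdev

def baseline(history):
    """Mean and sample standard deviation of historical per-interval values."""
    return mean(history), stdev(history)

def is_anomalous(value, history, sigma=3.0):
    """Flag a value more than `sigma` standard deviations from the baseline."""
    m, s = baseline(history)
    return abs(value - m) > sigma * s

# Made-up daily CPU sample totals for one service:
history = [100, 104, 98, 101, 99, 103, 97, 102]
```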

Code Examples

Deployment Manifests

Complete Kubernetes Deployment

# Namespace
apiVersion: v1
kind: Namespace
metadata:
  name: parca
---
# ServiceAccount
apiVersion: v1
kind: ServiceAccount
metadata:
  name: parca-agent
  namespace: parca
---
# ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: parca-agent
rules:
- apiGroups: [""]
  resources: ["nodes", "pods", "services", "endpoints"]
  verbs: ["get", "list", "watch"]
---
# ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: parca-agent
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: parca-agent
subjects:
- kind: ServiceAccount
  name: parca-agent
  namespace: parca
---
# Parca Server Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: parca
  namespace: parca
spec:
  replicas: 1
  selector:
    matchLabels:
      app: parca
  template:
    metadata:
      labels:
        app: parca
    spec:
      containers:
      - name: parca
        image: ghcr.io/parca-dev/parca:latest
        ports:
        - containerPort: 7070
        volumeMounts:
        - name: data
          mountPath: /data
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: parca-data
---
# Service
apiVersion: v1
kind: Service
metadata:
  name: parca
  namespace: parca
spec:
  selector:
    app: parca
  ports:
  - port: 7070
    targetPort: 7070

Query Examples

Basic Profile Query

# Query CPU profiles for the last hour
curl -G "http://localhost:7070/query_range" \
  --data-urlencode 'query=process_cpu:samples:count:cpu:nanoseconds{}' \
  --data-urlencode 'start=2024-01-01T10:00:00Z' \
  --data-urlencode 'end=2024-01-01T11:00:00Z'

Filtered Query with Labels

# Query profiles for specific application
curl -G "http://localhost:7070/query_range" \
  --data-urlencode 'query=process_cpu:samples:count:cpu:nanoseconds{job="myapp"}' \
  --data-urlencode 'start=2024-01-01T10:00:00Z' \
  --data-urlencode 'end=2024-01-01T11:00:00Z'

API Integration

Go Client Example

package main

import (
    "context"
    "fmt"
    "log"
    "time"

    pb "github.com/parca-dev/parca/gen/proto/go/parca/query/v1alpha1"
    "google.golang.org/grpc"
    "google.golang.org/grpc/credentials/insecure"
    "google.golang.org/protobuf/types/known/timestamppb"
)

func main() {
    // Dial the Parca server's gRPC endpoint.
    conn, err := grpc.Dial("localhost:7070",
        grpc.WithTransportCredentials(insecure.NewCredentials()))
    if err != nil {
        log.Fatalf("dial: %v", err)
    }
    defer conn.Close()

    client := pb.NewQueryServiceClient(conn)

    resp, err := client.QueryRange(context.Background(), &pb.QueryRangeRequest{
        Query: "process_cpu:samples:count:cpu:nanoseconds{}",
        Start: timestamppb.New(time.Now().Add(-1 * time.Hour)),
        End:   timestamppb.New(time.Now()),
    })
    if err != nil {
        log.Fatalf("query: %v", err)
    }

    // Process response
    fmt.Printf("Series: %d\n", len(resp.Series))
}

Automated Analysis

Performance Regression Detection

import requests
from datetime import datetime, timedelta

class ParcaAnalyzer:
    def __init__(self, parca_url):
        self.base_url = parca_url
    
    def compare_performance(self, query, baseline_start, baseline_end, 
                          current_start, current_end):
        baseline = self.query_range(query, baseline_start, baseline_end)
        current = self.query_range(query, current_start, current_end)
        
        # Simple comparison logic
        baseline_total = sum(s['value'] for s in baseline['series'])
        current_total = sum(s['value'] for s in current['series'])
        
        regression_pct = ((current_total - baseline_total) / baseline_total) * 100
        
        return {
            'regression_percentage': regression_pct,
            'baseline_total': baseline_total,
            'current_total': current_total,
            'is_regression': regression_pct > 5.0  # 5% threshold
        }
    
    def query_range(self, query, start, end):
        response = requests.get(f"{self.base_url}/query_range", params={
            'query': query,
            'start': start.isoformat(),
            'end': end.isoformat()
        })
        return response.json()

# Usage
analyzer = ParcaAnalyzer("http://localhost:7070")
result = analyzer.compare_performance(
    "process_cpu:samples:count:cpu:nanoseconds{job='myapp'}",
    datetime.now() - timedelta(days=7),  # Baseline: week ago
    datetime.now() - timedelta(days=6),
    datetime.now() - timedelta(hours=1),  # Current: last hour
    datetime.now()
)

print(f"Performance regression: {result['is_regression']}")
print(f"Change: {result['regression_percentage']:.2f}%")

Monitoring & Alerting

Memory Growth Detection

Grafana Alert Rule

alert:
  name: "Memory Growth Detection"
  frequency: 10m
  conditions:
    - query: A
      queryType: parca
      refId: A
      model:
        expr: 'increase(go_memstats_heap_inuse_bytes{}[1h])'
        interval: 1m
    - reducer: last
      type: reduce
    - evaluator:
        params: [50000000]  # 50MB increase threshold
        type: gt
      type: threshold

Baseline Establishment

Automated Baseline Calculation

#!/bin/bash
# Calculate performance baseline

QUERY="process_cpu:samples:count:cpu:nanoseconds{job='production'}"
START_TIME=$(date -d "7 days ago" --iso-8601)
END_TIME=$(date -d "yesterday" --iso-8601)

# Calculate baseline metrics
BASELINE=$(curl -s -G "http://localhost:7070/query_range" \
  --data-urlencode "query=$QUERY" \
  --data-urlencode "start=$START_TIME" \
  --data-urlencode "end=$END_TIME" | \
  jq '.series[] | .samples[] | .value' | \
  awk '{sum+=$1; count++} END {print sum/count}')

echo "Baseline CPU usage: $BASELINE"

# Store baseline for alerting
echo "$BASELINE" > /var/lib/parca/baseline.txt

Alert Configuration

PrometheusRule for Parca Metrics

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: parca-alerts
  namespace: parca
spec:
  groups:
  - name: parca.rules
    rules:
    - alert: HighCPUUsageDetected
      expr: |
        (
          avg_over_time(parca_cpu_profile_samples_total[5m]) 
          / avg_over_time(parca_cpu_profile_samples_total[1h] offset 1d)
        ) > 1.5
      for: 2m
      labels:
        severity: warning
      annotations:
        summary: "High CPU usage detected"
        description: "CPU usage is 50% higher than daily average"
    
    - alert: MemoryLeakDetected
      expr: |
        increase(parca_memory_profile_heap_bytes[1h]) > 100000000
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: "Potential memory leak detected"
        description: "Memory usage increased by more than 100MB in 1 hour"

Integration with Prometheus

ServiceMonitor Configuration

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: parca
  namespace: parca
spec:
  selector:
    matchLabels:
      app: parca
  endpoints:
  - port: http
    interval: 30s
    path: /metrics

Comparison with Alternatives

Parca vs Pixie

| Feature | Parca | Pixie |
| --- | --- | --- |
| Scope | Dedicated continuous profiling | Full observability platform |
| Storage | Long-term retention with FrostDB | Short-term retention (24-48h) |
| Language Support | Compiled languages (C, C++, Go, Rust) | Broader language support |
| Deployment | Lightweight agent | More complex setup |
| Cost | Open-source, self-hosted | CNCF project, more resource intensive |
| Focus | Performance profiling | Comprehensive observability |

Key Differences:

  • Storage Duration: Parca provides long-term storage suitable for historical analysis, while Pixie focuses on real-time observability with shorter retention
  • Complexity: Parca has simpler setup focused on profiling, Pixie offers broader observability with higher complexity
  • Use Cases: Choose Parca for dedicated continuous profiling with historical analysis; choose Pixie for comprehensive Kubernetes observability

Parca vs Pyroscope

| Feature | Parca | Pyroscope (Grafana) |
| --- | --- | --- |
| Technology | eBPF-focused | Multi-technology approach |
| Language Support | Compiled languages | Broader (Go, Python, Ruby, Java, .NET, PHP, Rust) |
| Query Language | PromQL-like | FlameQL |
| Storage | FrostDB columnar | Custom storage engine |
| Integration | Grafana data source | Native Grafana integration |
| Company | Polar Signals | Grafana Labs (acquired) |

Key Differences:

  • Language Coverage: Pyroscope supports more interpreted languages with dedicated agents
  • Query Interface: Pyroscope has FlameQL vs Parca's PromQL-inspired queries
  • Ecosystem: Pyroscope benefits from Grafana Labs ecosystem integration
  • Technology Approach: Parca focuses on eBPF efficiency, Pyroscope uses language-specific agents

Parca vs Commercial APM

| Feature | Parca | Commercial APM |
| --- | --- | --- |
| Cost | Open-source, self-hosted | Subscription-based |
| Data Control | Full control, on-premises | Vendor-hosted |
| Customization | Highly customizable | Limited customization |
| Features | Profiling-focused | Full APM suite |
| Overhead | <1% with eBPF | Varies by vendor |
| Lock-in | No vendor lock-in | Potential vendor lock-in |

Key Advantages of Parca:

  • Cost Effectiveness: No per-agent or data volume costs
  • Data Sovereignty: Keep sensitive profiling data in-house
  • Flexibility: Customize storage, retention, and analysis
  • Open Standards: Uses standard pprof format for interoperability

Repository & Documentation

GitHub Repository

  • Server: https://github.com/parca-dev/parca
  • Agent: https://github.com/parca-dev/parca-agent

Documentation Site

  • https://www.parca.dev

Community Resources

  • GitHub Discussions and the Parca community Discord

Additional Tools

  • OOMProf: https://github.com/parca-dev/oomprof - eBPF OOM Memory Profiler
  • Grafana Plugin: Native integration for visualization
  • CLI Tools: Command-line utilities for profile analysis

Security Considerations

  • Privileged Access: Requires root or CAP_SYS_ADMIN for eBPF
  • Network Security: Supports TLS and bearer token authentication
  • Data Privacy: Self-hosted deployment keeps profiling data internal
  • Resource Limits: Configure appropriate resource limits in Kubernetes
  • RBAC Integration: Works with Kubernetes role-based access control

Future Roadmap

  • Enhanced Language Support: Expanding support for interpreted languages
  • Advanced Analytics: ML-based anomaly detection and optimization suggestions
  • Multi-cluster Support: Enhanced federation and multi-cluster profiling
  • Integration Improvements: Deeper integration with observability ecosystem
  • Performance Optimizations: Continued focus on minimal overhead profiling