Memory Technologies Platform Specific Pyroscope - antimetal/system-agent GitHub Wiki

Pyroscope

Overview

Pyroscope is an open-source continuous profiling platform designed for production environments with minimal overhead. Originally developed by the company of the same name, which Grafana Labs acquired in 2023, Pyroscope provides always-on profiling across multiple programming languages, with interactive flame graph visualization and efficient storage.

Key characteristics:

  • Continuous profiling: Always-on profiling with 1-2% overhead
  • Multi-language support: Go, Python, Ruby, Java, .NET, Node.js, Rust, C/C++
  • Production-ready: Designed for high-scale production deployments
  • Flame graph visualization: Interactive flame graphs with time-based navigation
  • Efficient storage: Custom storage format optimized for profiling data
  • Grafana integration: Native integration with Grafana for dashboards and alerting

Performance Characteristics

| Metric | Value | Notes |
|--------|-------|-------|
| Overhead | 1-2% | CPU overhead in production environments |
| Accuracy | Medium-High | Sampling-based profiling with configurable rates |
| False Positives | Low | Statistical sampling reduces noise |
| Production Ready | Yes | Designed for always-on production profiling |
| Platform Support | Multi-platform | Linux, macOS, Windows |
| Language Coverage | Extensive | 8+ languages with native integrations |
| Memory Impact | Minimal | Efficient data compression and streaming |
| Latency Impact | ~1ms | Per-sample collection latency |

Architecture

Pyroscope follows a distributed architecture optimized for continuous profiling:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Application   β”‚    β”‚   Pyroscope      β”‚    β”‚   Grafana       β”‚
β”‚   + Agent       │───▢│   Server         │───▢│   Dashboard     β”‚
β”‚                 β”‚    β”‚                  β”‚    β”‚                 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         β”‚                       β”‚                       β”‚
         β”‚                       β”‚                       β”‚
    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”            β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”           β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚ Profile β”‚            β”‚ Storage  β”‚           β”‚ Alerting    β”‚
    β”‚ Data    β”‚            β”‚ Engine   β”‚           β”‚ & Queries   β”‚
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜            β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜           β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Core Components

  1. Profiling Agents: Language-specific agents that collect profiling data
  2. Server: Central collection and storage engine
  3. Storage Engine: Time-series optimized storage for profiling data
  4. Query Engine: Fast querying and aggregation of profile data
  5. Web UI: Flame graph visualization and analysis interface

Data Flow

  1. Collection: Agents sample application execution at configurable intervals
  2. Aggregation: Local aggregation before transmission to reduce network overhead
  3. Transmission: Compressed profile data sent to Pyroscope server
  4. Storage: Efficient storage using custom columnar format
  5. Query: Real-time querying with sub-second response times
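Steps 2-3 above can be sketched as a push to Pyroscope's original folded-format ingest endpoint. The endpoint shape, query parameters, and the 10-second batching window here are assumptions based on the legacy push API; real agents batch, compress, and retry.

```python
import time
import urllib.parse
import urllib.request

def build_ingest_request(server, app_name, folded_stacks, sample_rate=100):
    """Build an HTTP request pushing a collapsed-stack profile batch to /ingest."""
    now = int(time.time())
    params = urllib.parse.urlencode({
        "name": app_name,
        "from": now - 10,        # the 10-second window just collected
        "until": now,
        "sampleRate": sample_rate,
        "format": "folded",      # one "frame_a;frame_b <count>" line per stack
    })
    body = "\n".join(f"{stack} {count}" for stack, count in folded_stacks).encode()
    return urllib.request.Request(f"{server}/ingest?{params}", data=body, method="POST")

req = build_ingest_request(
    "http://pyroscope:4040",
    "myapp.cpu",
    [("main;handler;parse_json", 42), ("main;handler;query_db", 17)],
)
# urllib.request.urlopen(req) would perform the transmission step
```

The folded text format keeps the aggregation step trivial: identical stacks collapse into a single line with a summed count before anything crosses the network.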

System-Agent Implementation Plan

Agent Deployment Strategies

1. Push Model (Recommended)

# pyroscope-config.yml
server:
  api-bind-addr: ":4040"
  
targets:
  - service-name: "my-application"
    spy-name: "gospy"
    targets:
      - "localhost:6060"  # pprof endpoint
    labels:
      environment: "production"
      region: "us-west-2"

2. Pull Model with Auto-Discovery

# Service discovery configuration
scrape_configs:
  - job_name: 'golang-apps'
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_pyroscope_io_scrape]
        action: keep
        regex: true

3. Embedded Agent Integration

// Go application with embedded agent
package main

import (
    "github.com/grafana/pyroscope-go"
)

func main() {
    pyroscope.Start(pyroscope.Config{
        ApplicationName: "my.app.name",
        ServerAddress:   "http://pyroscope-server:4040",
        ProfileTypes: []pyroscope.ProfileType{
            pyroscope.ProfileCPU,
            pyroscope.ProfileAllocObjects,
            pyroscope.ProfileAllocSpace,
            pyroscope.ProfileInuseObjects,
            pyroscope.ProfileInuseSpace,
        },
    })
    
    // Your application code
}

Language-Specific Integrations

Go Integration

import _ "github.com/grafana/pyroscope-go/godeltaprof/http/pprof"

// Automatic pprof endpoint exposure
// Server will scrape /debug/pprof endpoints

Python Integration

# requirements.txt
pyroscope-io

# main.py
import pyroscope

pyroscope.configure(
    application_name="myapp.python",
    server_address="http://pyroscope:4040",
    sample_rate=100,  # Hz
)

# Your Python code

Java Integration

<!-- Maven dependency -->
<dependency>
    <groupId>io.pyroscope</groupId>
    <artifactId>agent</artifactId>
    <version>0.12.0</version>
</dependency>

# JVM arguments
java -javaagent:pyroscope.jar=server=http://pyroscope:4040,applicationName=myapp.java -jar myapp.jar

Storage Configuration

# pyroscope.yml
storage:
  path: "/var/lib/pyroscope"
  retention: "720h"  # 30 days
  
  # Advanced storage tuning
  max-nodes-per-tree: 8192
  compression: gzip
  
  # S3 backend (optional)
  s3:
    bucket: "pyroscope-profiles"
    region: "us-west-2"
    endpoint: "s3.amazonaws.com"

API Integration

# Query API
curl "http://pyroscope:4040/api/v1/query" \
  -G \
  -d "name=myapp.cpu" \
  -d "from=now-1h" \
  -d "until=now" \
  -d "format=json"

# Ingest API
curl -X POST "http://pyroscope:4040/api/v1/ingest" \
  -H "Content-Type: application/x-protobuf" \
  --data-binary @profile.pb

Language Support

Go (Native Support)

  • Integration: Native pprof support, zero dependencies
  • Profile Types: CPU, memory (heap/allocs), goroutines, mutex, block
  • Overhead: <1% CPU, minimal memory
  • Production: Extensively used at scale
import (
    _ "net/http/pprof"
    "github.com/grafana/pyroscope-go"
)

// Automatic integration with existing pprof endpoints

Python (py-spy Integration)

  • Integration: py-spy for sampling, native C extension
  • Profile Types: CPU profiling, native and Python frames
  • Overhead: ~2% CPU, works with any Python version
  • Production: Proven at companies like Uber, Dropbox
# Automatic profiling with py-spy
pyroscope.configure(
    application_name="myapp.python",
    server_address="http://pyroscope:4040",
    detect_subprocesses=True,
    oncpu=True,
    gil_only=True,  # Only profile when GIL is held
)

Ruby (rbspy Integration)

  • Integration: rbspy sampling profiler
  • Profile Types: CPU profiling with Ruby and C frames
  • Overhead: ~1-2% CPU overhead
  • Production: Used by GitHub, Shopify
# Gemfile
gem 'pyroscope'

# Application code
require 'pyroscope'

Pyroscope.configure do |config|
  config.application_name = "myapp.ruby"
  config.server_address = "http://pyroscope:4040"
end

Java (async-profiler Integration)

  • Integration: async-profiler for low-overhead sampling
  • Profile Types: CPU, memory allocation, lock contention
  • Overhead: <1% CPU, production-safe
  • Production: Used by Netflix, LinkedIn
# JVM agent
-javaagent:pyroscope.jar=server=http://pyroscope:4040,applicationName=myapp.java,profilingEvent=cpu,alloc

.NET Support

  • Integration: .NET profiling APIs
  • Profile Types: CPU, memory, exceptions
  • Overhead: ~2% CPU overhead
  • Production: Growing adoption
// Program.cs
using Pyroscope;

Profiler.Start("http://pyroscope:4040", "myapp.dotnet");

// Your application code

Node.js Support

  • Integration: V8 profiling APIs
  • Profile Types: CPU profiling with JavaScript and native frames
  • Overhead: ~2-3% CPU overhead
  • Production: Used in production Node.js deployments
const Pyroscope = require('@pyroscope/nodejs');

Pyroscope.init({
  serverAddress: 'http://pyroscope:4040',
  appName: 'myapp.nodejs'
});

// Your Node.js application

Rust Support

  • Integration: Native Rust integration with pprof-rs
  • Profile Types: CPU profiling
  • Overhead: <1% CPU overhead
  • Production: Early adoption phase
// Cargo.toml
[dependencies]
pyroscope = "0.5"
pyroscope_pprofrs = "0.2"

// main.rs
use pyroscope::PyroscopeAgent;
use pyroscope_pprofrs::{pprof_backend, PprofConfig};

fn main() {
    let agent = PyroscopeAgent::builder("http://pyroscope:4040", "myapp.rust")
        .backend(pprof_backend(PprofConfig::new().sample_rate(100)))
        .build()
        .unwrap();
        
    let _agent_running = agent.start().unwrap();
    
    // Your Rust application
}

Production Deployments

Companies Using Pyroscope

Grafana Labs

  • Scale: 1000+ services, multi-language environment
  • Languages: Go, Python, Java, Node.js
  • Achievement: 99.9% uptime with continuous profiling
  • Benefits: 40% reduction in MTTR for performance issues

Polar Signals

  • Scale: Multi-tenant SaaS with 10,000+ applications
  • Languages: Go, Python, Java, Rust
  • Achievement: Processing 1TB+ of profiling data daily
  • Benefits: Automated performance regression detection

GitLab

  • Scale: Ruby on Rails application with 1000+ instances
  • Languages: Ruby, Go, Python
  • Achievement: Reduced memory usage by 30% using continuous profiling
  • Benefits: Proactive performance optimization

Scale Achievements

High-Throughput Environments

# Production configuration for high-scale
server:
  max-nodes-per-tree: 16384
  retention: "2160h"  # 90 days
  
ingestion:
  max-profile-size: "50MB"
  max-profiles-per-second: 1000
  
storage:
  compression: "zstd"  # Better compression for large datasets
  batch-size: 10000

Multi-Language Environments

  • Polyglot Services: Single Pyroscope instance profiling 8+ languages
  • Data Volume: Processing 100GB+ daily profile data
  • Query Performance: Sub-second response for flame graph queries
  • Storage Efficiency: 95% compression ratio with zstd

Success Stories

Memory Leak Detection

# Automated memory leak detection
import requests

def check_memory_growth():
    # Query memory profile growth over time
    response = requests.get(
        "http://pyroscope:4040/api/v1/query",
        params={
            "name": "myapp.alloc_space",
            "from": "now-24h",
            "until": "now",
            "max-nodes": 1024
        }
    )
    
    profile_data = response.json()
    # analyze_growth_trend (application-defined, not shown) inspects growth patterns
    return analyze_growth_trend(profile_data)

Performance Regression Detection

// Automated performance regression detection
package main

import (
    "os"

    "github.com/grafana/pyroscope-go"
)

func main() {
    pyroscope.Start(pyroscope.Config{
        ApplicationName: "myapp.cpu",
        ServerAddress:   "http://pyroscope:4040",
        Tags: map[string]string{
            "version": os.Getenv("APP_VERSION"),
            "environment": "production",
        },
    })
    
    // Your application with version tagging for regression detection
}

Installation

Server Deployment

Docker Compose

# docker-compose.yml
version: '3.8'

services:
  pyroscope:
    image: grafana/pyroscope:latest
    ports:
      - "4040:4040"
    volumes:
      - pyroscope-data:/var/lib/pyroscope
      - ./pyroscope-config.yml:/etc/pyroscope/config.yml
    environment:
      - PYROSCOPE_LOG_LEVEL=info
    command:
      - server
      - --config.file=/etc/pyroscope/config.yml

volumes:
  pyroscope-data:

Kubernetes Deployment

# pyroscope-server.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pyroscope-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: pyroscope-server
  template:
    metadata:
      labels:
        app: pyroscope-server
    spec:
      containers:
      - name: pyroscope
        image: grafana/pyroscope:latest
        ports:
        - containerPort: 4040
        env:
        - name: PYROSCOPE_STORAGE_PATH
          value: /var/lib/pyroscope
        - name: PYROSCOPE_LOG_LEVEL
          value: info
        volumeMounts:
        - name: storage
          mountPath: /var/lib/pyroscope
        resources:
          requests:
            memory: "512Mi"
            cpu: "250m"
          limits:
            memory: "2Gi"
            cpu: "1000m"
      volumes:
      - name: storage
        persistentVolumeClaim:
          claimName: pyroscope-storage

---
apiVersion: v1
kind: Service
metadata:
  name: pyroscope-service
spec:
  selector:
    app: pyroscope-server
  ports:
  - port: 4040
    targetPort: 4040
  type: LoadBalancer

Agent Installation Per Language

Go Agent

go get github.com/grafana/pyroscope-go

Python Agent

pip install pyroscope-io

Java Agent

wget https://github.com/grafana/pyroscope-java/releases/latest/download/pyroscope.jar

Ruby Agent

gem install pyroscope

Configuration Management

Centralized Configuration

# pyroscope-config.yml
server:
  api-bind-addr: ":4040"
  base-url: "http://pyroscope.company.com"

log:
  level: "info"

storage:
  path: "/var/lib/pyroscope"
  retention: "720h"

targets:
  - service-name: "backend-api"
    spy-name: "gospy"
    targets:
      - "backend-1:6060"
      - "backend-2:6060"
    labels:
      environment: "production"
      team: "platform"
      
  - service-name: "ml-pipeline"
    spy-name: "pyspy"
    targets:
      - "ml-worker-1:8080"
      - "ml-worker-2:8080"
    labels:
      environment: "production"
      team: "data-science"

Environment-Specific Configuration

# Helm values for different environments
production:
  pyroscope:
    retention: "2160h"  # 90 days
    replicas: 3
    resources:
      limits:
        memory: "4Gi"
        cpu: "2000m"

staging:
  pyroscope:
    retention: "168h"   # 7 days
    replicas: 1
    resources:
      limits:
        memory: "1Gi"
        cpu: "500m"

Features

Continuous Profiling

  • Always-On: Continuous collection with minimal overhead
  • Sampling: Configurable sampling rates (10-1000 Hz)
  • Multiple Profile Types: CPU, memory, goroutines, mutex, block
  • Real-Time: Live profiling data with sub-second latency
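The sampling-rate/overhead trade-off above can be made concrete with back-of-the-envelope arithmetic. The ~100 microsecond per-sample cost is an assumption for illustration, not a measured figure:

```python
def estimated_overhead(sample_rate_hz, per_sample_cost_us=100):
    """Fraction of one CPU spent taking samples: rate times cost per sample."""
    return sample_rate_hz * per_sample_cost_us / 1_000_000

# At the common default of 100 Hz, ~100 us per sample yields ~1% overhead,
# consistent with the 1-2% figure quoted earlier; 1000 Hz would cost ~10%.
for hz in (10, 100, 1000):
    print(f"{hz:>5} Hz -> {estimated_overhead(hz):.1%}")
```

This is why high rates in the configurable 10-1000 Hz range are reserved for short investigations rather than always-on profiling.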

Flame Graphs

  • Interactive Visualization: Zoomable, searchable flame graphs
  • Time Navigation: Scrub through time to see profile evolution
  • Comparison Mode: Side-by-side comparison of different time periods
  • Export Options: PNG, SVG, and raw profile data export
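Under the hood a flame graph is an aggregation of stack samples into a tree; a minimal sketch of that collapse, using hypothetical sample data, shows where the rectangle widths come from:

```python
def build_flame_tree(folded_samples):
    """Aggregate folded stack samples ('a;b;c', count) into the nested
    name -> {value, children} structure that a flame graph renders."""
    root = {"value": 0, "children": {}}
    for stack, count in folded_samples:
        root["value"] += count
        node = root
        for frame in stack.split(";"):
            node = node["children"].setdefault(frame, {"value": 0, "children": {}})
            node["value"] += count
    return root

samples = [
    ("main;handler;parse_json", 30),
    ("main;handler;query_db", 60),
    ("main;gc", 10),
]
tree = build_flame_tree(samples)
# Rectangle width in the rendered graph = node value / root value:
# here "handler" spans 90% of the x-axis and "gc" 10%.
```

Comparison mode renders two such trees and colors each frame by the difference in its share of the total.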

Comparison Views

# Compare two time periods
curl "http://pyroscope:4040/api/v1/query" \
  -G \
  -d "name=myapp.cpu" \
  -d "from=now-2h" \
  -d "until=now-1h" \
  -d "comparison_from=now-1h" \
  -d "comparison_until=now"

Tag-Based Filtering

// Multi-dimensional tagging
pyroscope.Start(pyroscope.Config{
    ApplicationName: "myapp",
    ServerAddress:   "http://pyroscope:4040",
    Tags: map[string]string{
        "version":     "v1.2.3",
        "environment": "production",
        "region":      "us-west-2",
        "instance":    os.Getenv("HOSTNAME"),
        "team":        "backend",
    },
})

Grafana Dashboards

{
  "dashboard": {
    "title": "Application Performance Dashboard",
    "panels": [
      {
        "title": "CPU Profile Flame Graph",
        "type": "phlare-panel",
        "targets": [
          {
            "expr": "myapp.cpu{environment=\"production\"}",
            "profileTypeId": "cpu"
          }
        ]
      },
      {
        "title": "Memory Allocation Rate",
        "type": "graph",
        "targets": [
          {
            "expr": "rate(myapp.alloc_space[5m])"
          }
        ]
      }
    ]
  }
}

Code Examples

Agent Integration Code

Advanced Go Integration

package main

import (
    "context"
    "log"
    _ "net/http/pprof" // registers /debug/pprof handlers as a side effect

    "github.com/grafana/pyroscope-go"
)

// BuildVersion and Environment are assumed to be injected at build time (e.g. via -ldflags).
var (
    BuildVersion = "dev"
    Environment  = "production"
)

func main() {
    // Start continuous profiling
    profiler, err := pyroscope.Start(pyroscope.Config{
        ApplicationName: "myapp.backend",
        ServerAddress:   "http://pyroscope:4040",
        Logger:          pyroscope.StandardLogger,
        
        // Profile types
        ProfileTypes: []pyroscope.ProfileType{
            pyroscope.ProfileCPU,
            pyroscope.ProfileAllocObjects,
            pyroscope.ProfileAllocSpace,
            pyroscope.ProfileInuseObjects,
            pyroscope.ProfileInuseSpace,
        },
        
        // Dynamic tagging
        Tags: map[string]string{
            "version":     BuildVersion,
            "environment": Environment,
        },
    })
    if err != nil {
        log.Fatal(err)
    }
    defer profiler.Stop()
    
    // Add custom profiling labels
    ctx := context.Background()
    pyroscope.TagWrapper(ctx, pyroscope.Labels("handler", "api"), func(ctx context.Context) {
        handleAPIRequest(ctx)
    })
}

func handleAPIRequest(ctx context.Context) {
    // Custom region profiling
    pyroscope.TagWrapper(ctx, pyroscope.Labels("operation", "database_query"), func(ctx context.Context) {
        queryDatabase(ctx)
    })
}

Python with Custom Profiling

import os
import asyncio
from contextlib import contextmanager

import pyroscope

# Configure Pyroscope
pyroscope.configure(
    application_name="myapp.python.async",
    server_address="http://pyroscope:4040",
    tags={
        "version": os.environ.get("APP_VERSION"),
        "environment": "production",
    }
)

@contextmanager
def profile_region(region_name):
    """Custom profiling context manager"""
    with pyroscope.tag_wrapper({"region": region_name}):
        yield

async def process_request():
    with profile_region("request_processing"):
        await do_async_work()
        
    with profile_region("database_query"):
        result = await query_database()
        
    return result

API Queries

Advanced Querying

import requests

class PyroscopeClient:
    def __init__(self, base_url):
        self.base_url = base_url
        
    def query_profile(self, app_name, from_time, to_time, tags=None, max_nodes=8192):
        """Query profile data with advanced filtering"""
        params = {
            "name": app_name,
            "from": from_time,
            "until": to_time,
            "max-nodes": max_nodes,
        }
        
        if tags:
            for key, value in tags.items():
                params[f"tag.{key}"] = value
                
        response = requests.get(
            f"{self.base_url}/api/v1/query",
            params=params
        )
        return response.json()
        
    def compare_profiles(self, app_name, baseline_period, comparison_period):
        """Compare two time periods"""
        baseline = self.query_profile(
            app_name, 
            baseline_period['from'], 
            baseline_period['to']
        )
        
        comparison = self.query_profile(
            app_name,
            comparison_period['from'],
            comparison_period['to']
        )
        
        return {
            "baseline": baseline,
            "comparison": comparison,
            "diff": self._calculate_diff(baseline, comparison)
        }
        
    def _calculate_diff(self, baseline, comparison):
        """Calculate performance difference between profiles"""
        # Implementation for profile comparison logic
        pass

# Usage
client = PyroscopeClient("http://pyroscope:4040")

# Query current performance
current_profile = client.query_profile(
    "myapp.cpu",
    from_time="now-1h",
    to_time="now",
    tags={"environment": "production", "version": "v1.2.3"}
)

# Compare with previous version
comparison = client.compare_profiles(
    "myapp.cpu",
    baseline_period={"from": "now-48h", "to": "now-24h"},
    comparison_period={"from": "now-24h", "to": "now"}
)

Automated Analysis

Performance Regression Detection

from scipy import stats
import requests

class PerformanceAnalyzer:
    def __init__(self, pyroscope_url):
        self.pyroscope_url = pyroscope_url
        
    def detect_regressions(self, app_name, lookback_hours=24):
        """Detect performance regressions using statistical analysis"""
        
        # Get baseline data (previous week, same day/hour)
        baseline_data = self._get_profile_metrics(
            app_name,
            from_time=f"now-{7*24+lookback_hours}h",
            to_time=f"now-{7*24}h"
        )
        
        # Get current data
        current_data = self._get_profile_metrics(
            app_name,
            from_time=f"now-{lookback_hours}h",
            to_time="now"
        )
        
        # Statistical comparison
        statistic, p_value = stats.ttest_ind(
            baseline_data['cpu_samples'],
            current_data['cpu_samples']
        )
        
        regression_detected = p_value < 0.05 and statistic < -2.0
        
        return {
            "regression_detected": regression_detected,
            "confidence": 1.0 - p_value,
            "performance_change": self._calculate_change_percentage(
                baseline_data, current_data
            ),
            "top_functions": self._identify_regressed_functions(
                baseline_data, current_data
            )
        }
        
    def _get_profile_metrics(self, app_name, from_time, to_time):
        """Extract metrics from profile data"""
        response = requests.get(
            f"{self.pyroscope_url}/api/v1/query",
            params={
                "name": app_name,
                "from": from_time,
                "until": to_time,
                "format": "json"
            }
        )
        
        profile_data = response.json()
        return self._process_profile_data(profile_data)
        
    def _process_profile_data(self, profile_data):
        """Process raw profile data into metrics"""
        # Extract CPU samples, function call counts, etc.
        return {
            "cpu_samples": [],  # Processed sample data
            "function_calls": {},  # Function-level metrics
            "memory_allocations": []  # Memory allocation data
        }

Memory Leak Detection

class MemoryLeakDetector:
    def __init__(self, pyroscope_client):
        self.client = pyroscope_client
        
    def detect_memory_leaks(self, app_name, analysis_period_hours=24):
        """Detect memory leaks using trend analysis"""
        
        # Get memory allocation data
        alloc_data = self.client.query_profile(
            f"{app_name}.alloc_space",
            from_time=f"now-{analysis_period_hours}h",
            to_time="now",
            tags={"environment": "production"}
        )
        
        # Get memory usage data
        inuse_data = self.client.query_profile(
            f"{app_name}.inuse_space",
            from_time=f"now-{analysis_period_hours}h",
            to_time="now",
            tags={"environment": "production"}
        )
        
        # Analyze trends
        leak_indicators = self._analyze_memory_trends(alloc_data, inuse_data)
        
        return {
            "leak_probability": leak_indicators["probability"],
            "growth_rate": leak_indicators["growth_rate"],
            "suspected_functions": leak_indicators["functions"],
            "recommended_actions": self._generate_recommendations(leak_indicators)
        }
        
    def _analyze_memory_trends(self, alloc_data, inuse_data):
        """Analyze memory allocation and usage trends"""
        # Implementation for trend analysis
        return {
            "probability": 0.85,  # 85% probability of leak
            "growth_rate": "2.5MB/hour",
            "functions": ["process_request", "cache_handler"]
        }
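The `_analyze_memory_trends` stub above boils down to fitting a growth trend. A minimal least-squares sketch follows; the sample series is fabricated to grow at the 2.5 MB/hour rate quoted in the stub:

```python
def estimate_growth_rate(timestamps_s, inuse_bytes):
    """Least-squares slope of in-use memory over time, in bytes/second.
    A consistently positive slope over many hours is the classic leak signature."""
    n = len(timestamps_s)
    mean_t = sum(timestamps_s) / n
    mean_y = sum(inuse_bytes) / n
    cov = sum((t - mean_t) * (y - mean_y) for t, y in zip(timestamps_s, inuse_bytes))
    var = sum((t - mean_t) ** 2 for t in timestamps_s)
    return cov / var

# Fabricated hourly samples: ~500 MB baseline growing 2.5 MB every hour
hours = [h * 3600 for h in range(24)]
inuse = [500 * 1024**2 + h * int(2.5 * 1024**2) for h in range(24)]
rate_mb_per_hour = estimate_growth_rate(hours, inuse) * 3600 / 1024**2
# rate_mb_per_hour recovers ~2.5, matching the growth_rate reported above
```

In practice the series would come from the `inuse_space` query results rather than synthetic data, and short-lived allocation spikes should be smoothed out before fitting.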

Monitoring & Alerting

Memory Leak Detection

# Grafana alert rule
- alert: MemoryLeakDetected
  expr: |
    (
      rate(pyroscope_alloc_space_total[24h]) > 0.1
      and
      increase(pyroscope_inuse_space_total[24h]) > 100 * 1024 * 1024
    )
  for: 2h
  labels:
    severity: warning
    team: platform
  annotations:
    summary: "Potential memory leak in {{ $labels.application }}"
    description: |
      Memory allocation rate: {{ $value }}MB/s
      Memory usage increased by {{ $value }}MB in 24h
    runbook_url: "https://wiki.company.com/memory-leak-runbook"

Performance Regression Alerts

# Performance regression alert
- alert: PerformanceRegression
  expr: |
    (
      avg_over_time(pyroscope_cpu_samples_total[1h])
      /
      avg_over_time(pyroscope_cpu_samples_total[1h] offset 24h)
    ) > 1.3
  for: 30m
  labels:
    severity: warning
    team: backend
  annotations:
    summary: "Performance regression detected in {{ $labels.application }}"
    description: |
      CPU usage increased by {{ $value | humanizePercentage }} compared to 24h ago
      Check flame graphs: {{ $externalURL }}/flame-graph/{{ $labels.application }}

Grafana Alerts Integration

# Custom alerting integration
import requests

class PyroscopeAlerting:
    def __init__(self, pyroscope_url, grafana_url, grafana_token):
        self.pyroscope_url = pyroscope_url
        self.grafana_url = grafana_url
        self.grafana_token = grafana_token
        
    def create_performance_alert(self, app_name, threshold_multiplier=1.5):
        """Create performance regression alert in Grafana"""
        
        alert_rule = {
            "uid": f"performance-{app_name}",
            "title": f"Performance Regression - {app_name}",
            "condition": "A",
            "data": [
                {
                    "refId": "A",
                    "queryType": "pyroscope",
                    "model": {
                        "expr": f"rate({app_name}.cpu[5m])",
                        "interval": "30s",
                    }
                }
            ],
            "intervalSeconds": 60,
            "maxDataPoints": 43200,
            "noDataState": "NoData",
            "execErrState": "Alerting"
        }
        
        response = requests.post(
            f"{self.grafana_url}/api/ruler/grafana/api/v1/rules/pyroscope",
            headers={
                "Authorization": f"Bearer {self.grafana_token}",
                "Content-Type": "application/json"
            },
            json=alert_rule
        )
        
        return response.status_code == 200

Baseline Comparison

import numpy as np
from datetime import datetime

class BaselineComparator:
    def __init__(self, pyroscope_client):
        self.client = pyroscope_client
        
    def establish_baseline(self, app_name, baseline_period_days=7):
        """Establish performance baseline over specified period"""
        
        baseline_data = []
        
        for day in range(baseline_period_days):
            day_offset = day * 24
            daily_profile = self.client.query_profile(
                app_name,
                from_time=f"now-{day_offset + 24}h",
                to_time=f"now-{day_offset}h"
            )
            baseline_data.append(self._extract_key_metrics(daily_profile))
            
        baseline = {
            "cpu_percentile_95": np.percentile([d["cpu"] for d in baseline_data], 95),
            "memory_percentile_95": np.percentile([d["memory"] for d in baseline_data], 95),
            "top_functions": self._merge_function_data(baseline_data),
            "established_at": datetime.utcnow().isoformat()
        }
        
        return baseline
        
    def compare_to_baseline(self, app_name, baseline, current_period_hours=1):
        """Compare current performance to established baseline"""
        
        current_profile = self.client.query_profile(
            app_name,
            from_time=f"now-{current_period_hours}h",
            to_time="now"
        )
        
        current_metrics = self._extract_key_metrics(current_profile)
        
        comparison = {
            "cpu_deviation": (current_metrics["cpu"] - baseline["cpu_percentile_95"]) / baseline["cpu_percentile_95"],
            "memory_deviation": (current_metrics["memory"] - baseline["memory_percentile_95"]) / baseline["memory_percentile_95"],
            "new_hotspots": self._identify_new_hotspots(current_profile, baseline),
            "recommendation": self._generate_performance_recommendation(current_metrics, baseline)
        }
        
        return comparison

Comparison with Alternatives

vs Parca (eBPF-based)

| Feature | Pyroscope | Parca |
|---------|-----------|-------|
| Language Support | 8+ languages (Go, Python, Java, etc.) | Primarily Go, limited multi-language |
| Overhead | 1-2% (sampling-based) | <1% (eBPF kernel-level) |
| Deployment | Application-level agents | System-level eBPF programs |
| Accuracy | High (configurable sampling) | Very high (kernel-level) |
| Production Readiness | Mature, widely adopted | Newer, growing adoption |
| Integration Effort | Minimal code changes | No code changes required |
| Kubernetes Support | Excellent | Excellent |
| Storage Efficiency | Very good | Excellent |

When to choose Pyroscope:

  • Multi-language environments
  • Need application-level context
  • Existing CI/CD integration
  • Proven at scale

When to choose Parca:

  • Go-heavy infrastructure
  • Minimal overhead requirement
  • System-level profiling needs
  • eBPF expertise available

vs Language-Specific Tools

vs Go pprof

// Traditional pprof (manual)
import _ "net/http/pprof"
// Manual collection: go tool pprof http://app:6060/debug/pprof/profile

// Pyroscope (continuous)
pyroscope.Start(pyroscope.Config{
    ApplicationName: "myapp",
    ServerAddress:   "http://pyroscope:4040",
})
// Automatic collection and storage

| Aspect | Go pprof | Pyroscope |
|--------|----------|-----------|
| Collection | Manual/on-demand | Continuous automatic |
| Storage | Local files | Centralized time-series |
| Visualization | Basic flame graphs | Rich interactive UI |
| Historical Data | Manual archiving | Automatic retention |
| Multi-Service | Individual tools | Unified platform |
| Alerting | Custom scripts | Built-in + Grafana |

vs Java Profilers (JProfiler, YourKit)

| Feature | Commercial Profilers | Pyroscope |
|---------|----------------------|-----------|
| Cost | $500-2000/license | Open source |
| Overhead | 5-20% | 1-2% |
| Production Use | Limited | Designed for production |
| Continuous | No | Yes |
| Multi-Language | Java only | 8+ languages |
| Cloud Native | Limited | Kubernetes-native |

vs Python py-spy

| Feature | py-spy (standalone) | Pyroscope + py-spy |
|---------|---------------------|--------------------|
| Collection | Command-line tool | Automatic continuous |
| Storage | SVG/JSON files | Time-series database |
| Multi-Process | Manual orchestration | Automatic discovery |
| Historical Analysis | Manual file management | Built-in querying |
| Integration | Custom tooling | Grafana dashboards |

Storage Efficiency Comparison

# Storage efficiency analysis
storage_comparison = {
    "pyroscope": {
        "compression_ratio": 0.95,  # 95% compression
        "storage_format": "columnar",
        "retention_scaling": "linear",
        "query_performance": "sub-second",
    },
    "parca": {
        "compression_ratio": 0.97,  # 97% compression
        "storage_format": "parquet",
        "retention_scaling": "linear",
        "query_performance": "sub-second",
    },
    "traditional_files": {
        "compression_ratio": 0.70,  # 70% compression
        "storage_format": "gzip",
        "retention_scaling": "exponential",
        "query_performance": "minutes",
    }
}

Storage Volume Examples:

  • 1000 services Γ— 100 samples/sec: ~50GB/day raw, ~2.5GB/day compressed
  • Retention (30 days): ~75GB total storage
  • Query performance: <500ms for flame graph generation
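Those volume figures check out with simple arithmetic. The per-service raw rate below is an assumption chosen to match the totals above:

```python
def storage_estimate(services, bytes_per_service_per_day, compression_ratio, retention_days):
    """Daily raw volume, daily compressed volume, and total retained storage."""
    raw_per_day = services * bytes_per_service_per_day
    compressed_per_day = raw_per_day * (1 - compression_ratio)
    return raw_per_day, compressed_per_day, compressed_per_day * retention_days

GB = 1024**3
raw, daily, total = storage_estimate(
    services=1000,
    bytes_per_service_per_day=50 * 1024**2,  # assumed ~50 MB raw per service per day
    compression_ratio=0.95,                  # the zstd ratio quoted above
    retention_days=30,
)
print(f"{raw/GB:.1f} GB/day raw, {daily/GB:.2f} GB/day compressed, {total/GB:.1f} GB retained")
```

At these rates the 30-day retained footprint lands around the ~75 GB figure above, so retention, not ingest, dominates capacity planning.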

Repository & Documentation

GitHub Repository

  • Main Repository: https://github.com/grafana/pyroscope
  • Stars: 8,000+
  • License: AGPL-3.0 (open source)
  • Languages: Go (server), various (agents)
  • Active Development: Regular releases, active community

Key Repository Structure

pyroscope/
β”œβ”€β”€ cmd/                    # Main server and CLI commands
β”œβ”€β”€ pkg/                    # Core server packages
β”‚   β”œβ”€β”€ server/            # HTTP server and API
β”‚   β”œβ”€β”€ storage/           # Storage engine
β”‚   β”œβ”€β”€ querier/           # Query engine
β”‚   └── ingester/          # Data ingestion
β”œβ”€β”€ examples/              # Integration examples
β”œβ”€β”€ scripts/               # Development and deployment scripts
└── docs/                  # Documentation source

Language-Specific Repositories

Documentation Sites

Community Resources

Getting Started Guides

# Quick start with Docker
docker run -it -p 4040:4040 grafana/pyroscope

# Go application example
git clone https://github.com/grafana/pyroscope
cd pyroscope/examples/golang-push
go run main.go

Official Examples

Community Support

  • Slack: Grafana Community Slack #pyroscope channel
  • Forum: https://community.grafana.com/c/pyroscope
  • Issues: GitHub issues for bug reports and feature requests
  • Discussions: GitHub discussions for questions and community help

Integration Guides

Helm Chart

# Official Helm chart
helm repo add grafana https://grafana.github.io/helm-charts
helm install pyroscope grafana/pyroscope

Terraform Provider

# Terraform configuration
resource "grafana_dashboard" "pyroscope" {
  config_json = file("pyroscope-dashboard.json")
}

resource "kubernetes_deployment" "pyroscope" {
  metadata {
    name = "pyroscope-server"
  }
  
  spec {
    template {
      spec {
        container {
          name  = "pyroscope"
          image = "grafana/pyroscope:latest"
        }
      }
    }
  }
}

CI/CD Integration Examples

# GitHub Actions integration
name: Performance Profiling
on: [push, pull_request]

jobs:
  profile:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v3
    - uses: actions/setup-go@v3
    
    - name: Start Pyroscope
      run: |
        docker run -d -p 4040:4040 grafana/pyroscope
        
    - name: Run tests with profiling
      env:
        PYROSCOPE_SERVER_ADDRESS: http://localhost:4040
        PYROSCOPE_APPLICATION_NAME: myapp-ci
      run: go test ./... -v
      
    - name: Generate performance report
      run: |
        curl "http://localhost:4040/api/v1/query?name=myapp-ci.cpu&from=now-5m&until=now" > profile.json

Best Practices Documentation

Production Deployment Guide

  • Sizing Recommendations: CPU, memory, and storage requirements
  • High Availability Setup: Multi-replica deployments
  • Security Configuration: Authentication and authorization
  • Monitoring Setup: Metrics and alerting for Pyroscope itself

Performance Tuning Guide

  • Sampling Rate Optimization: Balancing overhead vs accuracy
  • Storage Tuning: Compression and retention settings
  • Query Optimization: Efficient querying patterns
  • Resource Limits: Container and JVM tuning

Troubleshooting Guide

  • Common Issues: Agent connectivity, data gaps, performance impact
  • Debugging Tools: Built-in diagnostics and logging
  • Support Channels: When and how to get help

See Also
