Prometheus - pranavkumarpk01/MD-DevOps GitHub Wiki

🚀 What is Prometheus?

Prometheus is an open-source monitoring and alerting toolkit built for reliability and scalability. It records real-time metrics in a time series database and collects them with a pull-based model over HTTP.

❓ Why is Prometheus Required?

Modern applications are:

- Distributed across multiple services and environments.
- Dynamic, with autoscaling and microservices.
- Resilient, but in need of real-time observability for performance and error monitoring.

✅ Prometheus Solves:

- Monitoring CPU, memory, network, disk usage, etc.
- Tracking application metrics (latency, error rate, request count).
- Triggering alerts based on threshold breaches.
- Providing a query language (PromQL) for detailed metric analysis.

🧠 Prometheus Core Concepts

| Component | Description |
|---|---|
| Time Series | A sequence of timestamped values identified by a metric name and optional labels |
| Labels | Key-value pairs attached to metrics (e.g., `instance="server1"`) |
| PromQL | Prometheus Query Language, used to filter, aggregate, and analyze time series |
| Targets | The endpoints Prometheus scrapes metrics from |
| Exporter | Tool that exposes metrics from a system in Prometheus format (e.g., node_exporter) |
| Scraping | Prometheus collects metrics from targets over HTTP |
| Alertmanager | Component that manages alerts triggered by rules |

⚙️ Prometheus Architecture

+----------------+     +---------------+     +---------------------+
|   Exporters    | <-- |   Prometheus  | --> |   Alertmanager      |
| (e.g., node)   |     |    Server     |     | (Email, Slack, etc) |
+----------------+     +---------------+     +---------------------+
         ↑                        |
         |                        ↓
     App / Service         Prometheus UI / Grafana

🔌 Exporters

Exporters expose metrics in a format Prometheus understands.

Common Exporters:

| Exporter | Monitors |
|---|---|
| node_exporter | CPU, memory, disk, network of Linux/Unix nodes |
| blackbox_exporter | HTTP, HTTPS, DNS, TCP endpoints |
| mysqld_exporter | MySQL metrics |
| cadvisor | Container metrics (CPU, memory, I/O) |
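
To make "a format Prometheus understands" concrete, here is a minimal sketch of a hand-rolled exporter using only the Python standard library. The metric name `demo_requests_total`, the `path` label, port 8000, and the helpers `render_metrics` and `serve` are all hypothetical names chosen for illustration; real exporters usually rely on an official client library instead.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory counter values, keyed by a "path" label.
REQUEST_COUNTS = {"/": 3, "/login": 1}


def render_metrics(counts):
    """Render counters in the Prometheus text exposition format."""
    lines = [
        "# HELP demo_requests_total Requests handled, by path.",
        "# TYPE demo_requests_total counter",
    ]
    for path, value in sorted(counts.items()):
        # One sample per line: name{label="value"} number
        lines.append(f'demo_requests_total{{path="{path}"}} {value}')
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Serves the rendered metrics at /metrics and 404s everything else."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(REQUEST_COUNTS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, serving metrics on the given (assumed) port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()


# Call serve() to start listening; it is left uncalled so importing stays side-effect free.
print(render_metrics(REQUEST_COUNTS), end="")
```

With `serve()` running, adding `localhost:8000` as a target in a scrape job would let Prometheus collect these samples like those of any other exporter.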

🛠️ Installing Prometheus (Manual Setup)

1️⃣ Download Prometheus

wget https://github.com/prometheus/prometheus/releases/download/v2.51.0/prometheus-2.51.0.linux-amd64.tar.gz
tar -xvf prometheus-2.51.0.linux-amd64.tar.gz
cd prometheus-2.51.0.linux-amd64

2️⃣ Start Prometheus

./prometheus --config.file=prometheus.yml

By default, Prometheus listens on http://localhost:9090

📁 prometheus.yml – Configuration File

global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'node'
    static_configs:
      - targets: ['localhost:9100']

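scrape_interval: How often Prometheus scrapes its targets (set to 15s here; the built-in default is 1m).
job_name: Logical name for a group of targets; it is attached to every scraped series as the `job` label.
targets: List of host:port endpoints to scrape (Prometheus itself on 9090, node_exporter on 9100).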

🔎 Prometheus Web UI

Access: http://localhost:9090
Key features:
- Graph Explorer
- Target Status
- Alerts Page
- PromQL Query Editor
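
The expressions typed into the query editor can also be sent to the Prometheus HTTP API at `/api/v1/query`. The sketch below builds such a request and parses an instant-vector response. The helper names `build_query_url`, `parse_vector`, and `query` are hypothetical, and `SAMPLE` is a hand-written payload shaped like a real response, not output captured from a server.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    params = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def parse_vector(payload):
    """Map each series' sorted label pairs to its float value."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")
    return {
        tuple(sorted(series["metric"].items())): float(series["value"][1])
        for series in data["result"]
    }


def query(base_url, promql, timeout=5):
    """Run an instant query against a live server (needs network access)."""
    with urlopen(build_query_url(base_url, promql), timeout=timeout) as resp:
        return parse_vector(json.load(resp))


# Hand-written sample shaped like the response to an `up` query (illustrative only).
SAMPLE = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {"__name__": "up", "job": "node", "instance": "localhost:9100"},
                "value": [1700000000.0, "1"],
            },
        ],
    },
}

print(build_query_url("http://localhost:9090", "up"))
print(parse_vector(SAMPLE))
```

Sample values arrive as strings in the JSON response, which is why `parse_vector` converts them with `float`.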

📈 Prometheus + Grafana = ❤️

Prometheus stores data, but Grafana visualizes it beautifully.

🧰 Grafana Setup:

- Add Prometheus as a data source
- Import dashboards (e.g., Node Exporter Full)
- Create custom graphs using PromQL

🔍 PromQL Examples

| Query | Description |
|---|---|
| `up` | Whether each target was reachable on its last scrape (1 = up, 0 = down) |
| `node_cpu_seconds_total` | Cumulative CPU seconds, per CPU core and mode |
| `rate(http_requests_total[1m])` | Per-second request rate, averaged over the last minute |
| `avg(rate(container_cpu_usage_seconds_total[5m])) by (container)` | Average CPU usage per container over the last 5 minutes |
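
Since several of these queries rely on `rate()`, it may help to see roughly what it computes: the per-second average increase of a counter over the window, with a drop in value treated as a counter reset. The sketch below is a simplification under stated assumptions: the function name `simple_rate` and the sample data are hypothetical, and the extrapolation to the window edges that Prometheus applies is left out, so real results can differ slightly.

```python
def simple_rate(samples):
    """Approximate per-second rate from (timestamp_seconds, counter_value) pairs.

    Simplification: the real rate() also extrapolates to the window edges.
    """
    if len(samples) < 2:
        return None  # at least two samples are needed in the window
    increase = 0.0
    prev = samples[0][1]
    for _, value in samples[1:]:
        if value < prev:
            # Counter reset (e.g. process restart): the counter began again
            # from 0, so the whole new value counts as increase.
            increase += value
        else:
            increase += value - prev
        prev = value
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


# Four scrapes 15s apart; the counter resets between the 2nd and 3rd scrape.
window = [(0, 100), (15, 130), (30, 10), (45, 40)]
print(simple_rate(window))
```

Because resets are absorbed this way, `rate()` is meant for counters such as `http_requests_total`, not for gauges that legitimately go up and down.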

🚨 Alerting with Prometheus

Prometheus can trigger alerts based on rules and send them to Alertmanager.

📄 Sample Alert Rule:

groups:
- name: example
  rules:
  - alert: HighCPUUsage
    # avg by (instance) keeps the instance label that the summary below refers to
    expr: avg by (instance) (rate(node_cpu_seconds_total{mode="user"}[1m])) > 0.7
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: "High CPU usage on {{ $labels.instance }}"
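
The `for: 1m` clause means the expression has to stay true for a full minute before the alert fires; until then the alert is pending, and it drops back to inactive as soon as the expression stops matching. A minimal sketch of that behaviour for a single series follows. The function `alert_states` and the evaluation timestamps are hypothetical and only model the state transitions, not Prometheus's actual rule evaluator.

```python
def alert_states(evaluations, for_seconds):
    """Return the alert state after each (timestamp_seconds, expr_is_true) evaluation."""
    states = []
    active_since = None  # time at which the expression first became true
    for ts, is_true in evaluations:
        if not is_true:
            # Condition cleared: reset the timer and go back to inactive.
            active_since = None
            states.append("inactive")
            continue
        if active_since is None:
            active_since = ts
        held = ts - active_since
        states.append("firing" if held >= for_seconds else "pending")
    return states


# Evaluations every 15s; the condition clears briefly at t=45.
evals = [
    (0, True), (15, True), (30, True), (45, False),
    (60, True), (75, True), (90, True), (105, True), (120, True),
]
for (ts, _), state in zip(evals, alert_states(evals, for_seconds=60)):
    print(ts, state)
```

The brief dip at t=45 restarts the timer, which is how `for` filters out short spikes.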

📨 Alertmanager Overview

- Handles alert delivery
- Supports routing, grouping, inhibition
- Supports Email, Slack, PagerDuty, Webhooks

✅ Steps:

Download & configure Alertmanager.
Connect it with Prometheus in prometheus.yml.
Define alert rules and receivers.
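
When a webhook receiver is used, Alertmanager POSTs a JSON document containing an `alerts` list, each entry carrying `status`, `labels`, and `annotations`. The sketch below turns such a body into readable lines. The function `summarize_webhook` is a hypothetical helper, and `SAMPLE_BODY` is hand-written with only the fields used here; a real payload carries more.

```python
import json


def summarize_webhook(body):
    """Turn an Alertmanager webhook JSON body into one line per alert."""
    payload = json.loads(body)
    lines = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        lines.append(
            "[{status}] {name} ({severity}): {summary}".format(
                status=alert.get("status", "unknown"),
                name=labels.get("alertname", "?"),
                severity=labels.get("severity", "none"),
                summary=annotations.get("summary", ""),
            )
        )
    return lines


# Hand-written sample containing only the fields used above (illustrative only).
SAMPLE_BODY = json.dumps({
    "status": "firing",
    "alerts": [
        {
            "status": "firing",
            "labels": {
                "alertname": "HighCPUUsage",
                "severity": "warning",
                "instance": "server1",
            },
            "annotations": {"summary": "High CPU usage on server1"},
        }
    ],
})

for line in summarize_webhook(SAMPLE_BODY):
    print(line)
```

A function like this could sit behind any small HTTP endpoint that Alertmanager is configured to call.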

🧪 Real Use Cases

Infrastructure Monitoring:
- Servers, VMs, containers using node_exporter, cadvisor.
Application Monitoring:
- Request count, latency, error rate.
Kubernetes Monitoring:
- With the kube-prometheus-stack Helm chart (bundles Prometheus, Grafana, and exporters).
CI/CD Monitoring:
- Monitor Jenkins, deployments, build failures.
Business Metrics:
- Track user signups, API usage, revenue.

🔐 Security Considerations

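Prometheus has no authentication enabled by default. Since v2.24 it supports basic auth and TLS through a web configuration file (--web.config.file); a reverse proxy such as Nginx is a common alternative.
Enable HTTPS, either through the web configuration file or at the reverse proxy.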
Limit access to Prometheus and exporters via firewalls or VPN.

📚 Bonus Tools

| Tool | Use Case |
|---|---|
| PromLens | Visual builder for PromQL queries |
| Thanos | Long-term storage for Prometheus |
| VictoriaMetrics | Alternative TSDB for Prometheus |
| Pushgateway | For ephemeral jobs (e.g., cronjobs) |
| kube-prometheus | Prometheus stack for Kubernetes |

🔚 Final Thoughts

Prometheus is well suited to cloud-native, Kubernetes-based, and microservice systems.
It is lightweight, scalable, and powerful with PromQL and Grafana integration.
Combined with Alertmanager, it enables automated alerting and observability.