Prometheus - pranavkumarpk01/MD-DevOps GitHub Wiki
What is Prometheus?
Prometheus is an open-source monitoring and alerting toolkit built for reliability and scalability. It records real-time metrics in a time series database and collects them with a pull-based model over HTTP.
Why is Prometheus Required?
Modern applications are:
- Distributed across multiple services and environments.
- Dynamic, with autoscaling and microservices.
- Resilient, but need real-time observability for performance and error monitoring.
Prometheus Solves:
- Monitoring CPU, memory, network, disk usage, etc.
- Tracking application metrics (latency, error rate, request count).
- Triggering alerts based on threshold breaches.
- Providing a query language (PromQL) for detailed metric analysis.
Prometheus Core Concepts
Component | Description |
---|---|
Time Series | Data points with a timestamp, metric name, and optional labels |
Labels | Key-value pairs attached to metrics (e.g., instance="server1") |
PromQL | Prometheus Query Language to filter, aggregate, and analyze time series |
Targets | The endpoints Prometheus scrapes metrics from |
Exporter | Tool to expose metrics from a system in Prometheus format (e.g., node_exporter) |
Scraping | Prometheus collects metrics via HTTP from exporters |
Alertmanager | Component to manage alerts triggered by rules |
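To make the time series and label concepts concrete, here is a minimal sketch that splits one sample line of the Prometheus text exposition format into its metric name, labels, value, and optional timestamp. The function name parse_sample and the sample metric are illustrative assumptions, and the parser is deliberately simplified: it does not unescape label values or handle comment lines.

```python
import re

# One sample line looks like: metric_name{label="value",...} value [timestamp_ms]
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'   # metric name
    r'(?:\{(?P<labels>.*)\})?'               # optional label set
    r'\s+(?P<value>\S+)'                     # sample value
    r'(?:\s+(?P<ts>-?\d+))?$'                # optional timestamp in milliseconds
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Parse a single sample line (hypothetical helper, simplified)."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a sample line: {line!r}")
    timestamp = match.group("ts")
    return {
        "name": match.group("name"),
        "labels": dict(LABEL_RE.findall(match.group("labels") or "")),
        "value": float(match.group("value")),
        "timestamp_ms": int(timestamp) if timestamp else None,
    }


sample = parse_sample('http_requests_total{method="GET",instance="server1"} 1027')
print(sample["name"], sample["labels"], sample["value"])
```

Each distinct combination of metric name and label values identifies its own time series, which is why labels such as instance are so useful for filtering in PromQL.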
Prometheus Architecture
+----------------+     +---------------+     +---------------------+
|   Exporters    | <-- |  Prometheus   | --> |    Alertmanager     |
|  (e.g., node)  |     |    Server     |     | (Email, Slack, etc) |
+----------------+     +---------------+     +---------------------+
        ↑                      |
        |                      ↓
  App / Service     Prometheus UI / Grafana
Exporters
Exporters expose metrics in a format Prometheus understands.
Common Exporters:
Exporter | Monitors |
---|---|
node_exporter | CPU, memory, disk, network of Linux/Unix nodes |
blackbox_exporter | HTTP, HTTPS, DNS, TCP endpoints |
mysqld_exporter | MySQL metrics |
cadvisor | Container metrics (CPU, memory, I/O) |
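As a rough illustration of what an exporter does, the sketch below serves a /metrics endpoint using only the Python standard library. The names render_metrics, collect, serve, the demo metrics, and port 8000 are all assumptions made for this example. Real exporters are usually built with an official client library, which also handles label escaping and metric metadata.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(samples):
    """Render (name, labels, value) tuples as text exposition lines.

    Simplified: label values are not escaped and no HELP/TYPE lines are emitted.
    """
    lines = []
    for name, labels, value in samples:
        if labels:
            pairs = ",".join(f'{key}="{val}"' for key, val in sorted(labels.items()))
            lines.append(f"{name}{{{pairs}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


def collect():
    """Hypothetical collector; a real exporter would read system or app state here."""
    return [
        ("demo_up", {}, 1),
        ("demo_requests_total", {"method": "GET"}, 3),
    ]


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(collect()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block and serve /metrics until interrupted (port is an assumption)."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() and adding localhost:8000 as a target would let Prometheus scrape these demo metrics.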
Installing Prometheus (Manual Setup)
1. Download Prometheus
wget https://github.com/prometheus/prometheus/releases/download/v2.51.0/prometheus-2.51.0.linux-amd64.tar.gz
tar -xvf prometheus-2.51.0.linux-amd64.tar.gz
cd prometheus-2.51.0.linux-amd64
2. Start Prometheus
./prometheus --config.file=prometheus.yml
Prometheus starts at http://localhost:9090
prometheus.yml - Configuration File
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'node'
    static_configs:
      - targets: ['localhost:9100']
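- scrape_interval: How often to scrape targets (15s in this example; Prometheus falls back to 1m when it is not set).
- job_name: Logical name for a group of targets.
- targets: List of host:port endpoints to scrape (exporters or instrumented applications).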
Prometheus Web UI
- Access: http://localhost:9090
- Key features:
  - Graph Explorer
  - Target Status
  - Alerts Page
  - PromQL Query Editor
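Expressions typed into the query editor can also be sent to the server's HTTP API at /api/v1/query. The following is a minimal sketch, assuming a server at http://localhost:9090 as in the setup above. The helper names build_query_url and instant_query are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url, promql):
    """Build the instant-query URL for a PromQL expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def instant_query(base_url, promql):
    """Run an instant query and return the result list (needs a running server)."""
    with urlopen(build_query_url(base_url, promql), timeout=10) as response:
        payload = json.load(response)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload}")
    return payload["data"]["result"]


print(build_query_url("http://localhost:9090", "up"))
```

With Prometheus running, instant_query("http://localhost:9090", "up") would return one entry per scraped target.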
Prometheus + Grafana = ❤️
Prometheus stores and queries the metrics, while Grafana turns them into dashboards.
Grafana Setup:
- Add Prometheus as a data source
- Import dashboards (e.g., Node Exporter Full)
- Create custom graphs using PromQL
PromQL Examples
Query | Description |
---|---|
up | Returns whether the targets are reachable (1 = up, 0 = down) |
node_cpu_seconds_total | Total CPU seconds per mode |
rate(http_requests_total[1m]) | Requests per second |
avg(rate(container_cpu_usage_seconds_total[5m])) by (container) | Average CPU per container |
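To give an intuition for what rate() computes over a counter, here is a simplified sketch: it sums the increases between consecutive samples in the window, treats a drop in value as a counter reset, and divides by the elapsed time. This is an approximation for illustration only, since Prometheus additionally extrapolates toward the window boundaries. The function name simple_rate and the sample data are assumptions.

```python
def simple_rate(samples):
    """Approximate per-second rate of a counter.

    samples: list of (timestamp_seconds, counter_value), oldest first.
    Returns None when fewer than two samples are available.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    previous = samples[0][1]
    for _, value in samples[1:]:
        if value < previous:
            # Counter reset: the process restarted and began counting from zero.
            increase += value
        else:
            increase += value - previous
        previous = value
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


# Five scrapes, 15 seconds apart, with the counter growing by 30 each time.
window = [(0, 100), (15, 130), (30, 160), (45, 190), (60, 220)]
print(simple_rate(window))
```

Because the result is a per-second average over the window, a longer range such as [5m] gives a smoother but slower-reacting graph than [1m].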
Alerting with Prometheus
Prometheus can trigger alerts based on rules and send them to Alertmanager.
Sample Alert Rule:
groups:
  - name: example
    rules:
      - alert: HighCPUUsage
        # Aggregate per instance so that $labels.instance is available in the summary
        expr: avg by (instance) (rate(node_cpu_seconds_total{mode="user"}[1m])) > 0.7
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage on {{ $labels.instance }}"
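The for: 1m clause means the expression must stay true for a full minute before the alert fires. Until then the alert is only pending. The sketch below simulates that behaviour over a series of rule evaluations. The function name alert_states and the evaluation timestamps are assumptions for illustration, not Prometheus code.

```python
def alert_states(evaluations, for_seconds):
    """Return the alert state after each rule evaluation.

    evaluations: list of (timestamp_seconds, condition_is_true), in order.
    States are 'inactive', 'pending', or 'firing'.
    """
    states = []
    active_since = None
    for timestamp, condition_is_true in evaluations:
        if not condition_is_true:
            # Any false evaluation resets the timer.
            active_since = None
            states.append("inactive")
            continue
        if active_since is None:
            active_since = timestamp
        held_for = timestamp - active_since
        states.append("firing" if held_for >= for_seconds else "pending")
    return states


evaluations = [(0, False), (15, True), (30, True), (45, True),
               (60, True), (75, True), (90, False)]
print(alert_states(evaluations, for_seconds=60))
```

A short spike that clears before the hold period ends never reaches the firing state, which keeps brief blips from paging anyone.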
Alertmanager Overview
- Handles alert delivery
- Supports routing, grouping, inhibition
- Supports Email, Slack, PagerDuty, Webhooks
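Grouping is the idea that related alerts are bundled into one notification instead of many. As a rough sketch of the bucketing step only, the code below groups alerts by a chosen set of label names. The function name group_alerts and the alert dictionaries are assumptions. Alertmanager's real behaviour is driven by its routing configuration and also involves timing, which is not modelled here.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Bucket alerts by the values of the labels listed in group_by."""
    groups = defaultdict(list)
    for alert in alerts:
        key = tuple((label, alert["labels"].get(label, "")) for label in group_by)
        groups[key].append(alert)
    return dict(groups)


alerts = [
    {"labels": {"alertname": "HighCPUUsage", "instance": "server1"}},
    {"labels": {"alertname": "HighCPUUsage", "instance": "server2"}},
    {"labels": {"alertname": "DiskFull", "instance": "server1"}},
]
for key, members in group_alerts(alerts, ["alertname"]).items():
    print(key, len(members))
```

Grouping by alertname here would yield a single HighCPUUsage notification covering both servers and a separate one for DiskFull.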
Steps:
- Download and configure Alertmanager.
- Connect it with Prometheus in prometheus.yml.
- Define alert rules and receivers.
Real Use Cases
- Infrastructure Monitoring:
  - Servers, VMs, containers using node_exporter, cadvisor.
- Application Monitoring:
  - Request count, latency, error rate.
- Kubernetes Monitoring:
  - With kube-prometheus-stack (bundles Prometheus, Grafana, and exporters).
- CI/CD Monitoring:
  - Monitor Jenkins, deployments, build failures.
- Business Metrics:
  - Track user signups, API usage, revenue.
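Security Considerations
- Prometheus runs without authentication by default. Basic auth and TLS can be enabled through a web configuration file (--web.config.file), or a reverse proxy such as Nginx can be placed in front.
- Enable HTTPS, either through that web configuration file or at the reverse proxy.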
- Limit access to Prometheus and exporters via firewalls or VPN.
Bonus Tools
Tool | Use Case |
---|---|
PromLens | Visual builder for PromQL queries |
Thanos | Long-term storage for Prometheus |
VictoriaMetrics | Alternative TSDB for Prometheus |
Pushgateway | For ephemeral jobs (e.g., cronjobs) |
kube-prometheus | Prometheus stack for Kubernetes |
Final Thoughts
- Prometheus is well suited to cloud-native, Kubernetes-based, or microservice systems.
- It is lightweight to run, and PromQL together with Grafana integration makes it powerful for analysis.
- Combined with Alertmanager, it enables automated alerting and observability.