Prometheus - tintinpython/DevOps-Notes GitHub Wiki


Prometheus:

Prometheus is an open-source monitoring system that is especially well-suited for cloud-native environments, like Kubernetes. It can monitor the performance of your applications and services, and alert you if there are any issues. It has a powerful query language that allows you to analyze the data it collects, and it also has a rich set of integrations with other tools and systems. For example, you can use Prometheus to monitor the health of your Kubernetes cluster, and use its integration with Grafana to visualize the data it collects.

Prometheus Architecture

Prometheus is a monitoring system that consists of the following components:

  • A main server that scrapes and stores time series data
  • A query language, PromQL, used to retrieve and analyze the data
  • A set of exporters that collect metrics from various systems and applications
  • A set of alerting rules that can trigger notifications based on the data
  • Alertmanager, which handles the routing and suppression of alerts

The Prometheus server stores the data in a time series database and provides a web interface where you can query and graph the data using PromQL. The built-in interface is meant for ad hoc queries; for custom dashboards, Prometheus is usually paired with Grafana.

Alerting is split between two components. You define alerting rules that specify when an alert should fire, and the Prometheus server evaluates them. Firing alerts are sent to Alertmanager, a separate component, which groups, routes, and suppresses them and delivers the notifications.

[Diagram: the Prometheus architecture and how its components interact for monitoring and alerting.]

Here’s a curated set of Prometheus-related videos to help you master monitoring, metrics, and Kubernetes integration:


📊 Prometheus for DevOps & Kubernetes

  1. Prometheus Sharding in Kubernetes Explained
    Explains how to scale Prometheus using sharding in Kubernetes. Ideal for understanding horizontal scaling and performance optimization.

  2. Day-3 | Best Prometheus Explanation | Practical Hands on
    Covers Prometheus architecture, PromQL, Alertmanager, and custom metrics. Great for hands-on practitioners looking to build real-world setups.

  3. Getting Started with Prometheus | Minimal Setup (Download)
    Walks through downloading, configuring, and running Prometheus locally. Perfect for beginners who want to get up and running quickly.

  4. Day-2 | Metrics, Monitoring and Prometheus | Basics of
    Explains Prometheus fundamentals, time-series databases, Helm installation, and Grafana integration. Ideal for full-stack DevOps learners.

  5. Prometheus Operator in Kubernetes Explained
    Deep dive into the Prometheus Operator, kube-prometheus stack, and service monitors. Essential for Kubernetes-native monitoring setups.




This hands-on lab deploys Prometheus on Amazon EKS using Terraform, with the cluster and the monitoring stack kept in separate modules. It covers Helm integration, IAM roles, and monitoring best practices.


🚀 Project: Deploy Prometheus on EKS with Terraform + Helm

📦 Folder Structure

prometheus-on-eks/
├── main.tf
├── variables.tf
├── outputs.tf
├── eks/
│   ├── main.tf
│   ├── variables.tf
│   └── outputs.tf
└── prometheus/
    ├── main.tf
    ├── values.yaml       # Helm chart values
    └── outputs.tf

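🔹 Step 1: Provision EKS Cluster (Terraform)

In eks/main.tf:

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.0" # pinned: input names differ between major versions

  cluster_name    = var.cluster_name
  cluster_version = "1.29"
  subnet_ids      = var.subnet_ids # named "subnets" before module v18
  vpc_id          = var.vpc_id

  # Managed node groups; named "node_groups" before module v18
  eks_managed_node_groups = {
    default = {
      desired_size   = 2
      max_size       = 3
      min_size       = 1
      instance_types = ["t3.medium"]
    }
  }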

  tags = {
    Environment = "dev"
    Project     = "prometheus-monitoring"
  }
}
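
The module block above reads three input variables, and the Prometheus configuration later needs to know which cluster to target. A minimal sketch of what eks/variables.tf and eks/outputs.tf might contain follows. The descriptions are illustrative, and the output names assume a recent release of the EKS module (v19 or later), so check them against the version you pin.

```hcl
# eks/variables.tf (sketch): inputs referenced by the module block
variable "cluster_name" {
  type        = string
  description = "Name of the EKS cluster"
}

variable "vpc_id" {
  type        = string
  description = "ID of the VPC that hosts the cluster"
}

variable "subnet_ids" {
  type        = list(string)
  description = "Subnets for the control plane and worker nodes"
}

# eks/outputs.tf (sketch): values that other configurations may need
output "cluster_name" {
  value = module.eks.cluster_name
}

output "cluster_endpoint" {
  value = module.eks.cluster_endpoint
}
```

Typing the variables lets `terraform plan` reject a malformed value, such as a single string passed for subnet_ids, before anything is created.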

🔹 Step 2: Deploy Prometheus via Helm (Terraform + Helm Provider)

In prometheus/main.tf:

# Block syntax below is for Helm provider 2.x. It reads the local kubeconfig,
# so the kubeconfig must already point at the EKS cluster.
provider "helm" {
  kubernetes {
    config_path = "~/.kube/config"
  }
}

resource "helm_release" "prometheus" {
  name             = "prometheus"
  repository       = "https://prometheus-community.github.io/helm-charts"
  chart            = "kube-prometheus-stack"
  namespace        = "monitoring"
  create_namespace = true

  values = [
    file("${path.module}/values.yaml")
  ]
}
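
Relying on ~/.kube/config ties the deployment to whatever context happens to be active on the machine running Terraform. As an alternative sketch, the provider can take its connection details from the AWS provider's EKS data sources. This assumes Helm provider 2.x block syntax, an AWS provider already configured for the right account and region, and a cluster_name variable that is not part of the original files.

```hcl
# Hypothetical input: the name of the cluster created in Step 1
variable "cluster_name" {
  type = string
}

# Look up the cluster endpoint and CA certificate
data "aws_eks_cluster" "this" {
  name = var.cluster_name
}

# Fetch a short-lived authentication token for the cluster
data "aws_eks_cluster_auth" "this" {
  name = var.cluster_name
}

provider "helm" {
  kubernetes {
    host                   = data.aws_eks_cluster.this.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.this.certificate_authority[0].data)
    token                  = data.aws_eks_cluster_auth.this.token
  }
}
```

This would replace the provider block shown above rather than sit alongside it. The token is short-lived, so very long applies may need a fresh run.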

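🔹 Step 3: Customize values.yaml

prometheus:
  prometheusSpec:
    retention: "7d" # keep metrics for seven days
    # false: select ServiceMonitors regardless of their Helm release label
    serviceMonitorSelectorNilUsesHelmValues: false

grafana:
  adminPassword: "admin123" # demo value only; change before real use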
  service:
    type: LoadBalancer
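
Keeping the Grafana password in values.yaml means it ends up in version control. One possible way around that, sketched here under the assumption of Helm provider 2.x, is to pass the password through a sensitive Terraform variable and override the chart value with a set_sensitive block. The variable name grafana_admin_password is made up for this example, and the resource is a variant of the Step 2 resource, not an additional one.

```hcl
# Hypothetical variable; supply it outside the repository
variable "grafana_admin_password" {
  type      = string
  sensitive = true # redacted in plan and apply output
}

resource "helm_release" "prometheus" {
  name             = "prometheus"
  repository       = "https://prometheus-community.github.io/helm-charts"
  chart            = "kube-prometheus-stack"
  namespace        = "monitoring"
  create_namespace = true

  values = [
    file("${path.module}/values.yaml")
  ]

  # Takes precedence over grafana.adminPassword in values.yaml
  set_sensitive {
    name  = "grafana.adminPassword"
    value = var.grafana_admin_password
  }
}
```

The value can be supplied through the TF_VAR_grafana_admin_password environment variable. It is still written to the Terraform state, so the state backend needs to be protected as well.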

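🔹 Step 4: Outputs

In prometheus/outputs.tf. The helm_release resource does not expose Service load balancer details, so the hostname is read from the Grafana Service itself. This needs the kubernetes provider configured for the same cluster, and the Service name assumes the chart's default naming for a release called "prometheus":

data "kubernetes_service" "grafana" {
  metadata {
    name      = "prometheus-grafana"
    namespace = "monitoring"
  }

  # Read the Service only after the chart has been installed
  depends_on = [helm_release.prometheus]
}

output "grafana_url" {
  value = data.kubernetes_service.grafana.status[0].load_balancer[0].ingress[0].hostname
}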

🛡️ Security & Best Practices

  • Use IRSA (IAM Roles for Service Accounts) for Prometheus scraping permissions.
  • Enable TLS and authentication for Grafana.
  • Configure Alertmanager for Slack/email notifications.
  • Use ServiceMonitors to scrape custom app metrics.
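
To illustrate the last point, a ServiceMonitor can be declared from Terraform with the kubernetes provider's kubernetes_manifest resource. This is only a sketch: the application name, label, namespace, and port name are placeholders, the kubernetes provider must be configured for the cluster, and the ServiceMonitor CRD (installed by kube-prometheus-stack) has to exist before Terraform can plan this resource.

```hcl
resource "kubernetes_manifest" "example_app_monitor" {
  manifest = {
    apiVersion = "monitoring.coreos.com/v1"
    kind       = "ServiceMonitor"
    metadata = {
      name      = "example-app" # placeholder name
      namespace = "monitoring"
    }
    spec = {
      # Match Services carrying this label (placeholder)
      selector = {
        matchLabels = {
          app = "example-app"
        }
      }
      # Namespace where the application's Service lives (placeholder)
      namespaceSelector = {
        matchNames = ["default"]
      }
      # Scrape the Service port named "metrics" every 30 seconds
      endpoints = [
        {
          port     = "metrics"
          path     = "/metrics"
          interval = "30s"
        }
      ]
    }
  }
}
```

Because values.yaml sets serviceMonitorSelectorNilUsesHelmValues to false, this ServiceMonitor should be picked up without a release label.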



The same project can also be packaged as a GitHub repository, with the modular layout above plus a README that documents deployment.


📁 Project Structure: prometheus-on-eks

prometheus-on-eks/
├── README.md
├── main.tf
├── variables.tf
├── outputs.tf
├── eks/
│   ├── main.tf
│   ├── variables.tf
│   └── outputs.tf
└── prometheus/
    ├── main.tf
    ├── values.yaml
    └── outputs.tf

📘 README.md

# Prometheus on Amazon EKS with Terraform + Helm

This project deploys a production-ready Prometheus stack on Amazon EKS using Terraform and Helm. It includes Grafana, Alertmanager, and ServiceMonitors.

## 🔧 Prerequisites

- Terraform ≥ 1.5
- AWS CLI configured
- kubectl installed
- Helm installed

## 🚀 Deployment Steps

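### 1. Clone the Repo

```bash
git clone https://github.com/your-username/prometheus-on-eks.git
cd prometheus-on-eks
```

### 2. Initialize Terraform

The eks/ and prometheus/ folders are applied separately, so each one needs its own initialization:

```bash
(cd eks && terraform init)
(cd prometheus && terraform init)
```

### 3. Set Variables

Update variables.tf with your VPC ID, subnet IDs, and desired cluster name.

### 4. Apply EKS Module

```bash
cd eks
terraform apply -auto-approve
```

### 5. Deploy Prometheus via Helm

The Helm provider reads ~/.kube/config, so point your kubeconfig at the new cluster first, for example with `aws eks update-kubeconfig --name <cluster-name> --region <region>`.

```bash
cd ../prometheus
terraform apply -auto-approve
```

### 6. Access Grafana

Get the LoadBalancer URL:

```bash
terraform output grafana_url
```

Login with:

  • Username: admin
  • Password: admin123 (set in values.yaml; change it for anything beyond a demo)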

## 📦 Features

  • Modular EKS provisioning
  • Prometheus + Grafana via Helm
  • Custom values.yaml for retention and service type
  • Dedicated monitoring namespace for the stack
  • Ready for ServiceMonitor integration

## 🛡️ Next Steps

  • Add IRSA for Prometheus scraping
  • Configure Alertmanager for Slack/email
  • Add ServiceMonitors for app metrics
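
As a starting point for the alerting step, an alerting rule can be declared as a PrometheusRule through the kubernetes provider's kubernetes_manifest resource. The rule below is a generic sketch, not part of the attached files: the names and the five-minute threshold are placeholders, and the release label assumes the chart's default rule selector for a release named "prometheus". The PrometheusRule CRD must already exist in the cluster when Terraform plans this resource.

```hcl
resource "kubernetes_manifest" "instance_down_rule" {
  manifest = {
    apiVersion = "monitoring.coreos.com/v1"
    kind       = "PrometheusRule"
    metadata = {
      name      = "example-instance-down" # placeholder name
      namespace = "monitoring"
      labels = {
        release = "prometheus" # assumed to match the chart's default rule selector
      }
    }
    spec = {
      groups = [
        {
          name = "example.rules"
          rules = [
            {
              alert = "InstanceDown"
              expr  = "up == 0" # PromQL: the scrape target is unreachable
              "for" = "5m"      # quoted so it is not read as an HCL for expression
              labels = {
                severity = "critical"
              }
              annotations = {
                summary = "Target {{ $labels.instance }} has been down for 5 minutes"
              }
            }
          ]
        }
      ]
    }
  }
}
```

Prometheus evaluates the rule and hands firing alerts to Alertmanager; Slack or email delivery is then a matter of Alertmanager receiver configuration in the chart values.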

---
All files for the prometheus-on-eks project are attached to this page. You can copy them directly into your GitHub repo or local workspace.

Prometheus Terraform Modules
----------------
[main.tf.txt](https://github.com/user-attachments/files/21747528/main.tf.txt)
***
[outputs.tf.txt](https://github.com/user-attachments/files/21747546/outputs.tf.txt)
***
[variables.tf.txt](https://github.com/user-attachments/files/21747547/variables.tf.txt)
***
[eks.mainl.tf.txt](https://github.com/user-attachments/files/21747556/eks.mainl.tf.txt)
***
[eks.variables.tf.txt](https://github.com/user-attachments/files/21747575/eks.variables.tf.txt)
***
[eks.outputs.tf.txt](https://github.com/user-attachments/files/21747596/eks.outputs.tf.txt)
***
[prometheus.main.tf.txt](https://github.com/user-attachments/files/21747579/prometheus.main.tf.txt)
***
[prometheus.values.yml.txt](https://github.com/user-attachments/files/21747581/prometheus.values.yml.txt)
***
[prometheus.outputs.tf.txt](https://github.com/user-attachments/files/21747600/prometheus.outputs.tf.txt)

***

--------------------


Once you have created the folder structure, run the following steps:

- Run `terraform init` in each module folder.
- Apply EKS first (`terraform apply` in `eks/`), then Prometheus (`terraform apply` in `prometheus/`).
- Use `kubectl get svc -n monitoring` to verify the Grafana and Prometheus services.