3 ‐ Alert - CPNV-ES-MON1/Prometheus GitHub Wiki

Alertmanager

Version used: 0.27.0

Prerequisites

Download Alertmanager 0.27.0 and install the binaries

wget https://github.com/prometheus/alertmanager/releases/download/v0.27.0/alertmanager-0.27.0.linux-amd64.tar.gz
--2024-06-10 07:11:32--  https://github.com/prometheus/alertmanager/releases/download/v0.27.0/alertmanager-0.27.0.linux-amd64.tar.gz
Resolving github.com (github.com)... 140.82.116.4
Connecting to github.com (github.com)|140.82.116.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/11452538/18333c17-a97b-4a1d-84f7-3562435ca553?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=releaseassetproduction%2F20240610%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240610T071132Z&X-Amz-Expires=300&X-Amz-Signature=ba523c961ede794ff88a0ac58e8ccffd641bcc08f4cbd7f5ccbd348113159622&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=11452538&response-content-disposition=attachment%3B%20filename%3Dalertmanager-0.27.0.linux-amd64.tar.gz&response-content-type=application%2Foctet-stream [following]
--2024-06-10 07:11:32--  https://objects.githubusercontent.com/github-production-release-asset-2e65be/11452538/18333c17-a97b-4a1d-84f7-3562435ca553?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=releaseassetproduction%2F20240610%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240610T071132Z&X-Amz-Expires=300&X-Amz-Signature=ba523c961ede794ff88a0ac58e8ccffd641bcc08f4cbd7f5ccbd348113159622&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=11452538&response-content-disposition=attachment%3B%20filename%3Dalertmanager-0.27.0.linux-amd64.tar.gz&response-content-type=application%2Foctet-stream
Resolving objects.githubusercontent.com (objects.githubusercontent.com)... 185.199.110.133, 185.199.111.133, 185.199.108.133, ...
Connecting to objects.githubusercontent.com (objects.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 30866868 (29M) [application/octet-stream]
Saving to: ‘alertmanager-0.27.0.linux-amd64.tar.gz’

alertmanager-0.27.0.linux-amd 100%[=================================================>]  29.44M  22.9MB/s    in 1.3s

2024-06-10 07:11:34 (22.9 MB/s) - ‘alertmanager-0.27.0.linux-amd64.tar.gz’ saved [30866868/30866868]
tar -xvf alertmanager-0.27.0.linux-amd64.tar.gz
alertmanager-0.27.0.linux-amd64/
alertmanager-0.27.0.linux-amd64/alertmanager
alertmanager-0.27.0.linux-amd64/alertmanager.yml
alertmanager-0.27.0.linux-amd64/NOTICE
alertmanager-0.27.0.linux-amd64/amtool
alertmanager-0.27.0.linux-amd64/LICENSE
cd alertmanager-0.27.0.linux-amd64
sudo mv alertmanager /usr/local/bin/
sudo mv amtool /usr/local/bin/
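
Before moving the binaries into place, it can be worth checking the integrity of the downloaded archive. The sketch below assumes GNU `sha256sum` is available. `verify_sha256` is a hypothetical helper, and the expected digest is a placeholder to be taken from the checksum file on the release page, if one is provided.

```shell
# Hypothetical helper: succeed only if FILE's SHA-256 digest equals EXPECTED.
verify_sha256() {
  file="$1"
  expected="$2"
  # sha256sum prints "<digest>  <file>"; keep only the digest.
  actual="$(sha256sum "$file" | awk '{print $1}')"
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file" >&2
    return 1
  fi
}

# Example call; <EXPECTED_SHA256> is a placeholder, not a real digest.
# verify_sha256 alertmanager-0.27.0.linux-amd64.tar.gz "<EXPECTED_SHA256>"
```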

Create a user for Alertmanager

sudo useradd --no-create-home --shell /bin/false alertmanager
sudo mkdir -p /app/alertmanager
sudo mkdir -p /var/lib/alertmanager
sudo chown -R alertmanager:alertmanager /app/alertmanager /var/lib/alertmanager
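
To confirm that the directories ended up with the intended owner, a small check along these lines may help. It assumes GNU `stat`; `check_owner` is a hypothetical helper, not part of Alertmanager.

```shell
# Hypothetical helper: succeed only if TARGET is owned by the expected user:group.
check_owner() {
  target="$1"
  expected="$2"
  # GNU stat: %U is the owning user name, %G the owning group name.
  actual="$(stat -c '%U:%G' "$target")"
  if [ "$actual" = "$expected" ]; then
    echo "ok: $target is owned by $actual"
  else
    echo "unexpected owner for $target: $actual" >&2
    return 1
  fi
}

# Example calls for the directories created above:
# check_owner /app/alertmanager alertmanager:alertmanager
# check_owner /var/lib/alertmanager alertmanager:alertmanager
```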

Setup the service

Create a configuration file for Alertmanager

sudo nano /app/alertmanager/alertmanager.yml

Configuration for sending notifications to Discord

global:
  resolve_timeout: 5m

route:
  receiver: 'discord'
  group_wait: 5s
  group_interval: 5s
  repeat_interval: 3m

receivers:
- name: 'discord'
  discord_configs:
  - webhook_url: '<WEBHOOK_URL>'
    send_resolved: true
    title: '{{ template "discord.notification.title" . }}'
    message: '{{ template "discord.notification.description" . }}'

templates:
- '/app/alertmanager/alertmanager-templates.tmpl'

To obtain the webhook URL:

  1. On Discord, create a new server or use an existing one
  2. Set up a dedicated channel for receiving alerts
  3. Click on the settings for the channel you created
  4. Go to the "Integrations" section and create a new webhook
  5. Copy the webhook URL
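
Once the URL is copied, the webhook can be tried directly before Alertmanager is involved. The sketch below assumes `curl` is installed and that the webhook accepts a JSON body with a `content` field. `discord_payload` and `send_discord_test` are hypothetical helper names, and the message must not contain double quotes, backslashes, or newlines because no JSON escaping is done.

```shell
# Build a minimal JSON body for a Discord webhook.
# Assumption: the message needs no JSON escaping (no quotes, backslashes, newlines).
discord_payload() {
  printf '{"content":"%s"}' "$1"
}

# Hypothetical helper: POST a test message ($2) to the webhook URL ($1).
send_discord_test() {
  curl -fsS -X POST \
    -H "Content-Type: application/json" \
    -d "$(discord_payload "$2")" \
    "$1"
}

# Example call; replace <WEBHOOK_URL> with the URL copied from Discord.
# send_discord_test '<WEBHOOK_URL>' 'Test message from the Alertmanager host'
```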


Template for notifications

Create a template to customize notification messages

sudo nano /app/alertmanager/alertmanager-templates.tmpl
{{ define "discord.notification.title" }}
Alert: {{ .CommonLabels.alertname }}
{{ end }}

{{ define "discord.notification.description" }}
Hello,

{{ if gt ( .Alerts.Firing | len) 0 }}
We have detected a problem on the machine **{{ .CommonLabels.machine_name }}**.

The machine is temporarily unavailable.

We are working to resolve the issue.
{{ end }}

{{ if gt ( .Alerts.Resolved | len) 0 }}
The alert **{{ .CommonLabels.alertname }}** on the machine **{{ .CommonLabels.machine_name }}** has been resolved.

The machine is operational.
{{ end }}

Regards,
The IT team.
{{ end }}
sudo chown alertmanager:alertmanager /app/alertmanager/alertmanager.yml /app/alertmanager/alertmanager-templates.tmpl
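
With the configuration and template files both in place, the configuration can be validated before the service is created. This sketch uses `amtool check-config` and assumes `amtool` is on the PATH (it was moved to /usr/local/bin earlier). The wrapper name `check_alertmanager_config` is hypothetical.

```shell
# Validate an Alertmanager configuration file with amtool.
# Defaults to the path used in this guide when no argument is given.
check_alertmanager_config() {
  config="${1:-/app/alertmanager/alertmanager.yml}"
  if ! command -v amtool >/dev/null 2>&1; then
    echo "amtool not found in PATH" >&2
    return 127
  fi
  amtool check-config "$config"
}

# Example call:
# check_alertmanager_config /app/alertmanager/alertmanager.yml
```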

Create a systemd service file for Alertmanager

sudo nano /etc/systemd/system/alertmanager.service
[Unit]
Description=Alertmanager
Wants=network-online.target
After=network-online.target

[Service]
User=alertmanager
Group=alertmanager
ExecStart=/usr/local/bin/alertmanager --config.file=/app/alertmanager/alertmanager.yml --storage.path=/var/lib/alertmanager
Restart=always

[Install]
WantedBy=multi-user.target

Start the service and enable it

Reload systemd and enable Alertmanager service

sudo systemctl daemon-reload
sudo systemctl enable alertmanager
Created symlink /etc/systemd/system/multi-user.target.wants/alertmanager.service → /etc/systemd/system/alertmanager.service.
sudo systemctl start alertmanager
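
After starting the service, `sudo systemctl status alertmanager` shows whether the unit is running. As a further check, the sketch below polls Alertmanager's health endpoint, assuming the default listen address `localhost:9093` and an installed `curl`. `wait_for_healthy` is a hypothetical helper.

```shell
# Poll URL until it answers successfully, or give up after ATTEMPTS tries.
wait_for_healthy() {
  url="$1"
  attempts="${2:-10}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors; output is discarded.
    if curl -fsS "$url" >/dev/null 2>&1; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep 1
    i=$((i + 1))
  done
  echo "not healthy after $attempts attempt(s)" >&2
  return 1
}

# Example call against the default Alertmanager port:
# wait_for_healthy http://localhost:9093/-/healthy
```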

Update Prometheus server

Modify Prometheus configuration (prometheus.yml) to add Alertmanager

sudo nano /app/prometheus2.51.2/prometheus.yml

Add under alerting

alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - 'localhost:9093'

Also add the alert rules file under rule_files

rule_files:
  - "/app/prometheus2.51.2/alert.rules.yml"

Rules file

Create the alert rules file

sudo nano /app/prometheus2.51.2/alert.rules.yml
groups:
  - name: DebianMemory
    rules:
      - alert: DebianMemoryUsageOver60%
        expr: ((node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes) * 100 > 60
        for: 30s
        labels:
          severity: warning
          machine_name: "[{{ $labels.machinename }}]"
          description_client: "Memory usage too high"
        annotations:
          summary: "Instance [{{ $labels.instance }}] > 60% memory usage"
          description: "Memory usage is at {{ $value }}%"

  - name: WindowsMemory
    rules:
      - alert: WindowsMemoryUsageOver60%
        expr: ((windows_cs_physical_memory_bytes - windows_os_physical_memory_free_bytes) / windows_cs_physical_memory_bytes) * 100 > 60
        for: 30s
        labels:
          severity: warning
          machine_name: "[{{ $labels.machinename }}]"
          description_client: "Memory usage too high"
        annotations:
          summary: "Instance [{{ $labels.instance }}] > 60% memory usage"
          description: "Memory usage is at {{ $value }}%"
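
Before restarting Prometheus, the edited files can be checked with `promtool`, which ships in the Prometheus release archive. This sketch assumes `promtool` is on the PATH. The wrapper name `check_prometheus_files` is hypothetical, and the default paths are the ones used in this guide.

```shell
# Validate the Prometheus configuration and the alert rules file.
# The rules check runs only if the configuration check succeeds.
check_prometheus_files() {
  config="${1:-/app/prometheus2.51.2/prometheus.yml}"
  rules="${2:-/app/prometheus2.51.2/alert.rules.yml}"
  promtool check config "$config" && promtool check rules "$rules"
}

# Example call with the default paths:
# check_prometheus_files
```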

sudo systemctl restart prometheus
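
To exercise the whole notification path without waiting for real memory pressure, a test alert can be pushed to Alertmanager with `amtool alert add`. The sketch assumes Alertmanager listens on `localhost:9093`. The helper name `send_test_alert` and the label values `TestAlert` and `test-host` are hypothetical.

```shell
# Push a manual test alert to Alertmanager.
# machine_name is set because the notification template reads that label.
send_test_alert() {
  url="${1:-http://localhost:9093}"
  amtool alert add alertname=TestAlert severity=warning machine_name=test-host \
    --annotation=summary="Manual test alert" \
    --alertmanager.url="$url"
}

# Example call against the default address:
# send_test_alert
```

If the routing and receiver are set up as described, a message should appear in the Discord channel after `group_wait` has elapsed.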