4 ‐ Act - CPNV-ES-MON1/Prometheus GitHub Wiki
mysqld_exporter
Install the exporter so that Prometheus can monitor the MySQL service status; the service is then restarted when it is down (see Service Restarter).
Version used: 0.15.1
Prerequisites
sudo mysql -e "CREATE USER 'mysqld_exporter'@'localhost' IDENTIFIED BY 'password' WITH MAX_USER_CONNECTIONS 3;
GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'mysqld_exporter'@'localhost';"
sudo useradd mysqld_exporter
Set up the service
wget https://github.com/prometheus/mysqld_exporter/releases/download/v0.15.1/mysqld_exporter-0.15.1.linux-amd64.tar.gz
tar -xvf mysqld_exporter-0.15.1.linux-amd64.tar.gz
sudo mv mysqld_exporter-0.15.1.linux-amd64/mysqld_exporter /usr/bin/
cat <<EOF | sudo tee /etc/systemd/system/mysqld_exporter.service
[Unit]
Description=MySQL Exporter Service
Wants=network.target
After=network.target
[Service]
User=mysqld_exporter
Group=mysqld_exporter
Environment="DATA_SOURCE_NAME=mysqld_exporter:password@tcp(127.0.0.1:3306)"
Type=simple
ExecStart=/usr/bin/mysqld_exporter --config.my-cnf "/etc/mysqld_exporter/.my.cnf"
Restart=always
[Install]
WantedBy=multi-user.target
EOF
sudo mkdir /etc/mysqld_exporter
cat <<EOF | sudo tee /etc/mysqld_exporter/.my.cnf
[client]
user=mysqld_exporter
password=password
EOF
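Before starting the service, it can help to confirm that the credentials file parses and that its [client] section holds both keys. The following is a minimal Python sketch, assuming the file layout shown above; the helper name `check_client_section` is hypothetical and is not part of mysqld_exporter.

```python
import configparser


def check_client_section(cnf_text):
    """Return the required [client] entries missing from a my.cnf-style text."""
    # interpolation=None keeps '%' characters in passwords from being interpreted.
    parser = configparser.ConfigParser(allow_no_value=True, interpolation=None)
    parser.read_string(cnf_text)
    if not parser.has_section("client"):
        return ["[client] section"]
    return [
        key
        for key in ("user", "password")
        if not parser.get("client", key, fallback=None)
    ]


sample = """[client]
user=mysqld_exporter
password=password
"""
print(check_client_section(sample))  # [] when nothing is missing
```

To check the real file, pass the contents of /etc/mysqld_exporter/.my.cnf to the function instead of `sample`.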
Start the service and enable it
sudo systemctl daemon-reload
sudo systemctl enable mysqld_exporter.service
sudo systemctl start mysqld_exporter.service
sudo systemctl status mysqld_exporter.service
● mysqld_exporter.service - MySQL Exporter Service
Loaded: loaded (/etc/systemd/system/mysqld_exporter.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2024-06-11 10:38:41 CEST; 35min ago
Main PID: 3179 (mysqld_exporter)
Tasks: 5 (limit: 2327)
Memory: 12.0M
CGroup: /system.slice/mysqld_exporter.service
└─3179 /usr/bin/mysqld_exporter --config.my-cnf /etc/mysqld_exporter/.my.cnf
Update Prometheus server
sudo nano /app/prometheus2.51.2/prometheus.yml
Add the job configuration
  - job_name: "MySQL"
    static_configs:
      - targets: ["<MySQL Server IP>:9104"]
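Once the job is in place, Prometheus scrapes the exporter on port 9104 and, among other metrics, reads `mysql_up`. As a rough illustration of what that scrape contains, the Python sketch below pulls one unlabelled sample out of text in the Prometheus exposition format. The function name `read_gauge` and the sample text are hypothetical; the sample was not captured from a real exporter.

```python
from typing import Optional


def read_gauge(metrics_text: str, name: str) -> Optional[float]:
    """Return the value of an unlabelled sample called `name`, or None if absent."""
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        parts = line.split()
        if parts[0] == name and len(parts) >= 2:
            return float(parts[1])
    return None


# Illustrative sample, not captured from a real exporter.
sample = """# HELP mysql_up Whether the MySQL server is up.
# TYPE mysql_up gauge
mysql_up 1
"""
print(read_gauge(sample, "mysql_up"))  # 1.0
```

Against a live exporter, the text could be fetched with `urllib.request.urlopen("http://<MySQL Server IP>:9104/metrics").read().decode()` and passed to the same function.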
Service Restarter
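A Python script, run as a service, that restarts services on remote hosts. The script comes from the YFanha/service_restarter repository and is downloaded in the steps below.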
Prerequisites
sudo apt update
sudo apt install python3 python3-flask python3-pyyaml -y
sudo useradd -m restarter
sudo usermod -aG sudo restarter
sudo -u restarter ssh-keygen -q -t ed25519 -N '' -f /home/restarter/.ssh/id_ed25519
sudo cat /home/restarter/.ssh/id_ed25519.pub
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAICs1Ghs1Mb5dOTyHwiuJpFLn7PZx64WckjKI8jo4nci6 restarter@prom-srv-1
Script
Set up the log file
sudo touch /var/log/service_restarter.log
sudo chown restarter:restarter /var/log/service_restarter.log
sudo chmod 666 /var/log/service_restarter.log
Download the script
sudo curl https://raw.githubusercontent.com/YFanha/service_restarter/main/service_restarter -o /usr/bin/service_restarter
sudo chmod +x /usr/bin/service_restarter
sudo mkdir -p /etc/service_restarter
sudo curl https://raw.githubusercontent.com/YFanha/service_restarter/main/alertnames.yml.example -o /etc/service_restarter/alertnames.yml
cat /etc/service_restarter/alertnames.yml
alertnames:
- MySQLDown
- Apache2Down
- NginxDown
Edit /etc/service_restarter/alertnames.yml to list the alerts that should trigger a service restart.
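The alert names in this file act as an allow-list. The Python sketch below shows one way such a list could be applied to an incoming Alertmanager webhook payload: only firing alerts whose `alertname` is on the list are kept. The function name `alerts_to_handle` and the sample addresses are hypothetical, and the actual service_restarter script may work differently.

```python
def alerts_to_handle(payload, allowed_names):
    """Return (alertname, instance) pairs for firing alerts on the allow-list."""
    selected = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        name = labels.get("alertname")
        # Resolved notifications and unlisted alerts are ignored.
        if alert.get("status") == "firing" and name in allowed_names:
            selected.append((name, labels.get("instance")))
    return selected


allowed = {"MySQLDown", "Apache2Down", "NginxDown"}
# Reduced payload with illustrative addresses; real payloads carry more fields.
payload = {
    "status": "firing",
    "alerts": [
        {"status": "firing", "labels": {"alertname": "MySQLDown", "instance": "10.0.0.5:9104"}},
        {"status": "resolved", "labels": {"alertname": "NginxDown", "instance": "10.0.0.6:9113"}},
        {"status": "firing", "labels": {"alertname": "DiskFull", "instance": "10.0.0.5:9100"}},
    ],
}
print(alerts_to_handle(payload, allowed))  # [('MySQLDown', '10.0.0.5:9104')]
```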
Set up the service file to run the script in the background
cat <<EOM | sudo tee /etc/systemd/system/service_restarter.service
[Unit]
Description=Service Restarter for Prometheus Alerts
After=network.target
[Service]
User=restarter
ExecStart=/usr/bin/service_restarter
Restart=always
[Install]
WantedBy=multi-user.target
EOM
sudo systemctl daemon-reload
sudo systemctl enable service_restarter.service
sudo systemctl start service_restarter.service
sudo systemctl status service_restarter.service
● service_restarter.service - Service Restarter for Prometheus Alerts
Loaded: loaded (/etc/systemd/system/service_restarter.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2024-06-11 19:43:01 CEST; 1min 22s ago
Main PID: 2363 (python3)
Tasks: 1 (limit: 4514)
Memory: 17.8M
CPU: 109ms
CGroup: /system.slice/service_restarter.service
└─2363 python3 /usr/bin/service_restarter
Jun 11 19:43:01 prom-srv-1 systemd[1]: Started Service Restarter for Prometheus Alerts.
Jun 11 19:43:01 prom-srv-1 service_restarter[2363]: * Serving Flask app 'service_restarter'
Jun 11 19:43:01 prom-srv-1 service_restarter[2363]: * Debug mode: off
MySQL server (Client) configuration
Create the user that the monitoring server will use to restart the service on the client over SSH.
sudo useradd -m restarter
sudo usermod -aG sudo restarter
sudo mkdir -p /home/restarter/.ssh/
echo "restarter ALL=(ALL) NOPASSWD: ALL" | sudo tee -a /etc/sudoers
echo "<monitoring_server_user_pub_key>" | sudo tee -a /home/restarter/.ssh/authorized_keys
Example
On the monitoring server
sudo cat /home/restarter/.ssh/id_ed25519.pub
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAICs1Ghs1Mb5dOTyHwiuJpFLn7PZx64WckjKI8jo4nci6
On the monitored server whose service needs to be restarted
echo "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAICs1Ghs1Mb5dOTyHwiuJpFLn7PZx64WckjKI8jo4nci6" | sudo tee -a /home/restarter/.ssh/authorized_keys
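With the public key installed, the restarter user on the monitoring server can run commands on the client without a password. The Python sketch below shows how a restart command could be assembled from an alert's `instance` label. It is an assumption about the approach, not the script's actual code: the function name `build_restart_command`, the unit name `mysql`, the sample address, and the use of `sudo systemctl restart` are all hypothetical.

```python
def build_restart_command(instance, unit, user="restarter"):
    """Build an ssh argv that restarts `unit` on the host named in `instance`."""
    # The instance label looks like "host:port"; only the host part is needed.
    host = instance.rsplit(":", 1)[0] if ":" in instance else instance
    return [
        "ssh",
        "-o", "BatchMode=yes",  # fail instead of prompting for a password
        f"{user}@{host}",
        "sudo", "systemctl", "restart", unit,
    ]


print(build_restart_command("10.0.0.5:9104", "mysql"))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` by a process running as the restarter user.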
Alert rules
Add new alert rules
sudo nano /app/prometheus2.51.2/rules.d/mysql_alerts.yml
groups:
  - name: mysql_alerts
    rules:
      - alert: MySQLDown
        expr: mysql_up == 0
        for: 1m
        labels:
          severity: critical
          service: mysql
        annotations:
          summary: "MySQL service is down"
          description: "The MySQL service on instance {{ $labels.instance }} is down."
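The `for: 1m` clause means the alert only fires once `mysql_up == 0` has held continuously for one minute; a shorter outage leaves it pending and then clears it. The Python sketch below is a simplified model of that behaviour, not Prometheus code. The function name `first_firing_time` and the 15-second evaluation interval are assumptions made for illustration.

```python
def first_firing_time(samples, hold_seconds):
    """Return the evaluation timestamp at which the alert starts firing, or None.

    samples: list of (timestamp_seconds, mysql_up_value), one per rule evaluation.
    """
    active_since = None
    for ts, value in samples:
        if value == 0:
            if active_since is None:
                active_since = ts  # condition first seen: alert becomes pending
            if ts - active_since >= hold_seconds:
                return ts  # condition held for the whole `for` window
        else:
            active_since = None  # condition cleared: pending state resets
    return None


# Evaluations every 15 s (assumed); MySQL goes down at t=30 and stays down.
samples = [(0, 1), (15, 1), (30, 0), (45, 0), (60, 0), (75, 0), (90, 0), (105, 0)]
print(first_firing_time(samples, 60))  # 90
```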
Edit the Prometheus configuration file to load the new alert rules.
sudo nano /app/prometheus2.51.2/prometheus.yml
rule_files:
  - /app/prometheus2.51.2/rules.d/mysql_alerts.yml
Edit the alertmanager.yml file to add a new webhook
sudo nano /app/alertmanager/alertmanager.yml
global:
  resolve_timeout: 5m
route:
  receiver: 'discord'
  group_by: ['alertname']
  group_wait: 5s
  group_interval: 1m
  repeat_interval: 1m
  routes:
    - match:
        alertname: 'MySQLDown'
      receiver: 'restarter'
receivers:
  - name: 'discord'
    discord_configs:
      - webhook_url: 'https://discord.com/api/webhooks/1250010842678689855/hwwTMw_lZOTeD2GdAcxdGt6NRvmqJQx2R8V7NCysqWWBY8XxtnuwqFZeeATp-M6GtUfy'
        send_resolved: true
        title: '{{ template "discord.notification.title" . }}'
        message: '{{ template "discord.notification.description" . }}'
  - name: 'restarter'
    webhook_configs:
      - url: 'http://localhost:5001/' # Calling our own script running on port 5001 (flask app)
        send_resolved: true
templates:
  - '/app/alertmanager/alertmanager-templates.tmpl'
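To exercise the 'restarter' receiver without waiting for a real outage, a request imitating an Alertmanager webhook call can be built by hand. The Python sketch below only constructs the request and does not send it. The payload is a reduced version of the Alertmanager webhook format, and the function name `build_test_webhook` and the instance address are hypothetical.

```python
import json
import urllib.request


def build_test_webhook(url, alertname, instance):
    """Build (but do not send) a POST that imitates an Alertmanager webhook call."""
    body = {
        "version": "4",
        "status": "firing",
        "alerts": [
            {"status": "firing", "labels": {"alertname": alertname, "instance": instance}}
        ],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_test_webhook("http://localhost:5001/", "MySQLDown", "10.0.0.5:9104")
print(req.get_method(), req.full_url)  # POST http://localhost:5001/
```

Sending the request with `urllib.request.urlopen(req)` would presumably make the restarter attempt a real restart on the named instance, so it should only be done against a test host.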