031. Prometheus - kimdonggwan337/dongdong GitHub Wiki
- Node Exporter: collects hardware- and OS-level metrics from physical or virtual machines (nodes), for example CPU, memory, disk, network, temperature, and kernel statistics.
- Process Exporter: tracks the state and resource usage of specific processes (applications), for example per-process CPU/memory usage, the number of running processes, and detection of abnormal termination.
- Prometheus: periodically sends pull requests to the Node Exporter and Process Exporter servers and to applications, and collects their metrics.
- Grafana: a tool that visualizes the collected information.
- Prometheus Version: 3.1 / OS: Rocky Linux 8.9
- Node Exporter1 Version: 1.8.2 / OS: SUSE Linux Enterprise Server 15 SP5
- Node Exporter2 Version: 1.8.2 / OS: Red Hat Enterprise Linux 8
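Prometheus pulls metrics from each exporter over HTTP, and the exporters serve them as plain text at `/metrics`. The sketch below shows roughly what one scrape has to parse. It is a simplified illustration, not how Prometheus itself is implemented: the function name `parse_metrics` and the `SAMPLE` text are hypothetical, and sample lines with a trailing timestamp are not handled.

```python
def parse_metrics(text: str) -> dict[str, float]:
    """Parse text exposition lines into {series: value} (simplified sketch)."""
    samples: dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # Assumption: the value is the last space-separated field (no timestamp).
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


# Hand-written sample in the style of Node Exporter output (not captured data).
SAMPLE = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.25
node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5
"""

if __name__ == "__main__":
    for series, value in parse_metrics(SAMPLE).items():
        print(series, value)
```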
# mkdir -p /opt/monitoring/prometheus  # /opt holds install files for third-party programs
# cd /opt/monitoring/prometheus
# wget https://github.com/prometheus/prometheus/releases/download/v3.1.0/prometheus-3.1.0.linux-amd64.tar.gz
# tar zxvf prometheus-3.1.0.linux-amd64.tar.gz
# cd prometheus-3.1.0.linux-amd64
## Check the config for syntax errors
# ./promtool check config ./prometheus.yml
Checking ./prometheus.yml
SUCCESS: ./prometheus.yml is valid prometheus config file syntax
## Create the Prometheus user
# useradd --no-create-home --shell /bin/false prometheus
# cat /etc/passwd | grep "prometheus"
prometheus:x:1001:1001::/home/prometheus:/bin/false
## Create the directories Prometheus uses and copy the binaries
# mkdir /var/lib/prometheus  # /var/lib holds variable data such as data files, logs, and temporary files
# mkdir /etc/prometheus  # /etc holds files needed at boot and shutdown, system-wide configuration files, and init scripts
# mkdir /usr/bin/prometheus
# mkdir /opt/monitoring/prometheus/prometheus-3.1.0.linux-amd64/consoles
# mkdir /opt/monitoring/prometheus/prometheus-3.1.0.linux-amd64/console_libraries
# cp ./prometheus /usr/bin/prometheus
# cp ./promtool /usr/bin/prometheus
# cp ./prometheus.yml /etc/prometheus/
# cp -r ./consoles /etc/prometheus
# cp -r ./console_libraries /etc/prometheus
## Set directory ownership (after the directories exist)
# cd /opt
# chown -R prometheus:prometheus monitoring
# cd /var/lib/
# chown -R prometheus:prometheus prometheus
# cd /etc
# chown -R prometheus:prometheus prometheus
# cd /usr/bin
# chown -R prometheus:prometheus prometheus
## Register the prometheus.service file
# cd /etc/systemd/system
# vi prometheus.service
[Unit]
Description=Monitoring system and time series database
After=network-online.target
[Service]
User=prometheus
Type=simple
ExecStart=/usr/bin/prometheus/prometheus \
--config.file=/etc/prometheus/prometheus.yml \
--storage.tsdb.path=/var/lib/prometheus \
--web.console.templates=/etc/prometheus/consoles \
--web.console.libraries=/etc/prometheus/console_libraries
[Install]
WantedBy=multi-user.target
# systemctl daemon-reload
# systemctl start prometheus
# ss -lntu
tcp LISTEN 0 2048 *:9090
# systemctl status prometheus
prometheus.service - Prometheus Server
Loaded: loaded (/etc/systemd/system/prometheus.service; disabled; vendor preset: disabled)
Active: active (running) since Wed 2025-02-12 07:42:12 UTC; 4min 28s ago
Main PID: 18690 (prometheus)
Tasks: 7 (limit: 22938)
Memory: 28.0M
CGroup: /system.slice/prometheus.service
└─18690 /usr/bin/prometheus/prometheus --config.file /etc/prometheus/prometheus.yml --storage.tsdb.path /var/lib/prometheus --web.console.templates=/etc/prometheus/consoles --web.console.li>
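Once the service is listening on port 9090, the server can also be checked over HTTP through Prometheus's `/-/healthy` management endpoint. The following is a minimal sketch using only the standard library; the helper name `is_healthy` and the 3-second timeout are assumptions made for illustration.

```python
import urllib.error
import urllib.request


def is_healthy(base_url: str, timeout: float = 3.0) -> bool:
    """Return True if GET <base_url>/-/healthy answers with HTTP 200."""
    url = base_url.rstrip("/") + "/-/healthy"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status all count as unhealthy.
        return False
```

On the Prometheus host, `is_healthy("http://localhost:9090")` should return `True` while the service is running.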
# mkdir /opt/monitoring/node_exporter
# cd /opt/monitoring/node_exporter
# wget https://github.com/prometheus/node_exporter/releases/download/v1.8.2/node_exporter-1.8.2.linux-amd64.tar.gz
# tar xvfz node_exporter-1.8.2.linux-amd64.tar.gz
# cd node_exporter-1.8.2.linux-amd64
# cp ./node_exporter /usr/local/bin/
# useradd --no-create-home --shell /bin/false node_exporter
# vi /etc/systemd/system/node_exporter.service
[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target
[Service]
User=node_exporter
Environment="NODE_EXPORTER_ARGS=--web.listen-address=:9100"
ExecStart=/usr/local/bin/node_exporter $NODE_EXPORTER_ARGS
[Install]
WantedBy=multi-user.target
# systemctl daemon-reload
# systemctl start node_exporter
### [Prometheus.yml]
# vi /etc/prometheus/prometheus.yml
------------------------------- (omitted) -------------------------------
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["20.41.83.18:9090"]
  - job_name: 'Suse_node-exporter1'
    scrape_interval: 5s
    static_configs:
      - targets: ["52.231.10.251:9100"]
  - job_name: 'Redhat_node-exporter2'
    scrape_interval: 5s
    static_configs:
      - targets: ["4.218.17.32:9100"]
---------------------------------------------------------------------
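After Prometheus reloads this configuration, the state of each scrape target can be read from the HTTP API at `/api/v1/targets`. The sketch below lists targets that are not reporting `up`. The function name `unhealthy_targets` is hypothetical, and `SAMPLE_RESPONSE` is a hand-written, trimmed example in the shape of that API's response, not output captured from this setup.

```python
import json


def unhealthy_targets(targets_json: str) -> list[tuple[str, str]]:
    """Return (job, scrapeUrl) pairs for active targets whose health is not 'up'."""
    body = json.loads(targets_json)
    active = body.get("data", {}).get("activeTargets", [])
    return [
        (t.get("labels", {}).get("job", "?"), t.get("scrapeUrl", "?"))
        for t in active
        if t.get("health") != "up"
    ]


# Hand-written sample (assumption): only the fields used above are included.
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "activeTargets": [
            {"labels": {"job": "Suse_node-exporter1"},
             "scrapeUrl": "http://52.231.10.251:9100/metrics",
             "health": "up"},
            {"labels": {"job": "Redhat_node-exporter2"},
             "scrapeUrl": "http://4.218.17.32:9100/metrics",
             "health": "down"},
        ]
    },
})

if __name__ == "__main__":
    for job, url in unhealthy_targets(SAMPLE_RESPONSE):
        print(f"{job}: {url} is not up")
```

In practice the JSON would come from a request to `http://<prometheus-host>:9090/api/v1/targets` instead of the sample string.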
# mkdir /opt/monitoring/process_exporter/
# cd /opt/monitoring/process_exporter/
# wget https://github.com/ncabatoff/process-exporter/releases/download/v0.8.4/process-exporter-0.8.4.linux-amd64.tar.gz
# tar xvfz process-exporter-0.8.4.linux-amd64.tar.gz
# cd process-exporter-0.8.4.linux-amd64
# cp ./process-exporter /usr/local/bin/
### [process exporter config.yml]
# mkdir /etc/process-exporter
# vi /etc/process-exporter/config.yml
process_names:
  - name: "{{.Matches}}"
    cmdline:
      - 'nginx'  # track Nginx processes
  - name: "{{.Matches}}"
    cmdline:
      - 'sshd'  # track SSH processes
# vi /etc/systemd/system/process-exporter.service
[Unit]
Description=Process Exporter
After=network.target
[Service]
User=process-exporter
ExecStart=/usr/local/bin/process-exporter -config.path /etc/process-exporter/config.yml
Restart=always
[Install]
WantedBy=multi-user.target
# useradd --no-create-home --shell /bin/false process-exporter
# chown -R process-exporter:process-exporter /etc/process-exporter
# chmod 640 /etc/process-exporter/config.yml
# systemctl daemon-reload
# systemctl start process-exporter
### [prometheus.yml]
  - job_name: 'Suse_process-exporter1'
    scrape_interval: 5s
    static_configs:
      - targets: ["52.231.10.251:9256"]
  - job_name: 'Redhat_process-exporter2'
    scrape_interval: 5s
    static_configs:
      - targets: ["4.218.17.32:9256"]
# firewall-cmd --permanent --add-port=9100/tcp
# firewall-cmd --permanent --add-port=9256/tcp
# firewall-cmd --reload
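To confirm from the Prometheus server that an exporter port is reachable through the firewall, a plain TCP connection attempt is enough. This is a minimal sketch; the helper name `port_open` and the 3-second timeout are assumptions for illustration.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, filtered (timed out), or unreachable.
        return False
```

For example, `port_open("52.231.10.251", 9100)` checks the Node Exporter port on the SUSE host configured above.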
import time
import requests
import os

# Slack webhook URL (paste the one issued by Slack)
SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/XXXX/YYYY/ZZZZ"
# Path to the Prometheus log file
PROMETHEUS_LOG = "/var/log/prometheus/prometheus.log"


def send_slack_alert(message: str):
    """Send an alert to Slack."""
    payload = {
        "text": f":rotating_light: Prometheus Error Detected!\n```\n{message}\n```"
    }
    try:
        # A timeout keeps the monitor loop from hanging if Slack is unreachable.
        response = requests.post(SLACK_WEBHOOK_URL, json=payload, timeout=10)
        if response.status_code != 200:
            print(f"Slack send failed: {response.text}")
    except Exception as e:
        print(f"Error during Slack request: {e}")


def monitor_log():
    """Monitor the log file (like tail -f)."""
    # Wait until the log file exists
    while not os.path.exists(PROMETHEUS_LOG):
        print("Log file not found. Waiting...")
        time.sleep(5)
    with open(PROMETHEUS_LOG, "r") as f:
        # Ignore existing log lines and watch only new ones
        f.seek(0, os.SEEK_END)
        while True:
            line = f.readline()
            if not line:
                time.sleep(1)
                continue
            # Keep only log lines that contain "error"
            if "error" in line.lower():
                print(f"[ERROR DETECTED] {line.strip()}")
                send_slack_alert(line.strip())


if __name__ == "__main__":
    monitor_log()
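The script above alerts on any line containing the substring "error", which also matches informational messages that merely mention the word. One possible refinement, sketched below, is to look at the log level field instead. It assumes the log lines carry a `level=<name>` key-value pair; the helper name `is_error_line` and the sample lines in the usage note are hypothetical.

```python
import re

# Assumption: each log line contains a "level=<name>" field.
LEVEL_RE = re.compile(r"\blevel=(\w+)", re.IGNORECASE)


def is_error_line(line: str) -> bool:
    """Return True only when the line's level field is 'error' (case-insensitive)."""
    match = LEVEL_RE.search(line)
    return match is not None and match.group(1).lower() == "error"
```

In `monitor_log()`, the condition `if "error" in line.lower():` could then be replaced with `if is_error_line(line):`, so that a line such as `level=INFO msg="error budget ok"` no longer triggers a Slack alert.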