Ticket 321 ‐ NRPE Log Monitoring Implementation - SupaHotBall/OE2-Group-D GitHub Wiki

Task

Implementation Tasks

  • Identify critical log patterns requiring alerts
  • Document normal vs. abnormal log volumes
  • Establish severity thresholds (Warning: 3+ matches, Critical: 10+ matches)
  • Use efficient grep pattern matching
  • Ensure the script runs as the nagios user

Validation Procedures

  • Test Cases:
      • Force test errors: logger "TEST ERROR MESSAGE"
      • Verify Nagios alert transitions (OK → WARNING → CRITICAL)

Acceptance Criteria

  • ✔ Custom script detects all specified log patterns
  • ✔ NRPE responds to checks within 3-10 seconds
  • ✔ Nagios generates alerts within 5 minutes of threshold breach
  • ✔ Uses efficient grep pattern matching
  • ✔ Documentation includes:
      • Script parameters and thresholds

Documentation & Deliverables:

  • Clear documentation with evidence of before and after logs
  • Script and a clear description of how the script works
  • Evidence of log/syslog and the alert generated in Slack


Steps Taken

Normal vs Abnormal Log Volumes

  • Normal Log Volume: < 3 matching entries (No issue, everything normal)
  • Abnormal Volume (Warning): 3 to 9 entries matching ERROR/CRITICAL/WARNING
  • Abnormal Volume (Critical): 10 or more entries matching ERROR/CRITICAL/WARNING

image
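The volume bands above reduce to two comparisons. The following is a minimal sketch of that mapping, assuming the thresholds of 3 and 10 from this page; `classify_count` is a hypothetical helper name used only for illustration and is not part of the deployed script.

```shell
#!/bin/bash
# Map a match count to a Nagios state name and return code.
# classify_count is a hypothetical helper, shown only to illustrate the bands.
classify_count() {
  local count="$1"
  local warn="${2:-3}"    # warning threshold, defaults to 3
  local crit="${3:-10}"   # critical threshold, defaults to 10

  if [ "$count" -ge "$crit" ]; then
    echo "CRITICAL"
    return 2
  elif [ "$count" -ge "$warn" ]; then
    echo "WARNING"
    return 1
  fi
  echo "OK"
  return 0
}

# Print the state for values on either side of each threshold.
for n in 0 2 3 9 10; do
  printf '%s matches -> %s\n' "$n" "$(classify_count "$n")"
done
```

Checking the critical threshold first matters: a count of 10 also satisfies the warning comparison, so the order of the tests decides which state is reported.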

Step 1: Create the custom log check script

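sudo nano /usr/lib/nagios/plugins/check_syslog_alerts.sh

Paste the following into the file:

#!/bin/bash
# Count syslog lines matching the alert patterns and map the count to a Nagios state.
LOG_FILE="/var/log/syslog"
PATTERNS="ERROR|CRITICAL|WARNING"

# grep -c counts matching lines directly, so no separate wc process is needed.
COUNT=$(grep -cE "$PATTERNS" "$LOG_FILE")

# Test the higher threshold first so that 10+ matches report CRITICAL, not WARNING.
if [ "$COUNT" -ge 10 ]; then
  echo "CRITICAL - $COUNT matching log entries found"
  exit 2
elif [ "$COUNT" -ge 3 ]; then
  echo "WARNING - $COUNT matching log entries found"
  exit 1
else
  echo "OK - $COUNT matching log entries found"
  exit 0
fi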

Parameter Explanation:

  • LOG_FILE: Target log file to scan

  • PATTERNS: Regex match for ERROR|CRITICAL|WARNING

  • COUNT: Number of matching lines

  • Exit codes: 0 = OK, 1 = WARNING, 2 = CRITICAL

image
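The parameters above are hard-coded in the script. As a sketch of how they could be exposed on the command line, the function below accepts them as options. The function name `check_log_alerts` and the `-f`, `-p`, `-w`, and `-c` options are assumptions made for illustration; they are not part of the script deployed in this ticket. It also returns 3, the Nagios UNKNOWN state, when the log cannot be read.

```shell
#!/bin/bash
# Sketch of a parameterised check. The function name and the -f/-p/-w/-c
# options are assumptions for illustration, not the deployed script.
check_log_alerts() {
  local log_file="/var/log/syslog"
  local patterns="ERROR|CRITICAL|WARNING"
  local warn=3 crit=10
  local opt OPTIND=1

  while getopts "f:p:w:c:" opt; do
    case "$opt" in
      f) log_file="$OPTARG" ;;
      p) patterns="$OPTARG" ;;
      w) warn="$OPTARG" ;;
      c) crit="$OPTARG" ;;
      *) echo "UNKNOWN - usage: [-f file] [-p regex] [-w warn] [-c crit]"
         return 3 ;;
    esac
  done

  # Return code 3 is the Nagios UNKNOWN state; use it when the log is unreadable.
  if [ ! -r "$log_file" ]; then
    echo "UNKNOWN - cannot read $log_file"
    return 3
  fi

  local count rc=0
  count=$(grep -cE -- "$patterns" "$log_file") || rc=$?
  # grep exits 1 when nothing matches; 2 or higher signals a real error.
  if [ "$rc" -ge 2 ]; then
    echo "UNKNOWN - grep failed on $log_file"
    return 3
  fi

  if [ "$count" -ge "$crit" ]; then
    echo "CRITICAL - $count matching log entries found"
    return 2
  elif [ "$count" -ge "$warn" ]; then
    echo "WARNING - $count matching log entries found"
    return 1
  fi
  echo "OK - $count matching log entries found"
  return 0
}
```

To use this as a standalone plugin, the file would end with `check_log_alerts "$@"; exit $?`, so that the function's return code becomes the plugin's exit code. Called with no options, it behaves like the fixed-threshold script.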

Step 2: Make the script executable

sudo chmod +x /usr/lib/nagios/plugins/check_syslog_alerts.sh

Step 3: Test script manually

sudo /usr/lib/nagios/plugins/check_syslog_alerts.sh

image
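Running the check with sudo shows that the script works as root, but NRPE runs it as the nagios user. On many Debian-family systems /var/log/syslog is readable only by root and the adm group, so it may be worth confirming read access separately. The helper below is a hypothetical sketch (`can_read_log` is not an existing tool) that reports whether the invoking user can read a given file.

```shell
#!/bin/bash
# Report whether the current user can read a log file.
# can_read_log is a hypothetical helper for illustration.
can_read_log() {
  local log_file="$1"
  if [ -r "$log_file" ]; then
    echo "$(id -un) can read $log_file"
    return 0
  fi
  echo "$(id -un) cannot read $log_file"
  return 1
}
```

One way to run it as the nagios user, assuming bash and sudo are available on the host, is `sudo -u nagios bash -c "$(declare -f can_read_log); can_read_log /var/log/syslog"`. Running the plugin itself with `sudo -u nagios` is a simpler alternative that exercises the same permission.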

Step 4: Simulate a log entry

logger "ERROR: TEST ERROR MESSAGE"
sudo /usr/lib/nagios/plugins/check_syslog_alerts.sh

image

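Expected output, assuming the log contains no other matching entries (a single match is below the warning threshold of 3, so the state remains OK):

OK - 1 matching log entries found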
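A single test message is not enough to exercise the WARNING and CRITICAL states. The sketch below replays the full OK → WARNING → CRITICAL sequence against a scratch file, so the real syslog is left untouched. `count_state` is a hypothetical helper that repeats the same grep and thresholds as the check script, and the log line format is made up for the test.

```shell
#!/bin/bash
# Replay the OK -> WARNING -> CRITICAL transitions against a scratch file.
# count_state is a hypothetical helper mirroring the check script's logic.
count_state() {
  local count
  # grep exits 1 when nothing matches, which is not an error here.
  count=$(grep -cE "ERROR|CRITICAL|WARNING" "$1") || true
  if [ "$count" -ge 10 ]; then
    echo "CRITICAL - $count"
  elif [ "$count" -ge 3 ]; then
    echo "WARNING - $count"
  else
    echo "OK - $count"
  fi
}

scratch_log=$(mktemp)
for i in $(seq 1 10); do
  echo "host test[$i]: ERROR: TEST ERROR MESSAGE $i" >> "$scratch_log"
  state=$(count_state "$scratch_log")
  # Report the state at 1, 3, and 10 entries, the points where it changes.
  case "$i" in
    1|3|10) echo "after $i entries: $state" ;;
  esac
done
rm -f "$scratch_log"
```

On the monitored host, the same loop with `logger "ERROR: TEST ERROR MESSAGE $i"` in place of the `echo` line would write to the real syslog. The counts reported there also include any matching entries already in the file.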

Step 5: Open the NRPE config file (on mgmt server):

sudo nano /etc/puppetlabs/code/modules/nagios_nrpe/files/nrpe.cfg

Add this line at the bottom of the file:

command[check_syslog_alerts]=/usr/lib/nagios/plugins/check_syslog_alerts.sh

image

Then apply:

sudo puppet agent --test

Step 6: Remote NRPE Test (on mgmt server)

/usr/lib/nagios/plugins/check_nrpe -H db-d -c check_syslog_alerts

image
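Nagios decides the service state from the exit code of the check, not from the text it prints, so it can help to see both when testing by hand. The wrapper below is a hypothetical sketch (`run_check` is not part of NRPE or Nagios) that runs any check command and labels its exit code with the matching state name.

```shell
#!/bin/bash
# Run a Nagios-style check command and label its exit code with a state name.
# run_check is a hypothetical wrapper, not part of NRPE or Nagios.
run_check() {
  local output state rc=0
  output=$("$@" 2>&1) || rc=$?
  case "$rc" in
    0) state="OK" ;;
    1) state="WARNING" ;;
    2) state="CRITICAL" ;;
    *) state="UNKNOWN" ;;
  esac
  echo "[$state, exit $rc] $output"
  return "$rc"
}
```

For example, `run_check /usr/lib/nagios/plugins/check_nrpe -H db-d -c check_syslog_alerts` would prefix the remote check's output with its state. Any exit code other than 0, 1, or 2 is labelled UNKNOWN, which is the state Nagios reports for the "Command not defined" response.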

Step 7: Define service in Nagios config (on mgmt)

sudo nano /etc/puppetlabs/code/modules/nagios/manifests/config.pp

Add:

nagios_service { "syslog-alerts":
  service_description   => "Syslog Log Alerts",
  host_name             => "db-d",
  check_command         => "check_nrpe!check_syslog_alerts",
  max_check_attempts    => 3,
  retry_interval        => 1,
  check_interval        => 5,
  check_period          => "24x7",
  notification_interval => 30,
  notification_period   => "24x7",
  notification_options  => "w,u,c,r",
  contact_groups        => "slackgroup",
  target                => "/etc/nagios4/conf.d/ppt_services.cfg",
  mode                  => "0644",
}

image

Apply config:

sudo puppet agent --test

Step 8: Restart Nagios (on mgmt server)

sudo systemctl restart nagios4

Then you should be able to see this:

image

🚀 Validation Results

✅ Nagios Service View:

  • Status: OK
  • Last Check: Shows successful script execution
  • State Info: OK - X matching log entries found

💬 Slack Alert Transitions:

10:01 AM - UNKNOWN: NRPE: Command not defined
10:06 AM - OK: Syslog Log Alerts is OK
10:33 AM - UNKNOWN: Command not defined (before fix)
11:03 AM - UNKNOWN: Command not defined (before fix)

✉️ Final Slack Message:

db-d/Syslog Log Alerts is OK

image


Challenges


External Resources


Ticket Reference

https://rt.dataraster.com/Ticket/Display.html?id=321