Ticket #929: Create a Nagios Check to Monitor SSH Logs
Ticket: #929
Create script for SSH failed logins on all 4 servers
- Run this command on each of the 4 servers to create the script:
sudo vim /usr/lib/nagios/plugins/check_ssh_logins.sh
- Add the following to the script:
#!/bin/bash
# Define variables
LOG_FILE="/var/log/auth.log"
THRESHOLD=5
TIME_PERIOD="10 minutes"
# Get the current time and time period ago in seconds since epoch
CURRENT_TIME=$(date +%s)
TIME_PERIOD_AGO=$(date --date="$TIME_PERIOD ago" +%s)
# Count failed login attempts in the specified time period
FAILED_ATTEMPTS=$(sudo awk -v current_time="$CURRENT_TIME" -v time_period_ago="$TIME_PERIOD_AGO" '
BEGIN {
    count = 0
    # Syslog timestamps look like "Jan 12 10:15:32" and carry no year, so assume the current year
    year = strftime("%Y", current_time)
    split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
    for (i = 1; i <= 12; i++) month[names[i]] = i
}
/Failed password for/ {
    # mktime() expects "YYYY MM DD HH MM SS", so convert the month name and split the time
    split($3, hms, ":")
    log_time = mktime(year " " month[$1] " " $2 " " hms[1] " " hms[2] " " hms[3])
    # Count only if the log time is within the specified period
    if (log_time >= time_period_ago && log_time <= current_time) {
        count++
    }
}
END {
    print count
}' "$LOG_FILE")
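# Debug output (optional); sent to stderr so the status line stays the first line of stdout that Nagios reads
echo "Current Time: $(date -d "@$CURRENT_TIME")" >&2
echo "Time Period Ago: $(date -d "@$TIME_PERIOD_AGO")" >&2
echo "Failed Attempts: $FAILED_ATTEMPTS" >&2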
# Check if failed attempts exceed the threshold or if there are no failed attempts
if [[ ! "$FAILED_ATTEMPTS" =~ ^[0-9]+$ ]]; then
    echo "UNKNOWN: Unable to determine failed login attempts"
    exit 3
elif [ "$FAILED_ATTEMPTS" -ge "$THRESHOLD" ]; then
    echo "CRITICAL: $FAILED_ATTEMPTS failed login attempts in the last $TIME_PERIOD"
    exit 2
elif [ "$FAILED_ATTEMPTS" -eq 0 ]; then
    echo "OK: No failed login attempts in the last $TIME_PERIOD"
    exit 0
else
    echo "WARNING: $FAILED_ATTEMPTS failed login attempts in the last $TIME_PERIOD"
    exit 1
fi
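The status mapping at the end of the script can be tried without touching auth.log. The following is a minimal sketch, not part of the deployed plugin: the helper name `state_for_count` and its "STATE CODE" output format are hypothetical, and the numeric check is written with `case` as a portable stand-in for the script's regex test.

```shell
#!/bin/bash
# Hypothetical helper: maps a failed-login count to a Nagios state name and exit code,
# following the same order of checks as check_ssh_logins.sh.
state_for_count() {
    local count="$1" threshold="$2"
    case "$count" in
        ''|*[!0-9]*)
            # Empty or non-numeric input: the count could not be determined
            echo "UNKNOWN 3"
            return
            ;;
    esac
    if [ "$count" -ge "$threshold" ]; then
        echo "CRITICAL 2"
    elif [ "$count" -eq 0 ]; then
        echo "OK 0"
    else
        echo "WARNING 1"
    fi
}

# Try a few sample counts against the script's threshold of 5
for sample in 0 3 5 ""; do
    echo "count='$sample' -> $(state_for_count "$sample" 5)"
done
```

With a threshold of 5, any count from 1 to 4 is a WARNING, so a single mistyped password is enough to leave the OK state.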
To avoid getting an UNKNOWN status in your Nagios server:
- Open the sudoers configuration for editing:
sudo visudo -f /etc/sudoers.d/nagios
- Add the following line to allow the nagios user to run the required commands without a password:
nagios ALL=(ALL) NOPASSWD: /usr/bin/awk, /bin/date, /usr/lib/nagios/plugins/check_ssh_logins.sh
- To test: after setting up the sudoers configuration, run the script as the nagios user:
sudo -u nagios /usr/lib/nagios/plugins/check_ssh_logins.sh
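Besides running the plugin against the live log, the time-window counting can be tried on a throwaway log file. This is a rough sketch under stated assumptions: the helper `count_recent_failures` is hypothetical, it uses GNU `date` rather than the plugin's awk code to parse timestamps, and it assumes the traditional syslog format "Mon DD HH:MM:SS".

```shell
#!/bin/bash
# Fixed locale and time zone keep the sketch deterministic;
# real auth.log entries use the server's local time zone.
export LC_ALL=C TZ=UTC

# Hypothetical helper: counts "Failed password for" lines within the given period.
count_recent_failures() {
    local log_file="$1" period="$2"
    local now cutoff count=0 mon day time rest ts
    now=$(date +%s)
    cutoff=$(date --date="$period ago" +%s)
    while read -r mon day time rest; do
        case "$rest" in
            *"Failed password for"*) ;;
            *) continue ;;
        esac
        # GNU date assumes the current year when the timestamp has none
        ts=$(date --date="$mon $day $time" +%s 2>/dev/null) || continue
        if [ "$ts" -ge "$cutoff" ] && [ "$ts" -le "$now" ]; then
            count=$((count + 1))
        fi
    done < "$log_file"
    echo "$count"
}

# Sample log: two recent failures, one old failure, one unrelated line
sample_log=$(mktemp)
{
    echo "$(date '+%b %e %H:%M:%S') host sshd[1]: Failed password for root from 192.0.2.1 port 22 ssh2"
    echo "$(date --date='1 minute ago' '+%b %e %H:%M:%S') host sshd[2]: Failed password for admin from 192.0.2.1 port 22 ssh2"
    echo "$(date --date='2 hours ago' '+%b %e %H:%M:%S') host sshd[3]: Failed password for root from 192.0.2.1 port 22 ssh2"
    echo "$(date '+%b %e %H:%M:%S') host sshd[4]: Accepted password for deploy from 192.0.2.2 port 22 ssh2"
} > "$sample_log"

count_recent_failures "$sample_log" "10 minutes"
rm -f "$sample_log"
```

Only the two failures inside the 10-minute window should be counted; the two-hour-old failure and the accepted login are ignored.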
Apply to all the nrpe.cfg files
In mgmt, db, app, and backup:
- sudo vim /etc/nagios/nrpe.cfg
- Add this line alongside your other NRPE commands:
command[check_ssh_logins]=/usr/lib/nagios/plugins/check_ssh_logins.sh
- On the mgmt server, also add the line above to /etc/puppetlabs/code/modules/nrpe/files/nrpe.cfg
Note: If your Nagios check comes up UNKNOWN with a message along the lines of "check_ssh_logins is not defined", double-check the nrpe.cfg in the nrpe Puppet module (no, not the one inside /etc/nagios) and make sure the check command is defined there; otherwise the check won't work properly.
- Apply the changes:
sudo /opt/puppetlabs/puppet/bin/puppet agent --test
or our alias test_puppet_agent
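A quick way to catch the "not defined" problem described in the note is to look for the command definition in both copies of nrpe.cfg. This is a small sketch; the helper name `has_nrpe_command` is hypothetical, and on servers other than mgmt the Puppet module path will simply be reported as missing.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the given nrpe.cfg defines the named command.
has_nrpe_command() {
    local cfg="$1" name="$2"
    grep -q "^command\[$name]=" "$cfg" 2>/dev/null
}

# Check the live config and the copy inside the nrpe Puppet module
for cfg in /etc/nagios/nrpe.cfg /etc/puppetlabs/code/modules/nrpe/files/nrpe.cfg; do
    if has_nrpe_command "$cfg" check_ssh_logins; then
        echo "defined: $cfg"
    else
        echo "missing or unreadable: $cfg"
    fi
done
```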
Make Nagios check inside puppet module
- Open the nagios Puppet config:
sudo vim /etc/puppetlabs/code/modules/nagios/manifests/config.pp
- Add a new Nagios hostgroup for failed SSH logins:
nagios_hostgroup { "Check-SSH-Logins":
  target  => "/etc/nagios3/conf.d/ppt_hostgroups.cfg",
  mode    => "0444",
  alias   => "Check failed ssh logins",
  members => "db-c, mgmt-c, app-c, backup-c",
}
- Add a new NRPE service check:
nagios_service { "ssh_failed_logins":
  service_description   => "SSH Failed Logins",
  hostgroup_name        => "Check-SSH-Logins",
  target                => "/etc/nagios3/conf.d/ppt_services.cfg",
  check_command         => "check_nrpe!check_ssh_logins",
  max_check_attempts    => 3,
  retry_check_interval  => 1,
  normal_check_interval => 5,
  check_period          => "24x7",
  notification_interval => 30,
  notification_period   => "24x7",
  notification_options  => "w,u,c,r",
  contact_groups        => "slackgroup",
  mode                  => "0444",
}
- Apply the changes:
sudo /opt/puppetlabs/puppet/bin/puppet agent --test
or the alias test_puppet_agent
- Restart Nagios:
sudo systemctl restart nagios3
(we have an alias for this called restart_nagios)
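Before relying on the web interface, each host can be queried by hand from the Nagios server with check_nrpe. The sketch below only prints the commands rather than running them. It assumes check_nrpe is installed at the usual Debian/Ubuntu path and that the host names match the hostgroup members above; the helper `nrpe_check_commands` is hypothetical.

```shell
#!/bin/bash
# Assumptions: check_nrpe lives at the Debian/Ubuntu default path, and the host
# names are the hostgroup members defined in config.pp.
NRPE_BIN="/usr/lib/nagios/plugins/check_nrpe"
HOSTS="db-c mgmt-c app-c backup-c"

# Hypothetical helper: prints one check_nrpe invocation per host without running it.
nrpe_check_commands() {
    local host
    for host in $HOSTS; do
        echo "$NRPE_BIN -H $host -c check_ssh_logins"
    done
}

nrpe_check_commands
```

Running one of the printed commands on the Nagios server should return the same OK, WARNING, or CRITICAL line that the plugin prints locally.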
Your output in Nagios should look like this: