Nagios Checks with NRPE - KeegMitch/Operations-Engineering-group-c GitHub Wiki
Issues: see our Issues page for the problems we encountered during the lab.
Create a new Puppet module
Name the module nrpe and use this structure:
modules/nrpe/files
modules/nrpe/templates
modules/nrpe/manifests
modules/nrpe/manifests/install.pp
modules/nrpe/manifests/service.pp
modules/nrpe/manifests/config.pp
modules/nrpe/manifests/init.pp
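The structure lists init.pp, but this page does not show its contents. A minimal sketch, assuming the file only pulls in the three classes defined in the sections below:

```puppet
# modules/nrpe/manifests/init.pp
# Assumed contents: the wiki does not show this file.
class nrpe {
  include nrpe::install
  include nrpe::config
  include nrpe::service
}
```

With this in place, a node only needs `include nrpe` to get the package, the config file, and the running service.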
Install NRPE server using install.pp
Code for install.pp:
class nrpe::install {
  package { "nagios-nrpe-server":
    ensure => present,
  }
  user { "nagios":
    ensure => present,
  }
  group { "nagios":
    ensure => present,
  }
}
Do not worry about the nagios-nrpe-plugin package yet; it is added later in the lab.
Create service.pp to keep the NRPE service running
Code for service.pp:
class nrpe::service {
  service { "nagios-nrpe-server":
    ensure     => running,
    hasstatus  => true,
    hasrestart => true,
    enable     => true,
    require    => Class["nrpe::config"],
  }
}
NRPE server config
Code for config.pp:
class nrpe::config {
  file { "/etc/nagios/nrpe.cfg":
    ensure  => present,
    source  => "puppet:///modules/nrpe/nrpe.cfg",
    mode    => "0644",
    owner   => "root",
    group   => "root",
    require => Class["nrpe::install"],
    notify  => Class["nrpe::service"],
  }
}
Edit the allowed_hosts setting in nrpe.cfg on the db server:
sudo nano /etc/nagios/nrpe.cfg
Testing:
Nagios server config
Add the nagios-nrpe-plugin package to install.pp in the Nagios module (not nrpe):
class nagios::install {
  package { "nagios3":
    ensure => present,
  }
  package { "apache2-utils":
    ensure => present,
  }
  package { "nagios-nrpe-plugin":
    ensure => present,
  }
  user { "nagios":
    ensure => present,
  }
  group { "nagios":
    ensure => present,
  }
}
Hostgroup code for config.pp in the Nagios module (not nrpe):
# nagios hostgroups
nagios_hostgroup { "my-ssh-servers":
  target  => "/etc/nagios3/conf.d/ppt_hostgroups.cfg",
  mode    => "0444",
  alias   => "My SSH servers",
  members => "db-c, app-c, backup-c, mgmt-c",
}
nagios_hostgroup { "my-mariaDB-servers":
  target  => "/etc/nagios3/conf.d/ppt_hostgroups.cfg",
  mode    => "0444",
  alias   => "My mariaDB servers",
  members => "db-c",
}
# new hostgroup for remote disks (edit to add all servers)
nagios_hostgroup { "Remote-Disks":
  target  => "/etc/nagios3/conf.d/ppt_hostgroups.cfg",
  mode    => "0444",
  alias   => "Remote Disks",
  members => "db-c, app-c, backup-c, mgmt-c",
}
nagios_hostgroup { "Remote-Procs":
  target  => "/etc/nagios3/conf.d/ppt_hostgroups.cfg",
  mode    => "0444",
  alias   => "Remote procs",
  members => "db-c, app-c, backup-c, mgmt-c",
}
Service code for config.pp in the Nagios module (not nrpe):
# nagios services
nagios_service { "ssh":
  service_description   => "ssh servers",
  hostgroup_name        => "my-ssh-servers",
  target                => "/etc/nagios3/conf.d/ppt_services.cfg",
  check_command         => "check_ssh",
  max_check_attempts    => 3,
  retry_check_interval  => 1,
  normal_check_interval => 5,
  check_period          => "24x7",
  notification_interval => 30,
  notification_period   => "24x7",
  notification_options  => "w,u,c",
  contact_groups        => "admins",
  mode                  => "0444",
}
nagios_service { "mariaDB":
  service_description   => "mariaDB servers",
  hostgroup_name        => "my-mariaDB-servers",
  target                => "/etc/nagios3/conf.d/ppt_services.cfg",
  check_command         => "check_mysql_cmdlinecred!nagios!mypasswd",
  max_check_attempts    => 3,
  retry_check_interval  => 1,
  normal_check_interval => 5,
  check_period          => "24x7",
  notification_interval => 30,
  notification_period   => "24x7",
  notification_options  => "w,u,c",
  contact_groups        => "admins",
  mode                  => "0444",
}
# Adding new service for disk check via NRPE
nagios_service { "root_disk_check":
  service_description   => "Root Disk Space",
  hostgroup_name        => "Remote-Disks",
  target                => "/etc/nagios3/conf.d/ppt_services.cfg",
  check_command         => "check_nrpe!check_sda1",
  max_check_attempts    => 3,
  retry_check_interval  => 1,
  normal_check_interval => 5,
  check_period          => "24x7",
  notification_interval => 30,
  notification_period   => "24x7",
  notification_options  => "w,u,c",
  contact_groups        => "admins",
  mode                  => "0444",
}
nagios_service { "total_procs_check":
  service_description   => "Total Processes",
  hostgroup_name        => "Remote-Procs",
  target                => "/etc/nagios3/conf.d/ppt_services.cfg",
  check_command         => "check_nrpe!check_total_procs",
  max_check_attempts    => 3,
  retry_check_interval  => 1,
  normal_check_interval => 5,
  check_period          => "24x7",
  notification_interval => 30,
  notification_period   => "24x7",
  notification_options  => "w,u,c,r",
  contact_groups        => "admins",
  mode                  => "0444",
}
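The four nagios_service resources repeat the same target, scheduling, and notification attributes. As a possible refactor (not part of the original lab), a Puppet resource default can state the shared values once. A sketch, assuming it sits in the same scope as the services and replaces the long-form declarations; only the two NRPE services are shown:

```puppet
# Shared attributes for every nagios_service declared in this scope.
# Sketch only: values are copied from the services above.
Nagios_service {
  target                => '/etc/nagios3/conf.d/ppt_services.cfg',
  max_check_attempts    => 3,
  retry_check_interval  => 1,
  normal_check_interval => 5,
  check_period          => '24x7',
  notification_interval => 30,
  notification_period   => '24x7',
  notification_options  => 'w,u,c',
  contact_groups        => 'admins',
  mode                  => '0444',
}

# Each service then lists only what differs from the defaults.
nagios_service { 'root_disk_check':
  service_description => 'Root Disk Space',
  hostgroup_name      => 'Remote-Disks',
  check_command       => 'check_nrpe!check_sda1',
}

nagios_service { 'total_procs_check':
  service_description  => 'Total Processes',
  hostgroup_name       => 'Remote-Procs',
  check_command        => 'check_nrpe!check_total_procs',
  notification_options => 'w,u,c,r',  # overrides the default
}
```

An attribute set on an individual resource takes precedence over the default, which is how total_procs_check keeps its extra recovery notification.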
Apply module to app and backup servers
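This page does not show how the module is assigned to the servers. A minimal sketch, assuming node definitions live in site.pp and that the node names match the agents' certnames (both are assumptions; the real names may be fully qualified):

```puppet
# manifests/site.pp (assumed location)
# Node names are assumed; use the certnames your agents actually report.
node 'app-c', 'backup-c' {
  include nrpe
}
```

If these hosts already have node blocks, adding `include nrpe` to the existing blocks has the same effect and avoids declaring the same node twice.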
Manually test the disk check that NRPE runs. The command definition in nrpe.cfg is:
command[check_hda1]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/hda1
Go into the plugins directory:
cd /usr/lib/nagios/plugins/
Inside that directory, run the same command by hand:
./check_disk -w 20% -c 10% -p /dev/hda1
check_disk reports that /dev/hda1 is not accessible, so we ran sudo df -h to find the actual device name.
Tested again using /dev/sda1.
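The root_disk_check service calls an NRPE command named check_sda1, while this page only shows the check_hda1 definition. One possible way to define the command from Puppet is sketched here under several assumptions: the class name nrpe::checks is hypothetical, the thresholds are copied from the check_hda1 line, and the deployed nrpe.cfg is assumed to keep the Debian default include_dir=/etc/nagios/nrpe.d/ line.

```puppet
# Hypothetical class: drops a check_sda1 command into NRPE's include directory.
class nrpe::checks {
  file { '/etc/nagios/nrpe.d/check_sda1.cfg':
    ensure  => present,
    content => "command[check_sda1]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/sda1\n",
    mode    => '0644',
    owner   => 'root',
    group   => 'root',
    require => Class['nrpe::install'],  # the package provides /etc/nagios/nrpe.d
    notify  => Class['nrpe::service'],  # restart NRPE so it reads the new command
  }
}
```

Changing hda1 to sda1 in the command line of the module's nrpe.cfg achieves the same result without an extra file.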
Added mgmt-c to allowed_hosts in the nrpe.cfg file.
Copied the nrpe.cfg file from the db-c server, then reran the puppet command.