Ticket 245 ‐ Configure NRPE for Remote Server Monitoring - SupaHotBall/OE2-Group-D GitHub Wiki
Task
NRPE Server Setup
- Install nagios-nrpe-server on target servers (app/db/backup)
- Configure allowed_hosts in nrpe.cfg to permit mgmt server
- Define custom check commands (e.g., disk space monitoring)
Nagios Server Configuration
- Install nagios-nrpe-plugin on mgmt server
- Create "Remote Checks" hostgroup in Nagios
- Configure NRPE service checks in Puppet
Puppet Automation
- Develop NRPE configuration module
- Apply module to app/db/backup servers
- Exclude mgmt server from NRPE configuration
Steps Taken
Create a Puppet module for NRPE. We will start by creating a new directory in modules
sudo mkdir /etc/puppetlabs/code/modules/nagios_nrpe
Inside the nagios_nrpe module directory, create the manifests, files and templates directories
sudo mkdir /etc/puppetlabs/code/modules/nagios_nrpe/manifests
sudo mkdir /etc/puppetlabs/code/modules/nagios_nrpe/files
sudo mkdir /etc/puppetlabs/code/modules/nagios_nrpe/templates
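The four mkdir calls above can be collapsed into one. The following is only a sketch: the MODULES_DIR variable is an assumption added for illustration and defaults to a scratch directory, so the snippet can be tried without touching /etc.

```shell
# MODULES_DIR is a hypothetical variable, not part of the original steps.
# On the Puppet server it would be /etc/puppetlabs/code/modules (run with sudo).
MODULES_DIR="${MODULES_DIR:-$(mktemp -d)}"
MODULE_DIR="$MODULES_DIR/nagios_nrpe"

# -p creates the parent nagios_nrpe directory as needed and does not fail
# if any of the directories already exist.
mkdir -p "$MODULE_DIR/manifests" "$MODULE_DIR/files" "$MODULE_DIR/templates"

# Show the resulting skeleton.
ls "$MODULE_DIR"
```

Because mkdir -p is idempotent, the command can be re-run safely if a step is repeated.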
Create the .pp files in the manifests directory
sudo nano /etc/puppetlabs/code/modules/nagios_nrpe/manifests/config.pp
sudo nano /etc/puppetlabs/code/modules/nagios_nrpe/manifests/init.pp
sudo nano /etc/puppetlabs/code/modules/nagios_nrpe/manifests/install.pp
sudo nano /etc/puppetlabs/code/modules/nagios_nrpe/manifests/service.pp
Configure the install class in install.pp. This first version declares only the required packages, on the assumption that the nagios user and group already exist because they were created in the nagios module. That assumption later caused the failure described under Challenges
class nagios_nrpe::install {
  package { "nagios-nrpe-server":
    ensure  => present,
    require => User["nagios"],
  }
  package { "nagios-nrpe-plugin":
    ensure  => present,
    require => User["nagios"],
  }
}
Include the nagios_nrpe::install class within the init.pp file
class nagios_nrpe {
  include nagios_nrpe::install
}
Configure the service.pp file
class nagios_nrpe::service {
  service { 'nagios-nrpe-server':
    ensure     => running,
    hasstatus  => true,
    hasrestart => true,
    enable     => true,
    require    => Class['nagios_nrpe::install'],
  }
}
Include the service class in the init.pp file
include nagios_nrpe::service
Apply the Puppet module first so that the NRPE server package is installed on the management server
sudo puppet agent --test
Once the package is installed, locate the NRPE service configuration file, which should be at /etc/nagios/nrpe.cfg
Copy this file into the files directory in the nagios_nrpe module
sudo cp /etc/nagios/nrpe.cfg /etc/puppetlabs/code/modules/nagios_nrpe/files/nrpe.cfg
Note that all edits must be made to this copy, not to the original /etc/nagios/nrpe.cfg, because Puppet deploys the copy from the module's files directory
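The task list calls for allowed_hosts in nrpe.cfg to permit the mgmt server, but the steps do not show that edit. The snippet below is one possible way to make it, not necessarily what was done here. The address 192.0.2.10 is a placeholder, and the sample default line is an assumption about what the packaged nrpe.cfg contains. The snippet works on a scratch file unless CFG is set.

```shell
# CFG and MGMT_ADDR are hypothetical names for illustration only.
# On the Puppet server CFG would be the module copy:
#   /etc/puppetlabs/code/modules/nagios_nrpe/files/nrpe.cfg
CFG="${CFG:-$(mktemp)}"
MGMT_ADDR="${MGMT_ADDR:-192.0.2.10}"   # placeholder, replace with the mgmt server address

# Seed the scratch file with an assumed default line if it is empty.
[ -s "$CFG" ] || printf '%s\n' 'allowed_hosts=127.0.0.1,::1' > "$CFG"

# Append the mgmt address to allowed_hosts unless it is already listed.
if ! grep '^allowed_hosts=' "$CFG" | grep -qF "$MGMT_ADDR"; then
  # "&" re-inserts the matched line, so existing entries are preserved.
  sed "s|^allowed_hosts=.*|&,$MGMT_ADDR|" "$CFG" > "$CFG.new" && mv "$CFG.new" "$CFG"
fi

grep '^allowed_hosts=' "$CFG"
```

The guard makes the edit safe to repeat: a second run leaves the line unchanged.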
Configure the NRPE server to perform a disk check. Run df to see which device is currently mounted on the system.
In this scenario, the mounted device is sdb1, so the line command[check_hda1]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/hda1 in the config file needs to be changed to:
command[check_sdb1]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/sdb1
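The same change can be scripted instead of made in an editor. This is a sketch under assumptions: CFG is a hypothetical variable that defaults to a scratch file seeded with the check_hda1 line quoted above. On the Puppet server it would point at the copy in the module's files directory.

```shell
# CFG is a hypothetical variable; by default a scratch file is used.
CFG="${CFG:-$(mktemp)}"

# Seed the scratch file with the original command line if it is empty.
[ -s "$CFG" ] || printf '%s\n' \
  'command[check_hda1]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/hda1' > "$CFG"

# Rename the command and point it at /dev/sdb1, keeping the thresholds.
# Writing to a temporary file and moving it avoids relying on sed -i.
sed 's|^command\[check_hda1\]=\(.*\) -p /dev/hda1$|command[check_sdb1]=\1 -p /dev/sdb1|' \
  "$CFG" > "$CFG.new" && mv "$CFG.new" "$CFG"

# Show the rewritten command definition.
grep '^command\[check_sdb1\]' "$CFG"
```

The command name in brackets (check_sdb1) is what the Nagios server later passes to check_nrpe, so it has to match on both sides.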
Edit the config.pp file with the correct configuration for the nrpe server
class nagios_nrpe::config {
  file { "/etc/nagios/nrpe.cfg":
    ensure  => present,
    source  => "puppet:///modules/nagios_nrpe/nrpe.cfg",
    mode    => "0444",
    owner   => "root",
    group   => "root",
    require => Class["nagios_nrpe::install"],
    notify  => Class["nagios_nrpe::service"],
  }
}
After this is done, restart the nagios-nrpe-server service so that the changes take effect, then check that it is running
sudo systemctl restart nagios-nrpe-server
sudo systemctl status nagios-nrpe-server
Ensure that the init.pp file includes all the classes
class nagios_nrpe {
  include nagios_nrpe::install
  include nagios_nrpe::config
  include nagios_nrpe::service
}
Run sudo puppet agent --test to apply the new configuration
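When reading the result of the agent run, note that --test enables --detailed-exitcodes, so a non-zero exit status does not always mean failure. The helper below is a hypothetical illustration of how the status could be interpreted. The function name is not part of Puppet.

```shell
# Translate the exit status of `puppet agent --test` into a short verdict.
# With detailed exit codes, 2 means changes were applied successfully.
puppet_run_verdict() {
  case "$1" in
    0) echo "no changes" ;;
    2) echo "changes applied" ;;
    4) echo "failures during the run" ;;
    6) echo "changes applied, with failures" ;;
    *) echo "run failed" ;;
  esac
}

# Example usage on a node (left as comments so the sketch has no side effects):
#   sudo puppet agent --test
#   status=$?
#   echo "puppet: $(puppet_run_verdict "$status")"

# Demonstration with a fixed status value.
puppet_run_verdict 2
```

This matters mainly in scripts that stop on any non-zero status, where a successful run with changes would otherwise be treated as an error.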
Install the plugin package so that Nagios can run remote checks on other servers through NRPE
sudo apt install nagios-nrpe-plugin
Add a new hostgroup and service to the nagios config.pp file to check remote disks
nagios_hostgroup { "remote-disks":
  alias   => "Remote Disk Checks",
  members => "db-d,apps-d,backup-d",
  target  => "/etc/nagios4/conf.d/hostgroups.cfg",
  mode    => "0444",
}
nagios_service { "check-disk-nrpe":
  use                 => "generic-service",
  hostgroup_name      => "remote-disks",
  service_description => "Root Disk Usage",
  check_command       => "check_nrpe!check_sdb1",
  target              => "/etc/nagios4/conf.d/services.cfg",
  mode                => "0444",
}
Include nagios_nrpe in the db node definition in site.pp
node 'db-d.oe2.org.nz' {
  include sudo
  include ntp_service
  include mariadb
  include nagios_nrpe
  package { 'vim': ensure => present }
}
Apply the Puppet module configuration on the db server
Run the command /usr/lib/nagios/plugins/check_nrpe -H db-d -c check_sdb1 on the management server to check that the disk can be read successfully
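Besides the text it prints, check_nrpe reports the result through its exit status, following the standard Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The helper below is a hypothetical illustration of how that status could be read in a script. The function name is an assumption.

```shell
# Map a Nagios plugin exit status to its conventional state name.
nagios_state() {
  case "$1" in
    0) echo "OK" ;;
    1) echo "WARNING" ;;
    2) echo "CRITICAL" ;;
    *) echo "UNKNOWN" ;;
  esac
}

# Example usage on the management server (left as comments so the sketch
# does not try to contact a remote host):
#   /usr/lib/nagios/plugins/check_nrpe -H db-d -c check_sdb1
#   status=$?
#   echo "check_sdb1 on db-d: $(nagios_state "$status")"

# Demonstration with a fixed status value.
nagios_state 0
```

With the thresholds configured earlier (-w 20% -c 10%), a WARNING state would mean less than 20% free space on the checked device and CRITICAL less than 10%.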
Challenges
The module failed to apply because a user called nagios did not exist
Fixed the issue by declaring a user in the install.pp file of the nagios_nrpe module instead
External Resources
N/A
Ticket Reference
https://rt.dataraster.com/Ticket/Display.html?id=245