Ticket 245 - Configure NRPE for Remote Server Monitoring

Task

NRPE Server Setup

  • Install nagios-nrpe-server on target servers (app/db/backup)
  • Configure allowed_hosts in nrpe.cfg to permit mgmt server
  • Define custom check commands (e.g., disk space monitoring)

Nagios Server Configuration

  • Install nagios-nrpe-plugin on mgmt server
  • Create "Remote Checks" hostgroup in Nagios
  • Configure NRPE service checks in Puppet

Puppet Automation

  • Develop NRPE configuration module
  • Apply module to app/db/backup servers
  • Exclude mgmt server from NRPE configuration

Steps Taken

Create a Puppet module for NRPE. We will start by creating a new directory in modules

sudo mkdir /etc/puppetlabs/code/modules/nagios_nrpe

Inside the nagios_nrpe module directory, create the manifests, files and templates directories

sudo mkdir /etc/puppetlabs/code/modules/nagios_nrpe/manifests
sudo mkdir /etc/puppetlabs/code/modules/nagios_nrpe/files
sudo mkdir /etc/puppetlabs/code/modules/nagios_nrpe/templates
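
The module directory and its three subdirectories can also be created in one pass. A minimal sketch, assuming a POSIX shell; make_module_skeleton is a hypothetical helper name (not part of Puppet), and the module path is passed as an argument so it can be tried against a scratch directory first:

```shell
# Create the standard subdirectories of a Puppet module in one call.
# make_module_skeleton is a hypothetical helper name; $1 = module directory.
make_module_skeleton() {
    module_dir="$1"
    # -p creates missing parent directories and does not fail if they already exist
    mkdir -p "$module_dir/manifests" "$module_dir/files" "$module_dir/templates"
}

# Example (needs root for the real module path):
# make_module_skeleton /etc/puppetlabs/code/modules/nagios_nrpe
```

Because of -p, running it again on an existing module leaves the directories untouched.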

Create the .pp files in the manifests directory

sudo nano /etc/puppetlabs/code/modules/nagios_nrpe/manifests/config.pp
sudo nano /etc/puppetlabs/code/modules/nagios_nrpe/manifests/init.pp
sudo nano /etc/puppetlabs/code/modules/nagios_nrpe/manifests/install.pp
sudo nano /etc/puppetlabs/code/modules/nagios_nrpe/manifests/service.pp

Configure the install class in install.pp. The nagios user and group are already declared in the nagios module, so only the required packages are declared here. Note that require => User["nagios"] only resolves on nodes whose catalog also contains that user (see Challenges)

class nagios_nrpe::install {
    package { "nagios-nrpe-server":
        ensure  => present,
        require => User["nagios"],
    }
    package { "nagios-nrpe-plugin":
        ensure  => present,
        require => User["nagios"],
    }

}

Include the nagios_nrpe::install class within the init.pp file

class nagios_nrpe {
  include nagios_nrpe::install
}

Configure the service.pp file

class nagios_nrpe::service {
  service { 'nagios-nrpe-server':
    ensure     => running,
    hasstatus  => true,
    hasrestart => true,
    enable     => true,
    require    => Class['nagios_nrpe::install'],
  }
}

Include the service class in the init.pp file

include nagios_nrpe::service

Apply the Puppet module on the management server first so that the NRPE server package, and with it the default configuration file, is installed there

sudo puppet agent --test
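
When reading the result of this command, note that --test enables --detailed-exitcodes, so a non-zero exit status does not always mean failure. A small sketch for interpreting the status; puppet_run_result is a hypothetical helper name:

```shell
# Describe the exit status of `puppet agent --test`, which turns on
# --detailed-exitcodes. puppet_run_result is a hypothetical helper name.
puppet_run_result() {
    case "$1" in
        0) echo "no changes" ;;
        2) echo "changes applied" ;;
        4) echo "some resources failed" ;;
        6) echo "changes applied, some resources failed" ;;
        *) echo "run failed" ;;   # 1 (or anything else): run failed or was not attempted
    esac
}

# Example:
# sudo puppet agent --test; puppet_run_result $?
```

On the first run that installs the package, a status of 2 is the expected outcome.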

Once the package is installed, locate the NRPE configuration file, which should be at /etc/nagios/nrpe.cfg

Copy this file into the files directory in the nagios_nrpe module

sudo cp /etc/nagios/nrpe.cfg /etc/puppetlabs/code/modules/nagios_nrpe/files/nrpe.cfg

Edit the copy in the module, not the original in /etc/nagios: the config class defined later deploys the module copy over /etc/nagios/nrpe.cfg
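
The task list also calls for allowed_hosts in nrpe.cfg to permit the management server. That edit is not shown in these steps, so the following is only a sketch of how it could be made on the module copy, assuming GNU sed (for -i). allow_nrpe_host is a hypothetical helper name and 10.0.0.5 is a placeholder, not the real address of the management server:

```shell
# Append a host to the allowed_hosts line of an nrpe.cfg copy.
# allow_nrpe_host is a hypothetical helper; $1 = config file, $2 = host or IP to permit.
allow_nrpe_host() {
    cfg="$1"
    host="$2"
    # Split the current allowed_hosts value on commas and look for an exact match,
    # so that running the helper twice does not add a duplicate entry
    if grep '^allowed_hosts=' "$cfg" | cut -d= -f2 | tr ',' '\n' | grep -xF -- "$host" >/dev/null; then
        return 0
    fi
    # "&" stands for the matched line, so the existing entries are kept.
    # If the file has no allowed_hosts line, nothing is changed.
    sed -i "s|^allowed_hosts=.*|&,$host|" "$cfg"
}

# Example (placeholder address):
# sudo sh -c '. ./helpers.sh; allow_nrpe_host /etc/puppetlabs/code/modules/nagios_nrpe/files/nrpe.cfg 10.0.0.5'
```

The example invocation assumes the function was saved in a file called helpers.sh, which is also hypothetical.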

Configure the NRPE server to perform a disk check. Run df to see which device is currently mounted on the system.
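
To get just the device name rather than the full df table, something like the following could be used. This is a sketch only: mounted_device is a hypothetical helper name, and it assumes the device of interest is the one backing the given mount point (/ by default):

```shell
# Print the device backing a mount point, without the /dev/ prefix.
# mounted_device is a hypothetical helper; $1 = mount point (defaults to /).
mounted_device() {
    mount_point="${1:-/}"
    # -P forces the POSIX one-line-per-filesystem format; row 2 is the first data row
    source_dev=$(df -P "$mount_point" | awk 'NR == 2 { print $1 }')
    # Strip a leading /dev/ so the result can be used in a name such as check_sdb1
    printf '%s\n' "${source_dev#/dev/}"
}

# Example:
# mounted_device /
```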

In this scenario, the mounted device is sdb1, so the line command[check_hda1]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/hda1 in the config file needs to be changed to:

command[check_sdb1]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/sdb1
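
The same substitution can be scripted against the module copy instead of being made by hand. A minimal sketch, assuming GNU sed (for -i) and the stock check_hda1 line quoted above; set_disk_check is a hypothetical helper name:

```shell
# Rewrite the stock check_hda1 command in an nrpe.cfg copy for another device.
# set_disk_check is a hypothetical helper; $1 = config file, $2 = device name (e.g. sdb1).
set_disk_check() {
    cfg="$1"
    dev="$2"
    # \(.*\) captures the plugin path and thresholds so they are carried over unchanged;
    # only the command name and the -p argument are replaced
    sed -i "s|^command\[check_hda1\]=\(.*\) -p /dev/hda1\$|command[check_${dev}]=\1 -p /dev/${dev}|" "$cfg"
}

# Example:
# set_disk_check /etc/puppetlabs/code/modules/nagios_nrpe/files/nrpe.cfg sdb1
```

If the check_hda1 line is not present in the expected form, the file is left as it is.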

Edit the config.pp file with the correct configuration for the nrpe server

class nagios_nrpe::config {
  file { "/etc/nagios/nrpe.cfg":
    ensure  => present,
    source  => "puppet:///modules/nagios_nrpe/nrpe.cfg",
    mode    => "0444",
    owner   => "root",
    group   => "root",
    require => Class["nagios_nrpe::install"],
    notify  => Class["nagios_nrpe::service"],
  }
}

After this is done, restart the nagios-nrpe-server so that the changes will take effect and check that the server is running

sudo systemctl restart nagios-nrpe-server
sudo systemctl status nagios-nrpe-server

Ensure that the init.pp file includes all the classes

class nagios_nrpe {
  include nagios_nrpe::install
  include nagios_nrpe::config
  include nagios_nrpe::service
}

Run sudo puppet agent --test to apply the new configurations

Install the plugin package to allow nagios to remote check other servers with NRPE

sudo apt install nagios-nrpe-plugin

Add a new hostgroup and service to the nagios config.pp file to check remote disks

    nagios_hostgroup { "remote-disks":
        alias   => "Remote Disk Checks",
        members => "db-d,apps-d,backup-d",
        target  => "/etc/nagios4/conf.d/hostgroups.cfg",
        mode    => "0444",
    }

    nagios_service { "check-disk-nrpe":
        use                     => "generic-service",
        hostgroup_name          => "remote-disks",
        service_description     => "Root Disk Usage",
        check_command           => "check_nrpe!check_sdb1",
        target                  => "/etc/nagios4/conf.d/services.cfg",
        mode                    => "0444",
    }

Include nagios_nrpe in the db node definition in site.pp

node 'db-d.oe2.org.nz' {
  include sudo
  include ntp_service
  include mariadb
  include nagios_nrpe  
  package { 'vim': ensure => present }
}

Apply the Puppet configuration on the db server by running sudo puppet agent --test there

Run the command /usr/lib/nagios/plugins/check_nrpe -H db-d -c check_sdb1 to check that the disk can be successfully read
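
check_nrpe reports the result through its exit status as well as its output, following the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). A small sketch for turning that status into a state name; nagios_state is a hypothetical helper name:

```shell
# Map a Nagios plugin exit status to its state name.
# nagios_state is a hypothetical helper name; $1 = exit status of the plugin.
nagios_state() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;   # 3 by convention; other values are treated the same way here
    esac
}

# Example:
# /usr/lib/nagios/plugins/check_nrpe -H db-d -c check_sdb1; nagios_state $?
```

With the thresholds used in nrpe.cfg (-w 20% -c 10%), WARNING corresponds to less than 20% free space and CRITICAL to less than 10%.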


Challenges

The module failed to apply because a user called nagios doesn't exist

Fixed the issue by declaring a user in the install.pp file of the nagios_nrpe module instead


External Resources

N/A


Ticket Reference

https://rt.dataraster.com/Ticket/Display.html?id=245