Nagios Checks with NRPE - KeegMitch/Operations-Engineering-group-c GitHub Wiki

Issues: Our Issues page for issues we've encountered during the lab

Create new Puppet Module

Create a module named nrpe with the following structure:

modules/nrpe/files
modules/nrpe/templates
modules/nrpe/manifests
modules/nrpe/manifests/install.pp
modules/nrpe/manifests/service.pp
modules/nrpe/manifests/config.pp
modules/nrpe/manifests/init.pp

Install NRPE server using install.pp

Code for install.pp:

class nrpe::install {
  # NRPE daemon that the Nagios server queries remotely
  package { "nagios-nrpe-server":
    ensure => present,
  }
  user { "nagios":
    ensure => present,
  }
  group { "nagios":
    ensure => present,
  }
}

Do not worry about the NRPE plugins until later in the lab.

Create service.pp to ensure NRPE is running

Code for service.pp:

class nrpe::service {
  service { "nagios-nrpe-server":
    ensure     => running,
    hasstatus  => true,
    hasrestart => true,
    enable     => true,
    require    => Class["nrpe::config"],
  }
}

NRPE server config

Code for config.pp:

class nrpe::config {
  # Deploy nrpe.cfg from the module's files directory and restart NRPE on change
  file { "/etc/nagios/nrpe.cfg":
    ensure  => present,
    source  => "puppet:///modules/nrpe/nrpe.cfg",
    mode    => "0644",
    owner   => "root",
    group   => "root",
    require => Class["nrpe::install"],
    notify  => Class["nrpe::service"],
  }
}
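
The module structure lists init.pp, but this page does not show its contents. A minimal sketch, assuming init.pp only needs to pull in the three classes above, might look like this:

```puppet
# modules/nrpe/manifests/init.pp
# Sketch only: the original init.pp is not shown on this page.
class nrpe {
  include nrpe::install
  include nrpe::config
  include nrpe::service
}
```

No explicit ordering should be needed here, since the require and notify parameters inside the classes already order install, config, and service.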

Add the allowed hosts on the db server by editing nrpe.cfg:

sudo nano /etc/nagios/nrpe.cfg

image

Testing: image

Nagios server config

Add the NRPE plugin package to install.pp in the nagios module (not nrpe):

class nagios::install {
  package { "nagios3":
    ensure => present,
  }
  package { "apache2-utils":
    ensure => present,
  }
  # Provides the check_nrpe plugin used by the Nagios server
  package { "nagios-nrpe-plugin":
    ensure => present,
  }
  user { "nagios":
    ensure => present,
  }
  group { "nagios":
    ensure => present,
  }
}

Hostgroup Code for config.pp from Nagios module (not nrpe):

# nagios hostgroups

nagios_hostgroup {"my-ssh-servers":
  target => "/etc/nagios3/conf.d/ppt_hostgroups.cfg",
  mode => "0444",
  alias => "My SSH servers",
  members => "db-c, app-c, backup-c, mgmt-c",
}

nagios_hostgroup {"my-mariaDB-servers":
  target => "/etc/nagios3/conf.d/ppt_hostgroups.cfg",
  mode => "0444",
  alias => "My mariaDB servers",
  members => "db-c",
}

# new hostgroup for remote disks (edit to add all servers)

nagios_hostgroup {"Remote-Disks":
  target => "/etc/nagios3/conf.d/ppt_hostgroups.cfg",
  mode => "0444",
  alias => "Remote Disks",
  members => "db-c, app-c, backup-c, mgmt-c",
}

nagios_hostgroup {"Remote-Procs":
  target => "/etc/nagios3/conf.d/ppt_hostgroups.cfg",
  mode => "0444",
  alias => "Remote procs",
  members => "db-c, app-c, backup-c, mgmt-c",
}

Service Code for config.pp from Nagios module (not nrpe):

# nagios services

nagios_service {"ssh":
  service_description => "ssh servers",
  hostgroup_name => "my-ssh-servers",
  target => "/etc/nagios3/conf.d/ppt_services.cfg",
  check_command => "check_ssh",
  max_check_attempts => 3,
  retry_check_interval => 1,
  normal_check_interval => 5,
  check_period => "24x7",
  notification_interval => 30,
  notification_period => "24x7",
  notification_options => "w,u,c",
  contact_groups => "admins",
  mode => "0444",
}

nagios_service {"mariaDB":
  service_description => "mariaDB servers",
  hostgroup_name => "my-mariaDB-servers",
  target => "/etc/nagios3/conf.d/ppt_services.cfg",
  check_command => "check_mysql_cmdlinecred!nagios!mypasswd",
  max_check_attempts => 3,
  retry_check_interval => 1,
  normal_check_interval => 5,
  check_period => "24x7",
  notification_interval => 30,
  notification_period => "24x7",
  notification_options => "w,u,c",
  contact_groups => "admins",
  mode => "0444",
}

# Adding new service for disk check via NRPE

nagios_service {"root_disk_check":
  service_description => "Root Disk Space",
  hostgroup_name => "Remote-Disks",
  target => "/etc/nagios3/conf.d/ppt_services.cfg",
  check_command => "check_nrpe!check_sda1",
  max_check_attempts => 3,
  retry_check_interval => 1,
  normal_check_interval => 5,
  check_period => "24x7",
  notification_interval => 30,
  notification_period => "24x7",
  notification_options => "w,u,c",
  contact_groups => "admins",
  mode => "0444",
}

nagios_service { "total_procs_check":
  service_description => "Total Processes",
  hostgroup_name => "Remote-Procs",
  target => "/etc/nagios3/conf.d/ppt_services.cfg",
  check_command => "check_nrpe!check_total_procs",
  max_check_attempts => 3,
  retry_check_interval => 1,
  normal_check_interval => 5,
  check_period => "24x7",
  notification_interval => 30,
  notification_period => "24x7",
  notification_options => "w,u,c,r",
  contact_groups => "admins",
  mode => "0444",
}

Apply module to app and backup servers

image

image

image
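
One way to apply the module is sketched below. It assumes that nodes are declared in a site.pp manifest and that the node certificate names begin with app-c and backup-c, the host names used in the hostgroups above. The real certnames may be fully qualified, so the regular expression is an assumption to adjust.

```puppet
# manifests/site.pp (sketch; file location and node names are assumptions)
# Matches any certname starting with "app-c" or "backup-c".
node /^(app|backup)-c/ {
  include nrpe
}
```

After the next agent run on each server, the nagios-nrpe-server package should be installed and its service running.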

Manually test the disk check using the command that nrpe.cfg defines for NRPE:

command[check_hda1]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/hda1

Change into the plugins directory:

cd /usr/lib/nagios/plugins/

image

Inside that directory, run the check by hand:

./check_disk -w 20% -c 10% -p /dev/hda1

image

The plugin reports that /dev/hda1 is not accessible, so list the actual devices with sudo df -h:

image

Testing again using /dev/sda1:

image
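
The root_disk_check service calls check_nrpe!check_sda1, so each monitored host needs a check_sda1 command defined for NRPE. The sketch below is one hypothetical way to manage that definition from Puppet instead of editing nrpe.cfg by hand. It assumes that the deployed nrpe.cfg still contains the Debian default include_dir=/etc/nagios/nrpe.d/ line, that the resource is placed inside the nrpe::config class, and that the thresholds match the check_hda1 command shown earlier.

```puppet
# Hypothetical drop-in file, not the method used on this page.
# Assumes nrpe.cfg keeps the include_dir=/etc/nagios/nrpe.d/ directive.
file { "/etc/nagios/nrpe.d/check_sda1.cfg":
  ensure  => present,
  owner   => "root",
  group   => "root",
  mode    => "0644",
  # Same thresholds as check_hda1, pointed at /dev/sda1
  content => "command[check_sda1]=/usr/lib/nagios/plugins/check_disk -w 20% -c 10% -p /dev/sda1\n",
  require => Class["nrpe::install"],
  notify  => Class["nrpe::service"],
}
```

Adding the same command line directly to the module's copy of nrpe.cfg would achieve the same result.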

Added mgmt-c to allowed_hosts in the nrpe.cfg file

image

Copied the nrpe.cfg file from the db-c server, then reran the Puppet command

image