Ticket 262 ‐ Update Nagios Hostgroup to Include All Monitored Servers (Part 1 ‐ check_disk, check_users and check_load)

Task

Updated the remote-disks Nagios hostgroup to include all relevant client servers (db-d, apps-d, backup-d) so that the NRPE-based service checks run consistently across the infrastructure.

Steps Taken

Edited the nagios::config manifest in config.pp, changing the members line of the remote-disks hostgroup from

"db-d"

to

"db-d,apps-d,backup-d"
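For context, the members line sits inside a hostgroup resource that probably looks something like the sketch below. Only the hostgroup name and the members value come from this ticket. The alias, target path and mode are assumptions, modelled on the service definitions later on this page.

```puppet
# Sketch of the hostgroup resource in config.pp.
# Only the resource title and the members value are taken from this ticket.
nagios_hostgroup { "remote-disks":
  alias   => "Remote Disks",                        # assumed alias
  members => "db-d,apps-d,backup-d",                # value set in this ticket
  target  => "/etc/nagios4/conf.d/hostgroups.cfg",  # assumed path
  mode    => "0444",                                # assumed, matches the services
}
```

Because every service on this page targets the hostgroup rather than individual hosts, changing this one members value is enough to extend all three checks to the new servers.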

Ensure that each NRPE service is defined in the same format as the existing service definitions and targets the remote-disks hostgroup:

nagios_service { "check-disk-nrpe":
  service_description     => "Root Disk Usage",
  hostgroup_name          => "remote-disks",
  target                  => "/etc/nagios4/conf.d/services.cfg",
  check_command           => "check_nrpe!check_sdb1",
  max_check_attempts      => 3,
  retry_check_interval    => 1,
  normal_check_interval   => 5,
  check_period            => "24x7",
  notification_interval   => 30,
  notification_period     => "24x7",
  notification_options    => "w,u,c",
  contact_groups          => "admins",
  mode                    => "0444",
}

nagios_service { "check-users-nrpe":
  service_description     => "Logged-in Users",
  hostgroup_name          => "remote-disks",
  target                  => "/etc/nagios4/conf.d/services.cfg",
  check_command           => "check_nrpe!check_users",
  max_check_attempts      => 3,
  retry_check_interval    => 1,
  normal_check_interval   => 5,
  check_period            => "24x7",
  notification_interval   => 30,
  notification_period     => "24x7",
  notification_options    => "w,u,c",
  contact_groups          => "admins",
  mode                    => "0444",
}

nagios_service { "check-load-nrpe":
  service_description     => "System Load",
  hostgroup_name          => "remote-disks",
  target                  => "/etc/nagios4/conf.d/services.cfg",
  check_command           => "check_nrpe!check_load",
  max_check_attempts      => 3,
  retry_check_interval    => 1,
  normal_check_interval   => 5,
  check_period            => "24x7",
  notification_interval   => 30,
  notification_period     => "24x7",
  notification_options    => "w,u,c",
  contact_groups          => "admins",
  mode                    => "0444",
}
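Each check_command above has the form check_nrpe!<name>: the part after the exclamation mark is passed to the check_nrpe command as $ARG1$ and must match a command of the same name in the NRPE configuration on each client. If the mgmt server does not already define a check_nrpe command, a definition along these lines would be needed. This is a sketch, not part of the ticket: the target path is an assumption, and the command line is the conventional form.

```puppet
# Hypothetical command definition; skip it if check_nrpe is already defined.
# Single quotes keep Puppet from interpolating the Nagios $MACRO$ names.
nagios_command { 'check_nrpe':
  command_line => '$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$',
  target       => '/etc/nagios4/conf.d/commands.cfg',  # assumed path
  mode         => '0444',
}
```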

Save the file, then run sudo puppet agent --test on the mgmt server.

Once the configuration has been applied, restart the nagios4 service on the mgmt server so that the new checks appear in the Nagios web interface.
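The manual restart could also be left to Puppet. This is a minimal sketch, not what was done for this ticket, and it assumes the nagios4 service is not already managed elsewhere in the module. It declares the service and sets a resource default so that every nagios_service in scope notifies it.

```puppet
# Hypothetical addition to the nagios::config manifest.
service { "nagios4":
  ensure => running,
  enable => true,
}

# Resource default: any nagios_service declared in this scope notifies
# the service, so a change to services.cfg restarts nagios4 on that run.
Nagios_service {
  notify => Service["nagios4"],
}
```

With this in place, a puppet agent run that changes a service definition would restart nagios4 without a separate manual step.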

Challenges

N/A

External Resources

N/A

Ticket Reference

https://rt.dataraster.com/Ticket/Display.html?id=262