Issues for NRPE setup - KeegMitch/Operations-Engineering-group-c GitHub Wiki
Problems we're facing
For this lab
- Getting Puppet to generate nrpe.cfg on the DB server
/usr/lib/nagios/plugins/check_nrpe -H mgmt-c -c check_sda1
/usr/lib/nagios/plugins/check_nrpe -H db-c -c check_sda1
Update: it turns out the nagios-nrpe-plugin package is meant to be installed on the mgmt server (not the db server), and it belongs in the nagios module in Puppet, not the nrpe module.
- As it stands, our nagios3 service isn't running properly because of nagios-plugins config errors:
Fixed: I fixed error #2 in the problems section and got nagios3 running again. When I commented out the hostgroups and services for the NRPE lab (to fix other errors), the definitions were still present in ppt_hostgroups.cfg and ppt_services.cfg, so I edited those files and removed the remote disk service and its associated hostgroup.
- We currently have our Remote-Disks hostgroup showing up on the web server with just db-c:
The issue here is that when we apply the remote disk service, not only does it not take effect immediately, but restarting nagios also crashes the whole service with errors. The cause is likely something in our remotedisk service in config.pp, but at this stage we don't know what is making it crash.
We can run the check_nrpe command manually and get the warning, but the result doesn't show up on our web server:
Mgmt-c is showing a critical status on the dashboard. I think this is because it cannot run check_sda1 on itself.
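One way to keep a bad remote disk definition from taking the service down is to validate the configuration before restarting. The following is only a minimal sketch: it assumes the default Debian/Ubuntu path /etc/nagios3/nagios.cfg (not confirmed for our setup), and `restart_if_valid` is a hypothetical helper name.

```shell
#!/bin/sh
# ASSUMPTION: default nagios3 main config path; override NAGIOS_CFG if ours differs.
NAGIOS_CFG="${NAGIOS_CFG:-/etc/nagios3/nagios.cfg}"

# Hypothetical helper: restart nagios3 only when the config check passes.
restart_if_valid() {
    # "nagios3 -v" parses the main config and every object file it includes,
    # and exits non-zero if it finds errors.
    if sudo nagios3 -v "$NAGIOS_CFG"; then
        sudo service nagios3 restart
    else
        echo "Config errors found; not restarting nagios3" >&2
        return 1
    fi
}
```

Calling `restart_if_valid` after a Puppet run should print the verification output, which normally names the file and object that failed, so it may help narrow down what in config.pp is wrong.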
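To compare what the manual check returns with what the dashboard shows, it can help to look at the plugin's exit code, since Nagios derives the state from it (0 OK, 1 WARNING, 2 CRITICAL, anything else treated as UNKNOWN). A rough sketch, where `state_name` and `check_host` are hypothetical helper names and the plugin path is the one used above:

```shell
#!/bin/sh
# Map a Nagios plugin exit code to its state name.
state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;   # 3 and any out-of-range code
    esac
}

# Run check_sda1 over NRPE against one host and print the resulting state.
check_host() {
    /usr/lib/nagios/plugins/check_nrpe -H "$1" -c check_sda1
    rc=$?
    echo "$1: $(state_name "$rc")"
}
```

For example, `check_host db-c` and `check_host mgmt-c` can be run on the mgmt server one after the other. If mgmt-c reports CRITICAL or UNKNOWN here as well, the problem is probably on the NRPE side rather than in the Nagios service definition.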
Attempted solutions
This is regarding the connection error
iptables firewall
Double-check that connections are allowed on port 5666
sudo iptables -L -n
Adding the rule from db server:
sudo iptables -I INPUT -p tcp --dport 5666 -s 10.2.0.6 -j ACCEPT
sudo iptables-save
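Running the insert command repeatedly stacks duplicate rules, so a guarded version may be cleaner. This is a sketch only: `allow_nrpe_from` is a hypothetical helper name, and it relies on `iptables -C`, which checks whether an identical rule already exists.

```shell
#!/bin/sh
# Hypothetical helper: add the NRPE accept rule for one source address,
# but only if the same rule is not already in the INPUT chain.
allow_nrpe_from() {
    src="$1"
    if sudo iptables -C INPUT -p tcp --dport 5666 -s "$src" -j ACCEPT 2>/dev/null; then
        echo "rule already present for $src"
    else
        sudo iptables -I INPUT -p tcp --dport 5666 -s "$src" -j ACCEPT
        echo "rule added for $src"
    fi
}
```

Usage would be `allow_nrpe_from 10.2.0.6`. Note that `iptables-save` on its own only prints the current ruleset to standard output; the rules are lost on reboot unless that output is written somewhere that gets restored at boot.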
Check:
Restart the nagios-nrpe-server
Tried the ufw firewall, but that didn't work either.
The reason it wasn't working is that the nrpe service was not installed on the mgmt server. We had to update site.pp to include nrpe under mgmt-c, and remove the resources that ensure the nagios user and group from the nrpe install.pp, because they conflicted with the nagios install.pp.
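A quick check that would have caught this earlier is confirming that something is actually listening on the NRPE port on mgmt-c. A minimal sketch, assuming `ss` (from iproute2) is available on the server; `listening_on` is a hypothetical helper name:

```shell
#!/bin/sh
# Succeeds if "ss -tln" shows a TCP listener whose local address ends in :PORT.
listening_on() {
    ss -tln | awk -v p=":$1" '
        $1 == "LISTEN" && $4 ~ p"$" { found = 1 }
        END { exit !found }
    '
}
```

For example, `listening_on 5666 || echo "nothing listening on 5666"` run on mgmt-c. If nothing is listening, firewall rules will not help, because the NRPE daemon is missing or stopped.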