Monitoring - PSJoshi/Notes GitHub Wiki

Typical monitoring metrics are:

  • Disk Space (This check monitors the available space on the hard disks)
  • CPU Usage (This check measures the CPU utilization of the devices and loads it into a time-series database such as InfluxDB, so you can view it on a dashboard in Grafana)
  • Failover status (This check monitors the cluster status of the infrastructure)
  • Pool Member Status (This check monitors the status of individual pool members)
  • Sync Status (This check monitors the status of device synchronization)
  • Hardware Status (This check monitors the hardware status of the appliances)
  • Interface Table (This check monitors the network ports, their status and the traffic on the ports)
  • Uptime (This check indicates the elapsed time since the last reboot)
  • Ready-to-use alarms for health monitoring - https://github.com/firehol/netdata/tree/master/conf.d/health.d
  • Wazuh rulesets - https://github.com/wazuh/wazuh-ruleset
  • Wazuh rules - https://github.com/wazuh/wazuh/tree/master/etc/rules
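As a rough illustration of the disk space check and of loading a metric into a time-series database, here is a minimal Python sketch that uses only the standard library. The function names, the 20%/10% thresholds, and the `example-host` tag are assumptions made for this example and do not come from any of the tools listed above. The output string follows the general InfluxDB line protocol shape (`measurement,tags fields timestamp`), but it skips the escaping a real client would need for spaces and commas.

```python
import shutil
import time


def disk_free_percent(path="/"):
    """Return the percentage of free space on the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total


def check_disk(path="/", warn_below=20.0, crit_below=10.0):
    """Classify free space as OK, WARNING or CRITICAL.

    The thresholds are illustrative assumptions, not recommended values.
    """
    free = disk_free_percent(path)
    if free < crit_below:
        status = "CRITICAL"
    elif free < warn_below:
        status = "WARNING"
    else:
        status = "OK"
    return status, free


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one point as: measurement,tag=value field=value timestamp.

    Simplified: no escaping of spaces or commas in keys and values.
    """
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={v}" for k, v in sorted(fields.items()))
    return f"{measurement},{tag_part} {field_part} {timestamp_ns}"


if __name__ == "__main__":
    status, free = check_disk("/")
    print(f"disk {status}: {free:.1f}% free")
    # "example-host" is a placeholder tag value.
    print(to_line_protocol(
        "disk",
        {"host": "example-host", "path": "/"},
        {"free_percent": round(free, 2)},
        time.time_ns(),
    ))
```

In practice, the formatted line would be sent to the database by a client library or a collection agent rather than printed. The same pattern (measure, classify against thresholds, emit a timestamped point) applies to the CPU usage and uptime checks.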