Device Monitoring - chuckablack/quokka GitHub Wiki
Quokka monitors devices in two ways:
- Availability/Response Time: Quokka goes out to the configured devices and polls them using the appropriate device communication mechanism, e.g. NAPALM or ncclient.
- Compliance: Quokka attempts to read the device's software version and configuration, and compares each against a stored 'standard' value. If they match, quokka considers the device compliant; if not, it considers the device out of compliance.
Availability
If the device replies, it is considered to be 'available'. Further analysis could be done, e.g. a heuristic based on memory or CPU utilization, or perhaps the state of forwarding buffers, but quokka does not attempt this at this time.
Response time
Quokka measures the time from initiating the request for information from the device (e.g. via NAPALM) until the response is received; this elapsed time is the "response time". Note that this number is going to be much larger than the response time for hosts, where only a ping is done, or for services, where only a brief request (e.g. fetching a web page over HTTP) is executed.
In comparison, for devices using NAPALM, the elapsed time from initiation until completion includes establishing a connection to the device (e.g. the SSH login handshake), gathering information (e.g. CPU and memory), and returning those values to quokka. This is why response times, e.g. to the default Cisco sandbox devices, are exceptionally long. With your own local device, they should be much shorter.
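As a rough illustration of the timing described above, here is a minimal sketch that wraps a single poll and reports both availability and response time. The name `timed_poll` and the idea of passing the poll as a callable are assumptions made for this example, not quokka's actual code.

```python
import time
from typing import Any, Callable, Optional, Tuple


def timed_poll(poll: Callable[[], Any]) -> Tuple[bool, Optional[float], Any]:
    """Run one poll and return (available, response_time_seconds, result).

    `poll` is a hypothetical callable that performs the whole exchange with
    the device (connect, gather information, disconnect) and returns the data.
    """
    start = time.monotonic()  # monotonic clock: unaffected by wall-clock changes
    try:
        result = poll()
    except Exception:
        # No usable reply: treat the device as unavailable.
        return False, None, None
    return True, time.monotonic() - start, result
```

Because the timer surrounds the entire exchange, the connection setup and login are counted in the response time, which is why the number is so much larger than a ping.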
CPU and memory utilization
Some devices are able to provide information about CPU and memory utilization, and NAPALM makes this available through its getters (in NAPALM these values are returned by the 'environment' getter, get_environment, rather than by get_facts). So when monitoring with NAPALM, quokka is able to get this information. Also, in the simulated "SDWAN" devices, I have implemented random values for CPU and memory, and return those in the "heartbeat" message. Using ncclient, I was unable to figure out how to parse the XML to get those values, and so they are not available.
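To make the CPU and memory handling concrete, here is a small sketch that reduces a NAPALM-style environment dictionary to two percentages. It assumes the layout documented for NAPALM's `get_environment()` (a `cpu` dict keyed by CPU id with a `%usage` value, and a `memory` dict with `available_ram` and `used_ram`); the function name `summarize_utilization` is hypothetical, and this is not necessarily how quokka computes its values.

```python
from typing import Dict, Optional, Tuple


def summarize_utilization(environment: Dict) -> Tuple[Optional[float], Optional[float]]:
    """Return (cpu_percent, memory_percent), or None where data is missing."""
    # Average the usage across all reported CPUs.
    cpus = environment.get("cpu") or {}
    usages = [c["%usage"] for c in cpus.values() if "%usage" in c]
    cpu_pct = sum(usages) / len(usages) if usages else None

    # Assumption: 'available_ram' is the total installed RAM, as the NAPALM
    # documentation describes it; individual drivers may differ.
    mem = environment.get("memory") or {}
    total, used = mem.get("available_ram"), mem.get("used_ram")
    mem_pct = 100.0 * used / total if total and used is not None else None
    return cpu_pct, mem_pct
```

Returning `None` rather than zero keeps "not reported" (as with the ncclient case above) distinct from "idle".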
Compliance monitoring
Quokka has a separate thread that performs compliance monitoring of devices:
- Version: Quokka reads the version of software running on the device and compares it to a stored 'standard' for that specific device type (based on device vendor and OS). If the version is exactly the same, the device is considered compliant.
- Configuration: Quokka uses NAPALM's configuration functionality to compare the current configuration to the 'standard' that is stored for that specific device type (vendor and OS). If the 'diff' of these configurations is empty, the configuration is considered to be compliant.
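The two checks above can be sketched roughly as follows. The function names are hypothetical, and the configuration check is only one plausible way to obtain a diff from NAPALM (load the standard as a replace candidate, compare, then discard); quokka's own implementation may differ. `device` is assumed to be an already-open NAPALM driver object.

```python
def version_is_compliant(actual: str, standard: str) -> bool:
    """Compliant only if the version strings are exactly the same."""
    return actual == standard


def config_is_compliant(device, standard_config: str) -> bool:
    """Compliant if diffing the standard against the running config is empty.

    Assumption: `device` exposes NAPALM's load_replace_candidate(),
    compare_config() and discard_config() methods and is already open.
    """
    device.load_replace_candidate(config=standard_config)
    try:
        diff = device.compare_config()
    finally:
        # Never leave the candidate staged; this check must not change the device.
        device.discard_config()
    return diff.strip() == ""
```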
Important: The intention of this functionality is fairly clear, but in practice it can be messy: it can get out of date, and I have sometimes noticed NAPALM's "diff" functionality giving unexpected results. See below:
Note: Version comparisons can be tricky, since they sometimes require parsing. E.g. for the Cisco CSR devices in the sandbox, the software version is a pretty long string, e.g. "Virtual XE Software (X86_64_LINUX_IOSD-UNIVERSALK9-M), Version 16.9.3, RELEASE SOFTWARE (fc2)", whereas for the Nexus device it is merely "9.3(3)".
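As an example of the kind of parsing this can require, here is a small sketch that pulls the version number out of a long IOS-XE-style string and leaves short strings like the Nexus one untouched. The function name `extract_version` and the regular expression are assumptions for illustration; other platforms will likely need their own patterns.

```python
import re

# Matches "Version <token>", where the token ends at a comma or whitespace.
_VERSION_RE = re.compile(r"Version\s+([^,\s]+)")


def extract_version(raw: str) -> str:
    """Return the bare version number if one can be found, else the input as-is."""
    match = _VERSION_RE.search(raw)
    return match.group(1) if match else raw.strip()
```

Normalizing both the device's value and the stored 'standard' this way before comparing would make an exact-match check less brittle.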
Note: Configuration comparisons done by NAPALM have not always worked as I expected - in fact, it seems like NAPALM gets confused by some of the 'message' values in the Cisco sandbox devices. So this functionality for comparison is perhaps a good start, but more work needs to be done.