monitoring IPMI - hilbix/netdata GitHub Wiki
netdata has a freeipmi plugin.
FreeIPMI provides in-band and out-of-band IPMI software based on the IPMI v1.5/2.0 specification. The IPMI specification defines a set of interfaces for platform management and is implemented by a number of vendors for system management. The features of IPMI that most users will be interested in are sensor monitoring, system event monitoring, power control, and serial-over-LAN (SOL).
freeipmi.plugin
compile
- Install libipmimonitoring-dev or libipmimonitoring-devel (freeipmi-devel on RHEL based OS) using the package manager of your system.
- Re-install netdata from source. The installer will detect that the required libraries are now available and will also build freeipmi.plugin.
Keep in mind IPMI requires root access, so the plugin is setuid to root.
If you just installed the required IPMI tools, run the command ipmimonitoring at least once and verify that it returns sensor information. This command initialises the IPMI configuration, so that the netdata plugin will be able to work.
netdata use
The plugin creates (up to) 8 charts, based on the information collected from IPMI:
- number of sensors by state
- number of events in SEL
- Temperatures (Celsius)
- Temperatures (Fahrenheit)
- Voltages
- Currents
- Power
- Fans
It also adds 2 alarms:
- Sensors in non-nominal state (i.e. warning and critical)
- SEL is not empty
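The first chart and the first alarm both come down to counting sensors per state. The plugin itself reads sensors through libipmimonitoring; what follows is only a rough shell sketch of the same idea, applied to the text output of the ipmimonitoring command (the pipe-separated table shown in the section on ignoring sensors). The function name count_sensors_by_state is hypothetical and is not part of netdata or FreeIPMI.

```shell
# Hypothetical helper (not part of netdata or FreeIPMI): reads
# ipmimonitoring-style table rows on stdin and prints one
# "STATE COUNT" line per state, sorted by state name.
# Assumes the pipe-separated layout:
#   ID | Name | Type | State | Reading | Units | Event
count_sensors_by_state() {
    awk -F'|' '
        {
            id = $1; state = $4
            gsub(/^[ \t]+|[ \t]+$/, "", id)
            gsub(/^[ \t]+|[ \t]+$/, "", state)
            # Only rows with a numeric ID are sensors; this skips the header.
            if (id ~ /^[0-9]+$/) count[state]++
        }
        END { for (s in count) print s, count[s] }
    ' | sort
}
```

It could be used as ipmimonitoring | count_sensors_by_state. Any state other than Nominal in the result corresponds to what the first alarm reports.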
The plugin does a speed test when it starts, to find out the duration needed by the IPMI processor to respond. Depending on the speed of your IPMI processor, charts may need several seconds to show up on the dashboard.
freeipmi.plugin configuration
The plugin supports a few options. To see them, run:
# /usr/libexec/netdata/plugins.d/freeipmi.plugin -h
netdata freeipmi.plugin 1.8.0-546-g72ce5d6b_rolling
Copyright (C) 2016-2017 Costa Tsaousis <[email protected]>
Released under GNU General Public License v3 or later.
All rights reserved.
This program is a data collector plugin for netdata.
Available command line options:

 SECONDS                 data collection frequency
                         minimum: 5

 debug                   enable verbose output
                         default: disabled

 sel
 no-sel                  enable/disable SEL collection
                         default: enabled

 hostname HOST
 username USER
 password PASS           connect to remote IPMI host
                         default: local IPMI processor

 sdr-cache-dir PATH      directory for SDR cache files
                         default: /tmp

 sensor-config-file FILE filename to read sensor configuration
                         default: system default

 ignore N1,N2,N3,...     sensor IDs to ignore
                         default: none

 -v
 -V
 version                 print version and exit

Linux kernel module for IPMI is CPU hungry.
On Linux run this to lower kipmiN CPU utilization:
# echo 10 > /sys/module/ipmi_si/parameters/kipmid_max_busy_us
or create: /etc/modprobe.d/ipmi.conf with these contents:
options ipmi_si kipmid_max_busy_us=10
For more information:
https://github.com/hilbix/netdata/netdata/wiki/monitoring-IPMI
You can set these options in /etc/netdata/netdata.conf, in this section:
[plugin:freeipmi]
update every = 5
command options =
Append the settings you need to command options =. The minimum update every is 5 (enforced internally by the plugin). IPMI is slow and CPU hungry, so collecting once every 5 seconds is a reasonable rate.
ignoring specific sensors
Specific sensor IDs can be excluded from the freeipmi tools by editing /etc/freeipmi/freeipmi.conf and setting the IDs to be ignored at ipmi-sensors-exclude-record-ids. However, this file is not used by libipmimonitoring (the library used by netdata's freeipmi.plugin).
So, freeipmi.plugin supports the option ignore, which accepts a comma-separated list of sensor IDs to ignore. To configure it, edit /etc/netdata/netdata.conf and set:
[plugin:freeipmi]
command options = ignore 1,2,3,4,...
To find the IDs to ignore, run the command ipmimonitoring. The first column is the ID to use:
ID | Name | Type | State | Reading | Units | Event
1 | Ambient Temp | Temperature | Nominal | 26.00 | C | 'OK'
2 | Altitude | Other Units Based Sensor | Nominal | 480.00 | ft | 'OK'
3 | Avg Power | Current | Nominal | 100.00 | W | 'OK'
4 | Planar 3.3V | Voltage | Nominal | 3.29 | V | 'OK'
5 | Planar 5V | Voltage | Nominal | 4.90 | V | 'OK'
6 | Planar 12V | Voltage | Nominal | 11.99 | V | 'OK'
7 | Planar VBAT | Voltage | Nominal | 2.95 | V | 'OK'
8 | Fan 1A Tach | Fan | Nominal | 3132.00 | RPM | 'OK'
9 | Fan 1B Tach | Fan | Nominal | 2150.00 | RPM | 'OK'
10 | Fan 2A Tach | Fan | Nominal | 2494.00 | RPM | 'OK'
11 | Fan 2B Tach | Fan | Nominal | 1825.00 | RPM | 'OK'
12 | Fan 3A Tach | Fan | Nominal | 3538.00 | RPM | 'OK'
13 | Fan 3B Tach | Fan | Nominal | 2625.00 | RPM | 'OK'
14 | Fan 1 | Entity Presence | Nominal | N/A | N/A | 'Entity Present'
15 | Fan 2 | Entity Presence | Nominal | N/A | N/A | 'Entity Present'
...
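When many sensors of one kind should be ignored, the ID list can be assembled from this table instead of being typed by hand. The following is a minimal sketch under the assumption that the output keeps the pipe-separated layout shown above; the function name sensor_ids_by_type is hypothetical and is not provided by netdata or FreeIPMI.

```shell
# Hypothetical helper (not part of netdata or FreeIPMI): reads
# ipmimonitoring-style table rows on stdin and prints a comma-separated
# list of the IDs of all sensors whose Type column equals $1.
sensor_ids_by_type() {
    awk -F'|' -v want="$1" '
        {
            id = $1; kind = $3
            gsub(/^[ \t]+|[ \t]+$/, "", id)
            gsub(/^[ \t]+|[ \t]+$/, "", kind)
            # Numeric IDs only, so the header and "..." lines are skipped.
            if (id ~ /^[0-9]+$/ && kind == want)
                ids = (ids == "" ? id : ids "," id)
        }
        END { print ids }
    '
}
```

For example, ipmimonitoring | sensor_ids_by_type 'Entity Presence', fed only the fifteen rows listed above, would print 14,15, which can then be placed after ignore in command options.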
debugging
You can run the plugin by hand:
# become user netdata
sudo su -s /bin/sh netdata
# run the plugin in debug mode
/usr/libexec/netdata/plugins.d/freeipmi.plugin 5 debug
You will get verbose output on what the plugin does.
kipmi0 CPU usage
There have been reports that kipmi shows increased CPU usage when IPMI is queried.
IBM has given a few explanations.
Check also this stackexchange post.
To lower the CPU consumption of the system you can issue this command:
echo 10 > /sys/module/ipmi_si/parameters/kipmid_max_busy_us
You can also make the above setting permanent by creating the file /etc/modprobe.d/ipmi.conf with this content:
# prevent kipmi from consuming 100% CPU
options ipmi_si kipmid_max_busy_us=10
This instructs the kernel IPMI module to pause for a tick between checks of IPMI. Querying IPMI will be a lot slower (e.g. several seconds for IPMI to respond), but kipmi will not use any noticeable CPU. You can also use a higher number (this is the number of microseconds to poll IPMI for a response before waiting for a tick).
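Since a higher value may suit some systems better, it can be convenient to generate the modprobe file from a parameter. The following is a small sketch, not an official tool: the function name write_kipmid_conf and its optional second argument (an alternative destination path, useful for a dry run) are assumptions made for illustration.

```shell
# Hypothetical helper: writes the modprobe option described above.
#   $1  value for kipmid_max_busy_us (microseconds, digits only)
#   $2  destination file (defaults to /etc/modprobe.d/ipmi.conf)
write_kipmid_conf() {
    value="$1"
    dest="${2:-/etc/modprobe.d/ipmi.conf}"
    case "$value" in
        ''|*[!0-9]*)
            echo "kipmid_max_busy_us must be a non-negative integer" >&2
            return 1
            ;;
    esac
    printf '%s\n' \
        '# prevent kipmi from consuming 100% CPU' \
        "options ipmi_si kipmid_max_busy_us=$value" > "$dest"
}
```

Run as root, write_kipmid_conf 10 recreates the file shown above. A file in /etc/modprobe.d is read when the ipmi_si module is loaded, so the echo command is still needed to change the value on a running system.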
If you need to disable IPMI for netdata, edit /etc/netdata/netdata.conf and set:
[plugins]
freeipmi = no