Monitoring the HEALTH of an IBM Storage Scale System with Grafana dashboard - IBM/ibm-spectrum-scale-bridge-for-grafana GitHub Wiki

Starting with version 5.2.2, the IBM Storage Scale performance monitoring tool includes a new sensor called GPFSmmhealth. This sensor collects health state metrics for various GPFS cluster components every 30 seconds. These metrics provide a numerical representation of the states defined for "mmhealth", the command-line interface used to monitor the health of a node and the services hosted on that node in IBM Storage Scale. The exact mapping between states and numerical values can be found in the GPFSmmhealth sensor documentation available in the IBM Storage Scale Knowledge Center.

Example dashboards in the gpfs cluster health overview sub-folder let you monitor the aggregated health state of each cluster node, drill down to the component health state view of a particular node, and then go deeper to view the health of individual entities within a component.

/gifs/gpfs_cluster_health_status_overview.gif

The 'gpfs cluster health overview' example dashboards are available for both the OpenTSDB and the Prometheus Datasource types. Drill-down works through hyperlinks between the individual dashboards: “GPFS Cluster overview”, “Component health view by node selection” and “Component entities health details”, which are based on metadata selection. For this reason, you must import all three dashboards, and they must be the ones for the Datasource type with which the Grafana Bridge is registered in Grafana.
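If you prefer to script the import instead of using the Grafana UI, the following is a minimal sketch that uploads all three dashboards through the Grafana HTTP API endpoint `POST /api/dashboards/db`. It is an illustration under assumptions, not part of the bridge: the file names in `DASHBOARD_FILES` are hypothetical placeholders for the JSON files of your Datasource type, the Grafana URL and service account token are supplied by you, and the dashboard JSON is assumed to need no data source substitution during import.

```python
import json
import urllib.request
from pathlib import Path
from typing import Optional

# Hypothetical file names (assumption): replace them with the actual JSON
# files from the example dashboards folder for your Datasource type.
DASHBOARD_FILES = [
    "gpfs_cluster_overview.json",
    "component_health_view_by_node_selection.json",
    "component_entities_health_details.json",
]


def build_payload(dashboard: dict, folder_uid: Optional[str] = None) -> dict:
    """Wrap a dashboard model in the request body for POST /api/dashboards/db."""
    model = dict(dashboard)
    model["id"] = None  # let Grafana assign its own numeric id; the uid is kept
    payload = {"dashboard": model, "overwrite": True}
    if folder_uid:
        payload["folderUid"] = folder_uid
    return payload


def import_dashboard(base_url: str, token: str, path: str,
                     folder_uid: Optional[str] = None) -> dict:
    """Upload one dashboard JSON file and return Grafana's JSON response."""
    dashboard = json.loads(Path(path).read_text(encoding="utf-8"))
    body = json.dumps(build_payload(dashboard, folder_uid)).encode("utf-8")
    request = urllib.request.Request(
        base_url.rstrip("/") + "/api/dashboards/db",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + token,
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def import_all(base_url: str, token: str, directory: str,
               folder_uid: Optional[str] = None) -> list:
    """Import all three dashboards so the drill-down links between them resolve."""
    return [
        import_dashboard(base_url, token, str(Path(directory) / name), folder_uid)
        for name in DASHBOARD_FILES
    ]
```

A call such as `import_all("http://localhost:3000", token, "path/to/dashboards")` would then upload the three files in one pass; the URL and directory here are placeholders. The sketch keeps each dashboard's `uid` unchanged on the assumption that the hyperlinks between the dashboards may depend on it.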