
# Why Grafanizer?

First, a bit of vocabulary. Within cloud monitoring, there exist entities. Entities have checks. Checks have metrics.

Consider a situation where 4 entities all run a cpu check which reports a cpu average metric. If I wanted to graph the metric using Grafana, I would add a query to the dashboard similar to:

rackspace.monitoring.entities.*.checks.agent.cpu.*.usage_average

This would give me 4 lines on my graph, each unhelpfully labeled as:

rackspace.monitoring.entities.enaaaaaaaa.checks.agent.cpu.chaaaaaaaa.usage_average
rackspace.monitoring.entities.enbbbbbbbb.checks.agent.cpu.chbbbbbbbb.usage_average
rackspace.monitoring.entities.encccccccc.checks.agent.cpu.chcccccccc.usage_average
rackspace.monitoring.entities.endddddddd.checks.agent.cpu.chdddddddd.usage_average

We could clean it up a bit by using the aliasByNode() function:

aliasByNode(rackspace.monitoring.entities.*.checks.agent.cpu.*.usage_average, 3)

Our lines would then be labeled as:

enaaaaaaaa
enbbbbbbbb
encccccccc
endddddddd

That is a little cleaner, but still unhelpful. The values above are entity ids. Entities also have labels. Labels tend to be more user-friendly, but they are generally not available through the query itself.
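To see why node index 3 yields the entity id, here is a rough Python emulation of how a label is picked out of a dotted metric path. It is only a sketch of the idea, not Graphite's implementation, and the helper name `alias_by_node` is made up for illustration.

```python
def alias_by_node(series_name, *nodes):
    """Pick the given zero-based nodes out of a dotted metric path.

    Illustrative helper only; it mimics the labeling idea behind
    aliasByNode() and is not part of Graphite or Grafanizer.
    """
    parts = series_name.split(".")
    return ".".join(parts[n] for n in nodes)


series = (
    "rackspace.monitoring.entities.enaaaaaaaa"
    ".checks.agent.cpu.chaaaaaaaa.usage_average"
)

# Node 0 is "rackspace", node 1 is "monitoring", node 2 is "entities",
# so node 3 is the entity id.
label = alias_by_node(series, 3)
print(label)
```

The metric path only carries ids, so no choice of node index can produce a friendly name.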

Grafanizer works by gathering pieces of information, such as entity labels, and making that information available to a template. The template can use this information to create query strings in the dashboard, such as:

alias(rackspace.monitoring.entities.enaaaaaaaa.checks.agent.cpu.chaaaaaaaa.usage_average, 'web-server-001')
alias(rackspace.monitoring.entities.enbbbbbbbb.checks.agent.cpu.chbbbbbbbb.usage_average, 'web-server-002')
alias(rackspace.monitoring.entities.encccccccc.checks.agent.cpu.chcccccccc.usage_average, 'web-server-003')
alias(rackspace.monitoring.entities.endddddddd.checks.agent.cpu.chdddddddd.usage_average, 'web-server-004')

which would then graph the same 4 lines, but each would be labeled as 'web-server-00n'.
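The idea of turning gathered labels into alias() queries can be sketched in a few lines of Python. This is a minimal illustration using plain string formatting, not Grafanizer's actual template mechanism; the `ENTITIES` data, its field names, and `build_cpu_queries` are all hypothetical.

```python
# Hypothetical data. In practice the ids and labels would be gathered
# from cloud monitoring rather than hard-coded.
ENTITIES = [
    {"id": "enaaaaaaaa", "check": "chaaaaaaaa", "label": "web-server-001"},
    {"id": "enbbbbbbbb", "check": "chbbbbbbbb", "label": "web-server-002"},
    {"id": "encccccccc", "check": "chcccccccc", "label": "web-server-003"},
    {"id": "endddddddd", "check": "chdddddddd", "label": "web-server-004"},
]

# Query pattern taken from the examples above, with placeholders for
# the entity id, check id, and label.
QUERY = (
    "alias(rackspace.monitoring.entities.{id}"
    ".checks.agent.cpu.{check}.usage_average, '{label}')"
)


def build_cpu_queries(entities):
    """Return one alias() query string per entity."""
    return [QUERY.format(**entity) for entity in entities]


queries = build_cpu_queries(ENTITIES)
for query in queries:
    print(query)
```

Each generated query names exactly one entity and check, so the wildcard is no longer needed and the label can be attached per line.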

Furthermore, Grafanizer allows us to group types of entities together. We can look at cpu metrics of only infrastructure nodes and ignore compute nodes if we wish.
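Grouping can be pictured as a filter applied before any queries are generated. The sketch below assumes each entity carries a `group` field; that field, the sample labels, and the function name are assumptions made for illustration, and how Grafanizer actually decides an entity's type is not shown here.

```python
# Hypothetical entities with an assumed "group" field.
ENTITIES = [
    {"id": "enaaaaaaaa", "label": "infra-001", "group": "infrastructure"},
    {"id": "enbbbbbbbb", "label": "infra-002", "group": "infrastructure"},
    {"id": "encccccccc", "label": "compute-001", "group": "compute"},
    {"id": "endddddddd", "label": "compute-002", "group": "compute"},
]


def entities_in_group(entities, group):
    """Keep only the entities that belong to the requested group."""
    return [entity for entity in entities if entity["group"] == group]


infra = entities_in_group(ENTITIES, "infrastructure")
for entity in infra:
    print(entity["id"], entity["label"])
```

Only the filtered entities would then be handed on to query generation, so compute nodes never appear on the infrastructure graph.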