Add an option to evaluate each dimension separately in a clustered service with "Best of" logic
A service has n dimensions. In cluster mode, a group of dimensions on node A may be in one state (some of them CRIT, for example), while a different group of dimensions on node B may be in another state (some CRIT, but not the same ones as on node A). Currently, each node is first evaluated individually as WARN or CRIT, and only then is the aggregation mode applied.
In certain clusters you would expect the aggregation mode 'Best' to apply to each dimension separately. Grouping such dimensions into one service therefore seems arbitrary and currently prevents correct cluster modelling.
The option I am asking for is an aggregation mode ruleset setting along the lines of: 'apply the aggregation mode to the service's dimensions and not to the service's total state'.
An example
Assume we have one service "FooBar" that monitors Foo and Bar, and the following outputs for two nodes in an active-active cluster:
Node1: Foo: alright, Bar: too cold (CRIT)
Node2: Foo: too hot (CRIT), Bar: alright
The cluster should be OK.
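To make the difference concrete, here is a minimal sketch that compares the two orders of evaluation on the FooBar example. It is not Checkmk code: the `State` enum, the function names and the dictionary layout are assumptions made for illustration, and the state ordering is simplified to OK < WARN < CRIT.

```python
from enum import IntEnum


class State(IntEnum):
    # Simplified ordering, assumed for this sketch only.
    OK = 0
    WARN = 1
    CRIT = 2


def node_state(dimensions):
    """A node's service state is the worst state of its dimensions."""
    return max(dimensions.values())


def best_of_nodes(nodes):
    """Current behaviour: reduce each node to one state, then take the best."""
    return min(node_state(dims) for dims in nodes.values())


def best_of_dimensions(nodes):
    """Requested behaviour: take the best state per dimension across nodes,
    then report the worst of those per-dimension results."""
    names = set().union(*(dims.keys() for dims in nodes.values()))
    per_dimension = {
        name: min(dims[name] for dims in nodes.values() if name in dims)
        for name in names
    }
    return max(per_dimension.values())


nodes = {
    "Node1": {"Foo": State.OK, "Bar": State.CRIT},
    "Node2": {"Foo": State.CRIT, "Bar": State.OK},
}

print(best_of_nodes(nodes).name)       # CRIT: both nodes are CRIT as a whole
print(best_of_dimensions(nodes).name)  # OK: each dimension is OK on some node
```

The sketch assumes that dimensions carry a stable name that is identical on every node; without such an identifier the per-dimension comparison cannot be done.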
Comments: 2
18 Jan
Robert Sander: I am afraid that this is technically not possible at all.
A service check has always just one state. The cluster aggregation can only operate on the final service state.
What you see as "dimensions" are just hints from the check implementation on how it came to the final state. They carry no identification, and therefore they cannot be compared to the "dimensions" from another host's check.
19 Mar
Fabio: I support Biagio's suggestion.
The service state should take the per-“dimension” results into account.
This should be done for “Worst of” too.
Example:
Node1: Foo: alright (value 5), Bar: too cold (value 31235)(CRIT)
Node2: Foo: too hot (value 5123)(CRIT), Bar: alright (value 2)
With “Best of” aggregation the cluster should be OK.
The reported measures should be:
Foo: alright (value 5)
Bar: alright (value 2)
With “Worst of” aggregation the cluster should be CRIT.
The reported measures should be:
Bar: too cold (value 31235)(CRIT)
Foo: too hot (value 5123)(CRIT)
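The selection described in this example can be sketched as follows. This is only an illustration under assumptions: `Reading`, `aggregate` and `cluster_state` are hypothetical names, not part of any Checkmk API, states are reduced to 0 (OK) and 2 (CRIT), and every dimension is assumed to have the same name on each node.

```python
from typing import NamedTuple


class Reading(NamedTuple):
    state: int    # 0 = OK, 2 = CRIT (simplified for this sketch)
    value: float
    text: str


def aggregate(nodes, mode):
    """Pick one reading per dimension: the best or the worst across nodes."""
    if mode not in ("best", "worst"):
        raise ValueError("mode must be 'best' or 'worst'")
    pick = min if mode == "best" else max
    names = sorted({name for dims in nodes.values() for name in dims})
    return {
        name: pick(
            (dims[name] for dims in nodes.values() if name in dims),
            key=lambda reading: reading.state,
        )
        for name in names
    }


def cluster_state(per_dimension):
    """The cluster state is the worst state among the selected readings."""
    return max(reading.state for reading in per_dimension.values())


nodes = {
    "Node1": {
        "Foo": Reading(0, 5, "alright"),
        "Bar": Reading(2, 31235, "too cold"),
    },
    "Node2": {
        "Foo": Reading(2, 5123, "too hot"),
        "Bar": Reading(0, 2, "alright"),
    },
}

best = aggregate(nodes, "best")
worst = aggregate(nodes, "worst")

print(cluster_state(best), best["Foo"].value, best["Bar"].value)     # 0 5 2
print(cluster_state(worst), worst["Foo"].value, worst["Bar"].value)  # 2 5123 31235
```

The reported value always comes from the same node as the selected state, so the measure and the state of a dimension stay consistent with each other.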