Event History State and Value Changes
For local checks, Checkmk records events only when the state of a service changes, and the event history is built up from these state changes.
In any fault finding, the sequence of events is important, as are the values of other services over the same period. More often than not, the service that went CRIT is the end product of other events, and it would be great to know the status and values of other services in the period prior to the CRIT.
I'm proposing a capability (OFF by default) that lets an individual service be configured to generate more low-level event history, triggered by either a state change or a change in the value of the service.
See attached example.
In the current system we get only one record, indicating that Room temp went from OK to CRIT. We don't get any history for the cooling service, so we cannot see that the cooling never came on, because its state was always OK.
Comments: 4
-
23 Aug, '22
Marcel Arentz (Admin)
I guess this is already solvable with the Business Intelligence of Checkmk?
-
12 Nov, '22
Thomas Lippert (Admin)
Hi Ray,
some questions about your feature request. First, it sounds like you need a metric for this value, which would give you a graph showing the long-term trend. The other part of your description sounds more like you need service dependencies, so that the temperature also depends on, e.g., the ventilation, which is monitored as another service. Can you provide more insight?
Thomas
-
03 Apr, '23
Raymond Donohoe
Thomas, I used the Room Temperature example above for familiarity, but I will try to describe our real-world case. We monitor lighthouses with Checkmk, and one of the major events is the light going on and off at particular times of the day. At the moment, if anyone asks "when did the light come on?", we can only get this information by hovering over the history in the service graph. We do not get any entries in the service event history, because there was no change in state for the service. The Light status service stays OK, but the value of the service toggles between 1 and 0 to represent the light being on or off. To help with fault finding, we want some of the OK->OK transitions recorded in the service event history, based on the change of value rather than the change in state. I understand this could potentially generate massive amounts of data, which is why it would only be turned on for certain services. I have a document done on this.
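To make the request concrete, here is a rough Python sketch of the behaviour I have in mind. This is not an existing Checkmk feature or API: the `Event` class, the `build_history` function and the `log_value_changes` switch are all hypothetical names, and the samples are made-up data for the light on/off case.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Event:
    """One hypothetical entry in a service event history."""
    timestamp: int
    old_state: str
    new_state: str
    old_value: float
    new_value: float


def build_history(
    samples: Iterable[Tuple[int, str, float]],
    log_value_changes: bool = False,  # hypothetical per-service switch, OFF by default
) -> List[Event]:
    """Turn (timestamp, state, value) check results into event records.

    State changes are always recorded. Value changes are recorded only
    when log_value_changes is enabled for the service.
    """
    events: List[Event] = []
    previous = None  # last (state, value) seen, None before the first sample
    for timestamp, state, value in samples:
        if previous is not None:
            prev_state, prev_value = previous
            state_changed = state != prev_state
            value_changed = value != prev_value
            if state_changed or (log_value_changes and value_changed):
                events.append(Event(timestamp, prev_state, state, prev_value, value))
        previous = (state, value)
    return events


# Made-up samples: the light service stays OK while its value toggles 0 -> 1 -> 0.
samples = [(0, "OK", 0), (60, "OK", 1), (120, "OK", 1), (180, "OK", 0)]

default_history = build_history(samples)
detailed_history = build_history(samples, log_value_changes=True)
```

With the switch off, `default_history` is empty, which matches what we see today. With it on, `detailed_history` holds one OK->OK record for the light coming on and one for it going off.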
-
18 Oct, '23
Martin Hirschvogel (Admin)
Hello,
Thank you for your idea. On this portal, we carefully evaluate ideas to ensure that they will benefit a wide range of users. Thus, we close ideas not fulfilling certain criteria:
- Suggestions with low user interest: created more than 1 year ago with 5 votes or less
- Suggestions with no momentum: no votes in the last 6 months
Unfortunately, this suggestion doesn't meet these criteria, so we’re closing it (based on the data available until 2023-10-17). We appreciate your contribution and encourage you to continue to share your ideas. Your input plays a vital role in helping us improve our product for everyone.
Thank you for your understanding and continued support!
Warm regards,
Your Checkmk Team