Make Linux agent configurable regarding waitmax time for livestatus queries
The Linux agent automatically detects CMK sites on the local system and runs several livestatus queries against them. For these queries, it uses a fixed maximum time (in seconds) to wait for a response.
If, for example, the OMD.*performance service reports "site currently not running", the cause may simply be that no response came back within 3s (the command is started with "waitmax 3").
If this wait time were configurable (manually as well as in the agent bakery), the user could set a different value to reduce the number of situations where the agent thinks the site is not running:
* default value if not set: 3s (as it is now)
* possibly with an upper limit, say 10s
* stored in an agent configuration file in /etc/check_mk/ and used by the agent, e.g. "waitmax $WAITMAX_SECONDS" instead of "waitmax 3"
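To make the idea more concrete, here is a minimal sketch of how the agent could resolve the value. It is not taken from the actual agent code: the function name resolve_waitmax, the variables WAITMAX_DEFAULT and WAITMAX_LIMIT, and the file name waitmax.cfg are all assumptions for illustration. Only the 3s default, the suggested 10s limit, and the /etc/check_mk/ directory come from the proposal above.

```shell
#!/bin/bash
# Sketch only: every name below is hypothetical, not part of the real agent.
WAITMAX_DEFAULT=3    # current hard-coded value ("waitmax 3")
WAITMAX_LIMIT=10     # upper limit suggested in the proposal

# Print the wait time to use for a given (possibly empty) setting.
resolve_waitmax() {
    local value="${1:-}"
    # Unset or non-numeric input falls back to the default.
    case "$value" in
        ''|*[!0-9]*) echo "$WAITMAX_DEFAULT"; return 0 ;;
    esac
    if [ "$value" -lt 1 ]; then
        echo "$WAITMAX_DEFAULT"      # zero makes no sense as a timeout
    elif [ "$value" -gt "$WAITMAX_LIMIT" ]; then
        echo "$WAITMAX_LIMIT"        # clamp to the upper limit
    else
        echo "$value"
    fi
}

# Assumed config file; it would contain a line such as WAITMAX_SECONDS=5.
WAITMAX_CONFIG="/etc/check_mk/waitmax.cfg"
if [ -r "$WAITMAX_CONFIG" ]; then
    . "$WAITMAX_CONFIG"
fi
WAITMAX_SECONDS="$(resolve_waitmax "${WAITMAX_SECONDS:-}")"

# The agent would then call: waitmax "$WAITMAX_SECONDS" <livestatus query>
```

With no configuration file present, WAITMAX_SECONDS ends up as 3, so the behaviour stays as it is today. A bakery rule would only need to write the one-line file.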
I'm open to any suggestions and/or discussions...
Comments: 5
-
28 Jun, '22
Lars Sörensen
Please move the CMK/OMD monitoring part to a dedicated special agent. It makes no sense to bloat the Linux agent with dead weight and distribute it to all Linux servers only to be able to monitor a CMK site automatically.
-
29 Jun, '22
Daniel Roettgermann
Consider a special agent or a plugin - both are better than rolling it out by default to every monitored server.
Why are there plugins? Because not every server has this type of application to monitor... Not every server is an OMD/CMK server.
Keep it small/clean and simple.
-
29 Jun, '22
Robin Gierse
Please do keep in mind that those livestatus queries are part of the agent, not plugins. They are only run if the 'omd' command can be found on the system, and they can be disabled through a rule.
Additionally, Checkmk needs that information to detect whether a server is a Checkmk server, so it can handle it accordingly (think of the default dashboards and so on).
If you do not want those services to be discovered, simply disable them through 'Disabled sections'.
-
20 Jul, '22
Daniel Roettgermann
In my opinion it needs to be completely reworked - not only fix the waitmax problem, but also remove it from the main agent, or at least make it run in parallel from the beginning, so it does not delay the overall agent runtime.
https://features.checkmk.com/suggestions/306004/let-unix-agents-execute-sectionspluginslocal-checks-in-parallel
@Robin
Why ship code to 5000 servers if there is no need? You know all your CMK servers, just like you know your Oracle servers where you need to deploy your Oracle plugin. This was one of the arguments in the beginning when code was moved from the main agent into dedicated plugins.
And of course we need the data for the CMK servers, but as written, you know your CMK servers like your Oracle servers.
-
18 Oct, '23
Martin Hirschvogel (Admin)
Hello,
Thank you for your idea. On this portal, we carefully evaluate ideas to ensure that they will benefit a wide range of users. Thus, we close ideas not fulfilling certain criteria:
- Suggestions with low user interest: created more than 1 year ago with 5 votes or less
- Suggestions with no momentum: no votes in the last 6 months
Unfortunately, this suggestion doesn't meet these criteria, so we’re closing it (based on the data available until 2023-10-17). We appreciate your contribution and encourage you to continue to share your ideas. Your input plays a vital role in helping us improve our product for everyone.
Thank you for your understanding and continued support!
Warm regards,
Your Checkmk Team