[DOC] Temporarily disable Kibana Rules #122573
🙏🏼 per #116017, adds insight on how to temporarily disable Kibana Rules for clusters which need breathing room.
Pinging @elastic/kibana-docs (Team:Docs)
Thanks for this Stef! I think we can improve this a bit more, with just a bit more text. And I'm not sure who's handling our docs review at this point, but I'll try to figure it out ...
--------------------------------------------------
xpack.task_manager.max_workers: 1
xpack.task_manager.poll_interval: 1m
--------------------------------------------------
I think we should extend with two things:
- what this is doing
- remember to reset it after resolving the problem!
what this is doing

Setting `xpack.task_manager.max_workers: 1` will limit each {kib} instance to claiming one task per polling cycle. The default is 10.
Setting `xpack.task_manager.poll_interval: 1m` sets the {kib} Task Manager polling cycle to 1 minute. The default is 3 seconds (`3s`).
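
As a rough sketch of how both points could be folded into the snippet itself, the settings might be annotated like this in `kibana.yml`. The values are the ones proposed in this PR. The defaults in the comments are the ones quoted in this comment and are not independently verified here, so treat them as assumptions and check the Task Manager settings page before relying on them.

```yaml
# Temporary troubleshooting overrides. Revert these once the problem is resolved.

# Claim at most one task per polling cycle on each Kibana instance
# (default as quoted in this review: 10).
xpack.task_manager.max_workers: 1

# Poll for tasks once per minute
# (default as quoted in this review: 3s).
xpack.task_manager.poll_interval: 1m
```

Keeping the revert reminder as the first comment means it travels with the settings if someone copies the block.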
This seems like a duplicate of what we already have in https://www.elastic.co/guide/en/kibana/master/task-manager-settings-kb.html
Perhaps instead we send them to those docs for more details, rather than explaining them here again?
I've added a link to the settings page in e9fc3a5. However, I agree that it's not clear from a glance what these recommended changes accomplish. And I think we need to reiterate that you should revert these changes when your troubleshooting is complete.
Thanks Lisa! 👍

A cluster may become unresponsive or sluggish if too many or expensive {kib} rules
are attempting to run. As a stop gap measure, you may consider temporarily overriding
the {kib} Task Manager to gain breathing room to resolve your situation by restarting
Is "breathing room" plain enough? It seems kind of American-ish, but I'm not sure. Maybe something more descriptive would be better anyway? Something like:
As a stop gap measure, you may consider temporarily overriding the {kib} Task Manager to have it run fewer tasks less frequently. This should provide some time to make changes to the system to resolve your situation. The relevant {kib} configuration keys are:
Thanks for putting this together Stef!
I agree, we should definitely avoid figures of speech like that.
Perhaps @gchaps can help us find a good way of expressing this?
I can take a look. IMO we can make this more concise with something like this:
To temporarily reduce the workload, you can change the Task Manager settings to run fewer tasks less frequently:
FYI: This new section was appearing in the Test connectors page, which I think must have been unintentional so I've moved it to the "Troubleshooting" page instead. If that's not correct, just let me know!
@@ -190,6 +190,22 @@ When diagnosing the health state of the task, you will most likely be interested

Investigating the underlying task can help you gauge whether the problem you’re seeing is rooted in the rule not running at all, whether it’s running and failing, or whether it is running, but exhibiting behavior that is different than what was expected (at which point you should focus on the rule itself, rather than the task).

[discrete]
[[alerting-kibana-disable]]
=== Temporarily disable
This title seems incomplete. What are we disabling?
=== Temporarily disable
=== Temporarily disable

[[alerting-kibana-disable]]
=== Temporarily disable

A cluster may become unresponsive or sluggish if too many or expensive {kib} rules
It is not obvious what an "expensive" rule is. I recommend removing that term or else adding information about what makes a particular rule expensive:
A cluster may become unresponsive or sluggish if too many or expensive {kib} rules
A cluster may become unresponsive or sluggish if too many {kib} rules

=== Temporarily disable

A cluster may become unresponsive or sluggish if too many or expensive {kib} rules
are attempting to run. As a stop gap measure, you may consider temporarily overriding
As a stop gap measure, you may consider temporarily overriding
As a novice user, it's unclear to me from this text why/when you'd want to perform these steps as opposed to the steps described in https://www.elastic.co/guide/en/kibana/master/create-and-manage-rules.html#controlling-rules. Can we provide more context?
Agreed, I feel like the framing here is more of a support perspective than that of the user.
Perhaps we can frame it like this (my phrasing is terrible, please do rephrase as needed):
If you are experiencing sluggishness in Kibana and wish to switch your rules off temporarily to ensure they are the cause, you can do so like this.
or something along those lines.
@elasticmachine merge upstream
💚 Build Succeeded
Closing in favor of #126869 🙏
👋🏼 @gchaps asked me to file a new PR since my last #122573 got too far behind.

## Summary

🙏🏼 per #116017, adds insight on how to temporarily disable Kibana Rules for clusters which need breathing room.

---------

Co-authored-by: Kibana Machine <[email protected]>
Co-authored-by: Lisa Cawley <[email protected]>

👋🏼 @gchaps asked me to file a new PR since my last #122573 got too far behind.

## Summary

🙏🏼 per #116017, adds insight on how to temporarily disable Kibana Rules for clusters which need breathing room.

---------

Co-authored-by: Kibana Machine <[email protected]>
Co-authored-by: Lisa Cawley <[email protected]>
(cherry picked from commit b1d6196)
# Backport

This will backport the following commits from `main` to `8.9`:
- [[DOCv2] Temporarily disable Kibana Rules (#126869)](#126869)

### Questions ?

Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Co-authored-by: Stef Nestor <[email protected]>
Preview
https://kibana_122573.docs-preview.app.elstc.co/guide/en/kibana/master/alerting-troubleshooting.html