[PoC] Attaining 10x alerting throughput (32,000 rules per minute) #182394
Closed
In this PoC, I made the improvements listed below to raise the alerting scalability ceiling (rules per minute) by at least 10x. The test scenario creates ES Query rules that run every minute against sample indices and never detect alerts.
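To put the target in perspective, here is a minimal sketch of the load arithmetic. It uses only the figures from the PR title (32,000 rules per minute, 10x) and the one-minute interval from the scenario; the baseline is merely what "10x" implies, not a measured number, and the function and variable names are hypothetical.

```typescript
// Figures taken from the PR title; the baseline is only implied by "10x".
const targetRulesPerMinute = 32000;
const scaleFactor = 10;
const ruleIntervalSeconds = 60; // every rule in the scenario runs once per minute

// Each rule runs once per interval, so sustained runs/second = rules / interval.
function executionsPerSecond(rules: number, intervalSeconds: number): number {
  return rules / intervalSeconds;
}

// Baseline implied by the 10x claim (an inference, not a measurement).
const impliedBaselineRulesPerMinute = targetRulesPerMinute / scaleFactor;

// Sustained rule executions per second at the target throughput.
const targetRunsPerSecond = executionsPerSecond(targetRulesPerMinute, ruleIntervalSeconds);

console.log(impliedBaselineRulesPerMinute);  // 3200
console.log(targetRunsPerSecond.toFixed(1)); // "533.3"
```

In other words, the target corresponds to roughly 533 rule executions per second that the task manager has to claim, run, and reschedule across all Kibana nodes.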

List of improvements
- [...] `.kibana_task_manager` and `.kibana_alerting_cases` index configurations to `3` shards
- [...] `50` from `10`
- [...] `500ms` from `3s`
- [...] `360` partitions
- [...] `360` partitions in a round-robin manner
- [...] `xpack.alerting.maxScheduledPerMinute` to `1000000` to increase the upper bound limit
- [...] `mget` as the task-claiming strategy
- [...] `dataViews` and `searchSourceClient` alerting rule executor services when not necessary (Lazy load dataViews and wrappedSearchSourceClient services when running alerting rules #184322)
- [...] `claiming` phase of tasks (Make the mget task claimer skip the `claiming` phase and update the task document directly to `running` #184739)

Test scenario
- [...] `1m` interval

Notes
- [...] `_has_privileges` API calls to Elasticsearch, we need to set the `.security` index settings to have `auto_expand_replicas: 0-all` so not only one node is capable of performing the requests
- [...] (`xpack.security.authc.api_key.cache.max_keys: 50000`)

Conclusion
These optimizations have shown that we can attain a 10x scale with the alerting system. However, during further testing, I was able to push the limits even further, attaining much more than 10x in various ES and Kibana configurations, confirming that this approach will break the horizontal scalability ceiling that we previously had.
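
As an illustration of the lazy-loading improvement (#184322), here is a minimal sketch of the general pattern: an expensive service is only constructed the first time a rule executor asks for it, and the result is reused afterwards. This is not the code from that PR. The `lazy` helper, the service shapes, and `buildExecutorServices` are hypothetical stand-ins, and the real Kibana services are considerably richer.

```typescript
// Hypothetical, simplified service shapes (assumptions for illustration only).
interface DataViewsService { kind: "dataViews"; }
interface SearchSourceClient { kind: "searchSourceClient"; }

// Wrap a factory so it runs at most once, and only on first use.
function lazy<T>(factory: () => T): () => T {
  let created = false;
  let cached: T | undefined;
  return () => {
    if (!created) {
      cached = factory();
      created = true;
    }
    return cached as T;
  };
}

// Counters that make it visible how often the expensive setup actually runs.
let dataViewsCreated = 0;
let searchSourceCreated = 0;

// Stand-ins for the expensive per-run service construction.
async function createDataViews(): Promise<DataViewsService> {
  dataViewsCreated += 1;
  return { kind: "dataViews" };
}

async function createSearchSourceClient(): Promise<SearchSourceClient> {
  searchSourceCreated += 1;
  return { kind: "searchSourceClient" };
}

interface ExecutorServices {
  getDataViews: () => Promise<DataViewsService>;
  getSearchSourceClient: () => Promise<SearchSourceClient>;
}

// Nothing is constructed here; each service is built on first access only.
function buildExecutorServices(): ExecutorServices {
  return {
    getDataViews: lazy(createDataViews),
    getSearchSourceClient: lazy(createSearchSourceClient),
  };
}
```

With this shape, a rule type that never calls `getDataViews()` or `getSearchSourceClient()` pays no setup cost for them, which matters when the same setup would otherwise repeat for every one of tens of thousands of rule runs per minute.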