[Task Manager] Add partitions to tasks and assign those task partitions to Kibana nodes #188758
Resolves #187698

## Summary

This PR does the following:

- Adds a new `partition` field to the task manager index
- Assigns a partition to a task if there is not one when creating or updating

### Checklist

- [ ] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios

### To verify

New tasks

- Create a rule and verify that the newly created tasks have the `partition` field

Old tasks

- Checkout main and create a new rule, let it run
- Stop kibana
- Checkout this branch and restart kibana
- Verify that the old tasks get updated with the `partition` field

ex. the query I use to look at the ES query rule task

```
POST .kibana_task_manager*/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "task.taskType": {
              "value": "alerting:.es-query"
            }
          }
        }
      ]
    }
  }
}
```

---------

Co-authored-by: kibanamachine <[email protected]>
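For the "Old tasks" check, it can help to search for tasks that have not yet received a partition. The following is a minimal sketch that builds such a search body by extending the example query with a `must_not` / `exists` clause. The helper name `tasksMissingPartition` is hypothetical, and the field path `task.partition` is an assumption made by analogy with `task.taskType`; adjust it if the mapping differs.

```typescript
// Hypothetical helper: builds a search body for tasks of one type that lack a partition.
// Assumption: the new field is stored at `task.partition`, by analogy with `task.taskType`.
type QueryClause = Record<string, unknown>;

interface SearchBody {
  query: { bool: { filter: QueryClause[]; must_not: QueryClause[] } };
}

function tasksMissingPartition(
  taskType: string,
  partitionField: string = 'task.partition'
): SearchBody {
  return {
    query: {
      bool: {
        // Same term filter as the example query in the PR description.
        filter: [{ term: { 'task.taskType': { value: taskType } } }],
        // Keep only documents where the partition field is absent.
        must_not: [{ exists: { field: partitionField } }],
      },
    },
  };
}

// Print a body that can be pasted after `POST .kibana_task_manager*/_search`.
console.log(JSON.stringify(tasksMissingPartition('alerting:.es-query'), null, 2));
```

If the old tasks have been updated as described, this search should eventually return no hits for the rule type.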
Resolves #187700

## Summary

This PR uses the discovery service to assign a subset of the partitions to each Kibana node, so that only two Kibana nodes compete for the same tasks.

### Checklist

- [ ] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios

### To verify

This change only applies to the mget claim strategy, so add the following to `kibana.yml`

```
xpack.task_manager.claim_strategy: 'unsafe_mget'
```

**Testing locally**

Old tasks

- Checkout main and create a new rule, let it run
- Stop kibana
- Checkout this branch and restart kibana
- Verify that the rule runs on its first run after restarting (when the task does not yet have a partition). It might be helpful to create a rule with a long interval and use run soon.

<details>
<summary>New tasks, but it might be easier to just test on cloud</summary>

- Start Kibana
- Replace this [line](https://github.com/elastic/kibana/pull/188368/files#diff-46ca6f79fdc2b69e1d6ddc2401eab6469f8dfb9521f93f90132de624a9693aa5R48) with the following

```
return [this.podName, 'w', 'x', 'y', 'z'];
```

- Create a few rules and check their partition values using the example query below:

```
POST .kibana_task_manager*/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "task.taskType": {
              "value": "alerting:.es-query"
            }
          }
        }
      ]
    }
  }
}
```

- Using the partition map that is expected to be generated for the current Kibana node, verify that tasks with partitions in the map run and tasks with partitions that are not in the map do not run.

```
[
  0, 2, 5, 7, 10, 12, 15, 17, 20, 22, 25, 27, 30, 32, 35, 37, 40, 42, 45, 47,
  50, 52, 55, 57, 60, 62, 65, 67, 70, 72, 75, 77, 80, 82, 85, 87, 90, 92, 95, 97,
  100, 102, 105, 107, 110, 112, 115, 117, 120, 122, 125, 127, 130, 132, 135, 137,
  140, 142, 145, 147, 150, 152, 155, 157, 160, 162, 165, 167, 170, 172, 175, 177,
  180, 182, 185, 187, 190, 192, 195, 197, 200, 202, 205, 207, 210, 212, 215, 217,
  220, 222, 225, 227, 230, 232, 235, 237, 240, 242, 245, 247, 250, 252, 255
]
```

</details>

**Testing on cloud**

- The PR has been deployed to cloud, and you can create multiple rules and verify that they all run. If for some reason they do not run, that means the nodes are not picking up their assigned partitions correctly.
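The expected partition list in the local test is consistent with a simple round-robin scheme, sketched below. This is an illustration under stated assumptions, not necessarily the code in the PR: partitions are assumed to run from 0 to 255, each partition is handed to two pods in sorted-name order, and the pod name `'a'` is a hypothetical stand-in for `this.podName` that is assumed to sort before `'w'`. The function name `assignPartitions` is also hypothetical.

```typescript
// Sketch only: round-robin assignment of partitions to pods, two pods per partition.
const PODS_PER_PARTITION = 2; // from the PR summary: only two nodes compete for a task
const PARTITION_COUNT = 256; // assumption: partitions run 0..255, as in the list above

function assignPartitions(
  podName: string,
  podNames: string[],
  partitionCount: number = PARTITION_COUNT
): number[] {
  // Every node must derive the same ordering, so sort a copy of the names.
  const sorted = [...podNames].sort();
  const mine: number[] = [];
  let slot = 0;
  for (let partition = 0; partition < partitionCount; partition++) {
    for (let i = 0; i < PODS_PER_PARTITION; i++) {
      const owner = sorted[slot % sorted.length];
      // Skip duplicates, which occur when there are fewer pods than slots per partition.
      if (owner === podName && mine[mine.length - 1] !== partition) {
        mine.push(partition);
      }
      slot++;
    }
  }
  return mine;
}

// 'a' is a hypothetical pod name assumed to sort first among the five.
const partitions = assignPartitions('a', ['a', 'w', 'x', 'y', 'z']);
console.log(partitions.slice(0, 6)); // [ 0, 2, 5, 7, 10, 12 ]
```

With five pods and two owners per partition, each pod ends up with roughly two fifths of the partitions, which is why the expected list skips in steps of 2 and 3.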
/ci
💚 Build Succeeded
Pinging @elastic/response-ops (Team:ResponseOps)
Core code changes (new mappings) are identical to those in #188001 and LGTM.
Resolves #187700
Resolves #187698
## Summary

This is a feature branch PR to main. It merges the following PRs, which have already been approved: #188001 and #188368.