Rollout client-side updates w/ mget as the default task claimer #181327

Closed
2 tasks
mikecote opened this issue Apr 22, 2024 · 2 comments
Labels
Feature:Task Manager Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams)

Comments

@mikecote
Contributor

Part of #155770

Blocked on https://github.com/elastic/response-ops-team/issues/137

Feature Description

Change the Task Manager task-claiming algorithm to use a _search to retrieve candidate tasks, a _mget to prune the docs whose version number no longer matches, and then a _bulk to claim the remaining tasks. This will increase the background task capacity in Serverless.
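
The three-step flow described above can be sketched roughly as follows. This is a minimal illustration, not Task Manager's actual implementation: `FakeTaskStore`, `claimTasks`, the `version` field, and the `claimByOtherNode` helper are all hypothetical names, and an in-memory map stands in for Elasticsearch. The one behavior it relies on is that search only sees refreshed data while a get by ID is real-time by default, which is why the mget step can catch stale candidates.

```typescript
// Hypothetical task document; `version` stands in for whatever
// concurrency token (e.g. seq_no/primary_term) the real docs carry.
interface TaskDoc {
  id: string;
  version: number;
  status: "idle" | "claiming";
  ownerId: string | null;
}

// In-memory stand-in for the task index (an assumption for illustration).
class FakeTaskStore {
  private live = new Map<string, TaskDoc>();       // what mget/bulk see
  private searchable = new Map<string, TaskDoc>(); // what search sees

  constructor(initial: TaskDoc[]) {
    for (const d of initial) this.live.set(d.id, { ...d });
    this.refresh();
  }

  // Copy live docs into the searchable snapshot, like an index refresh.
  refresh(): void {
    this.searchable = new Map(
      [...this.live].map(([id, d]): [string, TaskDoc] => [id, { ...d }])
    );
  }

  // Step 1 stand-in: candidates come from the (possibly stale) snapshot.
  search(size: number): TaskDoc[] {
    return [...this.searchable.values()]
      .filter((d) => d.status === "idle")
      .slice(0, size)
      .map((d) => ({ ...d }));
  }

  // Step 2 stand-in: real-time lookup by ID.
  mget(ids: string[]): Array<TaskDoc | undefined> {
    return ids.map((id) => {
      const d = this.live.get(id);
      return d ? { ...d } : undefined;
    });
  }

  // Step 3 stand-in: each item is applied only if its version still matches.
  bulkClaim(items: { id: string; expectedVersion: number }[], ownerId: string): string[] {
    const claimed: string[] = [];
    for (const { id, expectedVersion } of items) {
      const d = this.live.get(id);
      if (d && d.status === "idle" && d.version === expectedVersion) {
        this.live.set(id, { ...d, status: "claiming", ownerId, version: d.version + 1 });
        claimed.push(id);
      }
    }
    return claimed;
  }

  // Simulates another node claiming a task before the next refresh.
  claimByOtherNode(id: string): void {
    this.bulkClaim([{ id, expectedVersion: this.live.get(id)?.version ?? -1 }], "other-node");
  }
}

function claimTasks(store: FakeTaskStore, ownerId: string, capacity: number): string[] {
  const candidates = store.search(capacity);               // 1. search
  const current = store.mget(candidates.map((c) => c.id)); // 2. mget
  const fresh = candidates.filter((c, i) => {
    const cur = current[i];
    return cur !== undefined && cur.version === c.version;  // prune stale docs
  });
  return store.bulkClaim(                                   // 3. bulk claim
    fresh.map((c) => ({ id: c.id, expectedVersion: c.version })),
    ownerId
  );
}

const demoStore = new FakeTaskStore([
  { id: "a", version: 1, status: "idle", ownerId: null },
  { id: "b", version: 1, status: "idle", ownerId: null },
  { id: "c", version: 1, status: "idle", ownerId: null },
]);
demoStore.claimByOtherNode("b"); // "b" changes, but search has not refreshed yet
const demoClaimed = claimTasks(demoStore, "this-node", 10);
console.log(demoClaimed); // "b" is pruned by the mget step
```

In this sketch the pruning happens on the client, so a stale candidate costs only a cheap lookup rather than a failed write in the bulk request.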

Business Value

Increased background task capacity, reducing the COGS for running alerting rules and actions, and providing a lower MTTD/MTTR.

Rollout strategy (with bake periods in between)

  1. Turn on for select internal serverless projects
  2. Turn on for all canary serverless projects
  3. Turn on for select non-canary serverless projects
  4. Turn on for all serverless projects
  5. Turn on for ESS and on-prem
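
One common way to implement a staged rollout like the one above is deterministic percentage bucketing: hash a stable identifier and enable the feature when the hash falls under the current stage's threshold. The sketch below is an assumption for illustration only; the issue does not say how projects or clusters are selected, and `isInRollout`, `hashString`, and the example ID are hypothetical.

```typescript
// FNV-1a-style 32-bit hash over UTF-16 code units; used only for bucketing.
function hashString(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Returns true when `id` falls inside the first `percent` of 10,000 buckets.
// The same id always lands in the same bucket, so raising `percent` between
// stages only ever adds deployments; it never removes one already enabled.
function isInRollout(id: string, percent: number): boolean {
  const bucket = hashString(id) % 10000;
  return bucket < percent * 100;
}

// Hypothetical deployment ID, checked against two example thresholds.
const exampleId = "deployment-1234";
console.log(isInRollout(exampleId, 12.5), isInRollout(exampleId, 100));
```

Because the bucket depends only on the identifier, each bake period observes the same set of deployments plus whatever the higher threshold newly admits.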

Definition of Done

  • Rollout is done gradually
  • Client-side updates w/ mget to prune stale docs is the default way tasks are claimed
@mikecote mikecote added blocked Feature:Task Manager Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) labels Apr 22, 2024
@elasticmachine
Contributor

Pinging @elastic/response-ops (Team:ResponseOps)

@mikecote
Contributor Author

Rollout to serverless is complete. We are onboarding 12.5% of ECH clusters in 8.16 + internal clusters, and by 8.17 we'll have this enabled by default via #194625. Closing.


2 participants