[Response Ops][Alerting] Fixing bug with using runSoon on pre-8.x rule #142505

Merged
merged 4 commits into from
Oct 4, 2022

Conversation

ymao1
Contributor

@ymao1 ymao1 commented Oct 3, 2022

Resolves #142293

Summary

Updates the runSoon function to run the rule using the scheduled task ID instead of the rule ID. Using the rule ID causes issues for pre-8.x rules, where the scheduled task ID was randomly generated and did not match the rule ID. Also adds try/catch blocks to handle missing tasks more gracefully.
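
As a rough illustration of the ID mismatch described in the summary, here is a minimal sketch. The `RuleAttributes` shape, the `resolveTaskId` helper, and the example IDs are hypothetical and are not taken from the Kibana codebase; only the fallback expression mirrors the diff in this PR.

```typescript
// Hypothetical, simplified shape; not the actual Kibana rule attributes type.
interface RuleAttributes {
  scheduledTaskId?: string | null;
}

// Prefer the stored scheduled task ID and fall back to the rule ID when it is missing.
function resolveTaskId(ruleId: string, attributes: RuleAttributes): string {
  return attributes.scheduledTaskId ? attributes.scheduledTaskId : ruleId;
}

// Pre-8.x rule: the task ID was randomly generated and differs from the rule ID.
const legacyTaskId = resolveTaskId('rule-1', { scheduledTaskId: 'b3c0-random-task-id' });

// Rule whose task ID matches the rule ID.
const currentTaskId = resolveTaskId('rule-2', { scheduledTaskId: 'rule-2' });
```

Running the task by rule ID would look up `rule-1` for the first rule, which does not exist as a task document; that is the failure mode this PR addresses.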

To Verify

  1. Create a rule in a pre-8.x branch
  2. Using the same ES data, run this branch. Navigate to the Rules UI and select Run Rule from the menu dropdown. This should return a success toast saying that the rule has been scheduled to run.

Checklist

@ymao1 ymao1 changed the title from "Running task using scheduled task id. Adding functional test" to "[Response Ops][Alerting] Fixing bug with using runSoon on pre-8.x rule" Oct 3, 2022
@ymao1 ymao1 self-assigned this Oct 3, 2022
@ymao1 ymao1 added Feature:Alerting release_note:skip Skip the PR/issue when compiling release notes Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v8.5.0 v8.6.0 labels Oct 3, 2022
@ymao1 ymao1 marked this pull request as ready for review October 3, 2022 18:35
@ymao1 ymao1 requested a review from a team as a code owner October 3, 2022 18:35
@elasticmachine
Contributor

Pinging @elastic/response-ops (Team:ResponseOps)

? await this.taskManager.get(attributes.scheduledTaskId)
: null;
} catch (err) {
return i18n.translate('xpack.alerting.rulesClient.runSoon.getTaskError', {
Contributor

@XavierM XavierM Oct 3, 2022

Do we really need to return an error at this point? I am wondering whether we can just ignore attributes.scheduledTaskId and use id.

Contributor Author

If there is an error getting the task doc at this point, it likely means the task doc doesn't exist, so trying to run it later will result in an error as well. Returning at this point avoids us checking for the task again, which happens when you call taskManager.runSoon
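
The early return described in this reply can be sketched roughly as follows. `FakeTaskManager`, `runSoonSketch`, and the error message text are hypothetical stand-ins written for illustration; only the `get` and `runSoon` method names and the `scheduledTaskId ? scheduledTaskId : id` fallback come from the snippets in this PR.

```typescript
// Hypothetical in-memory stand-in for the task manager; only get and runSoon are modeled.
class FakeTaskManager {
  public getCalls = 0;
  private readonly tasks: Set<string>;

  constructor(tasks: Set<string>) {
    this.tasks = tasks;
  }

  async get(taskId: string): Promise<{ id: string }> {
    this.getCalls++;
    if (!this.tasks.has(taskId)) {
      throw new Error(`Task ${taskId} not found`);
    }
    return { id: taskId };
  }

  async runSoon(taskId: string): Promise<void> {
    // As noted in the thread, running a task checks for the task document again.
    await this.get(taskId);
  }
}

// Returns an error message on failure, or undefined when the run was scheduled.
async function runSoonSketch(
  taskManager: FakeTaskManager,
  ruleId: string,
  scheduledTaskId?: string | null
): Promise<string | undefined> {
  try {
    if (scheduledTaskId) {
      await taskManager.get(scheduledTaskId);
    }
  } catch (err) {
    // Return early: the task document is likely missing, so running it would fail too.
    return `Unable to run rule: ${(err as Error).message}`;
  }

  try {
    await taskManager.runSoon(scheduledTaskId ? scheduledTaskId : ruleId);
  } catch (err) {
    return `Unable to run rule: ${(err as Error).message}`;
  }
  return undefined;
}
```

With a missing task document, the sketch performs one lookup and returns a message instead of attempting the run and failing on a second lookup.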

Member

@pmuellr pmuellr left a comment

LGTM; left one question about using the alert id if the task id can't be obtained.

@@ -2989,7 +3011,16 @@ export class RulesClient {
       });
     }

-    await this.taskManager.runSoon(id);
+    try {
+      await this.taskManager.runSoon(attributes.scheduledTaskId ? attributes.scheduledTaskId : id);
Member

Is there a case where attributes.scheduledTaskId is falsy but id would actually be valid? It seems like it would work, but wondering how the rule could get in this state.

Contributor Author

Hmm... I guess this could have happened previously if something went wrong while the rule was being disabled: the rule SO gets updated with a null scheduledTaskId, but then deleting the associated document ends in an error. If that occurred, the user would be seeing a lot of "Rule ran after disabled" errors.
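
The state described in this reply can be modeled with a small sketch. Everything here is hypothetical: the `RuleSO` shape, the `disableRule` function, and the ordering of the two steps are assumptions made for illustration, not the actual Kibana disable flow.

```typescript
// Hypothetical in-memory model of a rule saved object (SO).
interface RuleSO {
  id: string;
  enabled: boolean;
  scheduledTaskId: string | null;
}

// Assumed flow: the rule SO is updated first, then the task document is removed.
function disableRule(rule: RuleSO, tasks: Set<string>, removeTaskFails: boolean): void {
  const taskId = rule.scheduledTaskId;
  rule.enabled = false;
  rule.scheduledTaskId = null; // rule SO now has a null scheduledTaskId
  if (taskId === null) {
    return;
  }
  if (removeTaskFails) {
    // The deletion errors out, so the task document is left behind.
    throw new Error(`Failed to remove task ${taskId}`);
  }
  tasks.delete(taskId);
}

const rule: RuleSO = { id: 'rule-1', enabled: true, scheduledTaskId: 'rule-1' };
const tasks = new Set<string>(['rule-1']);
let disableError: string | null = null;
try {
  disableRule(rule, tasks, true);
} catch (err) {
  disableError = (err as Error).message;
}

// scheduledTaskId is null, yet a task whose ID equals the rule ID still exists,
// so falling back to the rule ID would find a task document.
const fallbackId = rule.scheduledTaskId ? rule.scheduledTaskId : rule.id;
const fallbackIsValid = tasks.has(fallbackId);
```

In this model the leftover task keeps running against a disabled rule, which is consistent with the repeated errors mentioned in the reply.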

Contributor

@XavierM XavierM left a comment

Thank you for telling me the history around scheduledTaskId matching the rule ID.

@kibana-ci
Collaborator

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @ymao1

@ymao1 ymao1 merged commit e007ad6 into elastic:main Oct 4, 2022
@ymao1 ymao1 deleted the alerting/run-rule-bug branch October 4, 2022 03:28
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Oct 4, 2022
…ule (elastic#142505)

* Running task using scheduled task id. Adding functional test

* dont run if rule is disable

* Fixing i18n

(cherry picked from commit e007ad6)
@kibanamachine
Contributor

💚 All backports created successfully

Status Branch Result
8.5

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request Oct 4, 2022
…ule (#142505) (#142550)

* Running task using scheduled task id. Adding functional test

* dont run if rule is disable

* Fixing i18n

(cherry picked from commit e007ad6)

Co-authored-by: Ying Mao <[email protected]>
WafaaNasr pushed a commit to WafaaNasr/kibana that referenced this pull request Oct 11, 2022
…ule (elastic#142505)

* Running task using scheduled task id. Adding functional test

* dont run if rule is disable

* Fixing i18n
WafaaNasr pushed a commit to WafaaNasr/kibana that referenced this pull request Oct 14, 2022
…ule (elastic#142505)

* Running task using scheduled task id. Adding functional test

* dont run if rule is disable

* Fixing i18n
Successfully merging this pull request may close these issues.

Hit a "saved object not found" error while running a rule on UI after upgrade from 7.17.4