Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bug in calculating when ad-hoc tasks are out of attempts #192907

Merged
merged 1 commit into from
Sep 13, 2024

Conversation

mikecote
Copy link
Contributor

@mikecote mikecote commented Sep 13, 2024

In this PR, I'm fixing a bug where ad-hoc tasks would have one fewer attempts to retry in failure scenarios when using mget.

To verify

  1. Apply the following diff to your code
diff --git a/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts b/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts
index 0275b2bdc2f..d481c3820a1 100644
--- a/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts
+++ b/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts
@@ -77,6 +77,10 @@ export function getConnectorType(): ServerLogConnectorType {
 async function executor(
   execOptions: ServerLogConnectorTypeExecutorOptions
 ): Promise<ConnectorTypeExecutorResult<void>> {
+
+  console.log('*** Server log execution');
+  throw new Error('Fail');
+
   const { actionId, params, logger } = execOptions;

   const sanitizedMessage = withoutControlCharacters(params.message);
diff --git a/x-pack/plugins/task_manager/server/config.ts b/x-pack/plugins/task_manager/server/config.ts
index db07494ef4f..07e277f8d16 100644
--- a/x-pack/plugins/task_manager/server/config.ts
+++ b/x-pack/plugins/task_manager/server/config.ts
@@ -202,7 +202,7 @@ export const configSchema = schema.object(
       max: 100,
       min: 1,
     }),
-    claim_strategy: schema.string({ defaultValue: CLAIM_STRATEGY_UPDATE_BY_QUERY }),
+    claim_strategy: schema.string({ defaultValue: CLAIM_STRATEGY_MGET }),
     request_timeouts: requestTimeoutsConfig,
   },
   {
diff --git a/x-pack/plugins/task_manager/server/lib/get_retry_at.ts b/x-pack/plugins/task_manager/server/lib/get_retry_at.ts
index 278ba18642d..c8fb911d500 100644
--- a/x-pack/plugins/task_manager/server/lib/get_retry_at.ts
+++ b/x-pack/plugins/task_manager/server/lib/get_retry_at.ts
@@ -54,6 +54,7 @@ export function getRetryDate({
 }

 export function calculateDelayBasedOnAttempts(attempts: number) {
+  return 10 * 1000;
   // Return 30s for the first retry attempt
   if (attempts === 1) {
     return 30 * 1000;
  1. Create an always firing rule that runs every hour, triggering a server log on check intervals
  2. Let the rule run and observe the server log action running and failing three times (as compared to two)

@mikecote mikecote added release_note:skip Skip the PR/issue when compiling release notes Feature:Task Manager Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v9.0.0 backport:prev-minor Backport to (8.x) the previous minor version (i.e. one version back from main) v8.16.0 labels Sep 13, 2024
@mikecote mikecote self-assigned this Sep 13, 2024
@mikecote mikecote marked this pull request as ready for review September 13, 2024 17:21
@mikecote mikecote requested a review from a team as a code owner September 13, 2024 17:21
@elasticmachine
Copy link
Contributor

Pinging @elastic/response-ops (Team:ResponseOps)

@mikecote mikecote enabled auto-merge (squash) September 13, 2024 17:22
@kibana-ci
Copy link
Collaborator

💛 Build succeeded, but was flaky

Failed CI Steps

Metrics [docs]

✅ unchanged

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @mikecote

@mikecote mikecote merged commit 36eedc1 into elastic:main Sep 13, 2024
45 checks passed
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Sep 13, 2024
…#192907)

In this PR, I'm fixing a bug where ad-hoc tasks would have one fewer
attempts to retry in failure scenarios when using mget.

## To verify

1. Apply the following diff to your code
```
diff --git a/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts b/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts
index 0275b2bdc2f..d481c3820a1 100644
--- a/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts
+++ b/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts
@@ -77,6 +77,10 @@ export function getConnectorType(): ServerLogConnectorType {
 async function executor(
   execOptions: ServerLogConnectorTypeExecutorOptions
 ): Promise<ConnectorTypeExecutorResult<void>> {
+
+  console.log('*** Server log execution');
+  throw new Error('Fail');
+
   const { actionId, params, logger } = execOptions;

   const sanitizedMessage = withoutControlCharacters(params.message);
diff --git a/x-pack/plugins/task_manager/server/config.ts b/x-pack/plugins/task_manager/server/config.ts
index db07494ef4f..07e277f8d16 100644
--- a/x-pack/plugins/task_manager/server/config.ts
+++ b/x-pack/plugins/task_manager/server/config.ts
@@ -202,7 +202,7 @@ export const configSchema = schema.object(
       max: 100,
       min: 1,
     }),
-    claim_strategy: schema.string({ defaultValue: CLAIM_STRATEGY_UPDATE_BY_QUERY }),
+    claim_strategy: schema.string({ defaultValue: CLAIM_STRATEGY_MGET }),
     request_timeouts: requestTimeoutsConfig,
   },
   {
diff --git a/x-pack/plugins/task_manager/server/lib/get_retry_at.ts b/x-pack/plugins/task_manager/server/lib/get_retry_at.ts
index 278ba18642d..c8fb911d500 100644
--- a/x-pack/plugins/task_manager/server/lib/get_retry_at.ts
+++ b/x-pack/plugins/task_manager/server/lib/get_retry_at.ts
@@ -54,6 +54,7 @@ export function getRetryDate({
 }

 export function calculateDelayBasedOnAttempts(attempts: number) {
+  return 10 * 1000;
   // Return 30s for the first retry attempt
   if (attempts === 1) {
     return 30 * 1000;
```
2. Create an always firing rule that runs every hour, triggering a
server log on check intervals
3. Let the rule run and observe the server log action running and
failing three times (as compared to two)

(cherry picked from commit 36eedc1)
@kibanamachine
Copy link
Contributor

💔 Some backports could not be created

Status Branch Result
8.16 The branch "8.16" does not exist
8.x

Note: Successful backport PRs will be merged automatically after passing CI.

Manual backport

To create the backport manually run:

node scripts/backport --pr 192907

Questions ?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request Sep 13, 2024
…192907) (#192918)

# Backport

This will backport the following commits from `main` to `8.x`:
- [Fix bug in calculating when ad-hoc tasks are out of attempts
(#192907)](#192907)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Mike
Côté","email":"[email protected]"},"sourceCommit":{"committedDate":"2024-09-13T18:40:29Z","message":"Fix
bug in calculating when ad-hoc tasks are out of attempts (#192907)\n\nIn
this PR, I'm fixing a bug where ad-hoc tasks would have one
fewer\r\nattempts to retry in failure scenarios when using
mget.\r\n\r\n## To verify\r\n\r\n1. Apply the following diff to your
code\r\n```\r\ndiff --git
a/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts
b/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts\r\nindex
0275b2bdc2f..d481c3820a1 100644\r\n---
a/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts\r\n+++
b/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts\r\n@@
-77,6 +77,10 @@ export function getConnectorType():
ServerLogConnectorType {\r\n async function executor(\r\n execOptions:
ServerLogConnectorTypeExecutorOptions\r\n ):
Promise<ConnectorTypeExecutorResult<void>> {\r\n+\r\n+ console.log('***
Server log execution');\r\n+ throw new Error('Fail');\r\n+\r\n const {
actionId, params, logger } = execOptions;\r\n\r\n const sanitizedMessage
= withoutControlCharacters(params.message);\r\ndiff --git
a/x-pack/plugins/task_manager/server/config.ts
b/x-pack/plugins/task_manager/server/config.ts\r\nindex
db07494ef4f..07e277f8d16 100644\r\n---
a/x-pack/plugins/task_manager/server/config.ts\r\n+++
b/x-pack/plugins/task_manager/server/config.ts\r\n@@ -202,7 +202,7 @@
export const configSchema = schema.object(\r\n max: 100,\r\n min: 1,\r\n
}),\r\n- claim_strategy: schema.string({ defaultValue:
CLAIM_STRATEGY_UPDATE_BY_QUERY }),\r\n+ claim_strategy: schema.string({
defaultValue: CLAIM_STRATEGY_MGET }),\r\n request_timeouts:
requestTimeoutsConfig,\r\n },\r\n {\r\ndiff --git
a/x-pack/plugins/task_manager/server/lib/get_retry_at.ts
b/x-pack/plugins/task_manager/server/lib/get_retry_at.ts\r\nindex
278ba18642d..c8fb911d500 100644\r\n---
a/x-pack/plugins/task_manager/server/lib/get_retry_at.ts\r\n+++
b/x-pack/plugins/task_manager/server/lib/get_retry_at.ts\r\n@@ -54,6
+54,7 @@ export function getRetryDate({\r\n }\r\n\r\n export function
calculateDelayBasedOnAttempts(attempts: number) {\r\n+ return 10 *
1000;\r\n // Return 30s for the first retry attempt\r\n if (attempts ===
1) {\r\n return 30 * 1000;\r\n```\r\n2. Create an always firing rule
that runs every hour, triggering a\r\nserver log on check
intervals\r\n3. Let the rule run and observe the server log action
running and\r\nfailing three times (as compared to
two)","sha":"36eedc121bff8d83fc6a4590f468397f56d0bd14","branchLabelMapping":{"^v9.0.0$":"main","^v8.16.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","Feature:Task
Manager","Team:ResponseOps","v9.0.0","backport:prev-minor","v8.16.0"],"title":"Fix
bug in calculating when ad-hoc tasks are out of
attempts","number":192907,"url":"https://github.com/elastic/kibana/pull/192907","mergeCommit":{"message":"Fix
bug in calculating when ad-hoc tasks are out of attempts (#192907)\n\nIn
this PR, I'm fixing a bug where ad-hoc tasks would have one
fewer\r\nattempts to retry in failure scenarios when using
mget.\r\n\r\n## To verify\r\n\r\n1. Apply the following diff to your
code\r\n```\r\ndiff --git
a/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts
b/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts\r\nindex
0275b2bdc2f..d481c3820a1 100644\r\n---
a/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts\r\n+++
b/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts\r\n@@
-77,6 +77,10 @@ export function getConnectorType():
ServerLogConnectorType {\r\n async function executor(\r\n execOptions:
ServerLogConnectorTypeExecutorOptions\r\n ):
Promise<ConnectorTypeExecutorResult<void>> {\r\n+\r\n+ console.log('***
Server log execution');\r\n+ throw new Error('Fail');\r\n+\r\n const {
actionId, params, logger } = execOptions;\r\n\r\n const sanitizedMessage
= withoutControlCharacters(params.message);\r\ndiff --git
a/x-pack/plugins/task_manager/server/config.ts
b/x-pack/plugins/task_manager/server/config.ts\r\nindex
db07494ef4f..07e277f8d16 100644\r\n---
a/x-pack/plugins/task_manager/server/config.ts\r\n+++
b/x-pack/plugins/task_manager/server/config.ts\r\n@@ -202,7 +202,7 @@
export const configSchema = schema.object(\r\n max: 100,\r\n min: 1,\r\n
}),\r\n- claim_strategy: schema.string({ defaultValue:
CLAIM_STRATEGY_UPDATE_BY_QUERY }),\r\n+ claim_strategy: schema.string({
defaultValue: CLAIM_STRATEGY_MGET }),\r\n request_timeouts:
requestTimeoutsConfig,\r\n },\r\n {\r\ndiff --git
a/x-pack/plugins/task_manager/server/lib/get_retry_at.ts
b/x-pack/plugins/task_manager/server/lib/get_retry_at.ts\r\nindex
278ba18642d..c8fb911d500 100644\r\n---
a/x-pack/plugins/task_manager/server/lib/get_retry_at.ts\r\n+++
b/x-pack/plugins/task_manager/server/lib/get_retry_at.ts\r\n@@ -54,6
+54,7 @@ export function getRetryDate({\r\n }\r\n\r\n export function
calculateDelayBasedOnAttempts(attempts: number) {\r\n+ return 10 *
1000;\r\n // Return 30s for the first retry attempt\r\n if (attempts ===
1) {\r\n return 30 * 1000;\r\n```\r\n2. Create an always firing rule
that runs every hour, triggering a\r\nserver log on check
intervals\r\n3. Let the rule run and observe the server log action
running and\r\nfailing three times (as compared to
two)","sha":"36eedc121bff8d83fc6a4590f468397f56d0bd14"}},"sourceBranch":"main","suggestedTargetBranches":["8.x"],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/192907","number":192907,"mergeCommit":{"message":"Fix
bug in calculating when ad-hoc tasks are out of attempts (#192907)\n\nIn
this PR, I'm fixing a bug where ad-hoc tasks would have one
fewer\r\nattempts to retry in failure scenarios when using
mget.\r\n\r\n## To verify\r\n\r\n1. Apply the following diff to your
code\r\n```\r\ndiff --git
a/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts
b/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts\r\nindex
0275b2bdc2f..d481c3820a1 100644\r\n---
a/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts\r\n+++
b/x-pack/plugins/stack_connectors/server/connector_types/server_log/index.ts\r\n@@
-77,6 +77,10 @@ export function getConnectorType():
ServerLogConnectorType {\r\n async function executor(\r\n execOptions:
ServerLogConnectorTypeExecutorOptions\r\n ):
Promise<ConnectorTypeExecutorResult<void>> {\r\n+\r\n+ console.log('***
Server log execution');\r\n+ throw new Error('Fail');\r\n+\r\n const {
actionId, params, logger } = execOptions;\r\n\r\n const sanitizedMessage
= withoutControlCharacters(params.message);\r\ndiff --git
a/x-pack/plugins/task_manager/server/config.ts
b/x-pack/plugins/task_manager/server/config.ts\r\nindex
db07494ef4f..07e277f8d16 100644\r\n---
a/x-pack/plugins/task_manager/server/config.ts\r\n+++
b/x-pack/plugins/task_manager/server/config.ts\r\n@@ -202,7 +202,7 @@
export const configSchema = schema.object(\r\n max: 100,\r\n min: 1,\r\n
}),\r\n- claim_strategy: schema.string({ defaultValue:
CLAIM_STRATEGY_UPDATE_BY_QUERY }),\r\n+ claim_strategy: schema.string({
defaultValue: CLAIM_STRATEGY_MGET }),\r\n request_timeouts:
requestTimeoutsConfig,\r\n },\r\n {\r\ndiff --git
a/x-pack/plugins/task_manager/server/lib/get_retry_at.ts
b/x-pack/plugins/task_manager/server/lib/get_retry_at.ts\r\nindex
278ba18642d..c8fb911d500 100644\r\n---
a/x-pack/plugins/task_manager/server/lib/get_retry_at.ts\r\n+++
b/x-pack/plugins/task_manager/server/lib/get_retry_at.ts\r\n@@ -54,6
+54,7 @@ export function getRetryDate({\r\n }\r\n\r\n export function
calculateDelayBasedOnAttempts(attempts: number) {\r\n+ return 10 *
1000;\r\n // Return 30s for the first retry attempt\r\n if (attempts ===
1) {\r\n return 30 * 1000;\r\n```\r\n2. Create an always firing rule
that runs every hour, triggering a\r\nserver log on check
intervals\r\n3. Let the rule run and observe the server log action
running and\r\nfailing three times (as compared to
two)","sha":"36eedc121bff8d83fc6a4590f468397f56d0bd14"}},{"branch":"8.x","label":"v8.16.0","branchLabelMappingKey":"^v8.16.0$","isSourceBranch":false,"state":"NOT_CREATED"}]}]
BACKPORT-->

Co-authored-by: Mike Côté <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:prev-minor Backport to (8.x) the previous minor version (i.e. one version back from main) Feature:Task Manager release_note:skip Skip the PR/issue when compiling release notes Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v8.16.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants