Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bug in capacity warning logs when assumedAverageRecurringRequiredThroughputPerMinutePerKibana was sufficient #200578

Merged
merged 4 commits into from
Nov 22, 2024

Conversation

mikecote
Copy link
Contributor

@mikecote mikecote commented Nov 18, 2024

In this PR, I'm fixing a bug where the task manager unhealthy logs were showing the wrong reason. Because the if condition was checking assumedAverageRecurringRequiredThroughputPerMinutePerKibana to be less than capacityPerMinutePerKibana, it would log this as the reason task manager is healthy. However whenever the value is less, it means task manager is running fine from a recurring task perspective. Changing the if condition allows the next step in the code to log which then logs assumedRequiredThroughputPerMinutePerKibana as the real reason why task manager is unhealthy.

@mikecote mikecote added release_note:skip Skip the PR/issue when compiling release notes Feature:Task Manager Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v9.0.0 backport:version Backport to applied version labels v8.17.0 v8.16.1 labels Nov 18, 2024
@mikecote mikecote self-assigned this Nov 18, 2024
@@ -248,13 +248,13 @@ function getHealthStatus(
return { status: HealthStatus.OK, reason };
}

if (assumedAverageRecurringRequiredThroughputPerMinutePerKibana < capacityPerMinutePerKibana) {
const reason = `Task Manager is unhealthy, the assumedAverageRecurringRequiredThroughputPerMinutePerKibana (${assumedAverageRecurringRequiredThroughputPerMinutePerKibana}) < capacityPerMinutePerKibana (${capacityPerMinutePerKibana})`;
if (assumedAverageRecurringRequiredThroughputPerMinutePerKibana > capacityPerMinutePerKibana) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic was wrong, this should only log if we exceed capacityPerMinutePerKibana in this case.

logger.warn(reason);
return { status: HealthStatus.OK, reason };
}

const reason = `Task Manager is unhealthy, the assumedRequiredThroughputPerMinutePerKibana (${assumedRequiredThroughputPerMinutePerKibana}) >= capacityPerMinutePerKibana (${capacityPerMinutePerKibana}) AND assumedAverageRecurringRequiredThroughputPerMinutePerKibana (${assumedAverageRecurringRequiredThroughputPerMinutePerKibana}) >= capacityPerMinutePerKibana (${capacityPerMinutePerKibana})`;
const reason = `Task Manager is unhealthy, the assumedRequiredThroughputPerMinutePerKibana (${assumedRequiredThroughputPerMinutePerKibana}) >= capacityPerMinutePerKibana (${capacityPerMinutePerKibana})`;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the code above logs whenever something is wrong with assumedAverageRecurringRequiredThroughputPerMinutePerKibana, I'm removing it from the message here.

@mikecote mikecote marked this pull request as ready for review November 19, 2024 15:21
@mikecote mikecote requested a review from a team as a code owner November 19, 2024 15:21
@elasticmachine
Copy link
Contributor

Pinging @elastic/response-ops (Team:ResponseOps)

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

cc @mikecote

@mikecote mikecote merged commit 9fa8ec7 into elastic:main Nov 22, 2024
42 checks passed
@kibanamachine
Copy link
Contributor

Starting backport for target branches: 8.16, 8.17

https://github.com/elastic/kibana/actions/runs/11977346140

kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Nov 22, 2024
…ThroughputPerMinutePerKibana was sufficient (elastic#200578)

In this PR, I'm fixing a bug where the task manager unhealthy logs were
showing the wrong reason. Because the if condition was checking
`assumedAverageRecurringRequiredThroughputPerMinutePerKibana` to be less
than `capacityPerMinutePerKibana`, it would log this as the reason task
manager is healthy. However whenever the value is less, it means task
manager is running fine from a recurring task perspective. Changing the
if condition allows the next step in the code to log which then logs
`assumedRequiredThroughputPerMinutePerKibana` as the real reason why
task manager is unhealthy.

---------

Co-authored-by: kibanamachine <[email protected]>
(cherry picked from commit 9fa8ec7)
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Nov 22, 2024
…ThroughputPerMinutePerKibana was sufficient (elastic#200578)

In this PR, I'm fixing a bug where the task manager unhealthy logs were
showing the wrong reason. Because the if condition was checking
`assumedAverageRecurringRequiredThroughputPerMinutePerKibana` to be less
than `capacityPerMinutePerKibana`, it would log this as the reason task
manager is healthy. However whenever the value is less, it means task
manager is running fine from a recurring task perspective. Changing the
if condition allows the next step in the code to log which then logs
`assumedRequiredThroughputPerMinutePerKibana` as the real reason why
task manager is unhealthy.

---------

Co-authored-by: kibanamachine <[email protected]>
(cherry picked from commit 9fa8ec7)
@kibanamachine
Copy link
Contributor

💚 All backports created successfully

Status Branch Result
8.16
8.17

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request Nov 22, 2024
…equiredThroughputPerMinutePerKibana was sufficient (#200578) (#201444)

# Backport

This will backport the following commits from `main` to `8.16`:
- [Fix bug in capacity warning logs when
assumedAverageRecurringRequiredThroughputPerMinutePerKibana was
sufficient (#200578)](#200578)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Mike
Côté","email":"[email protected]"},"sourceCommit":{"committedDate":"2024-11-22T17:33:43Z","message":"Fix
bug in capacity warning logs when
assumedAverageRecurringRequiredThroughputPerMinutePerKibana was
sufficient (#200578)\n\nIn this PR, I'm fixing a bug where the task
manager unhealthy logs were\r\nshowing the wrong reason. Because the if
condition was
checking\r\n`assumedAverageRecurringRequiredThroughputPerMinutePerKibana`
to be less\r\nthan `capacityPerMinutePerKibana`, it would log this as
the reason task\r\nmanager is healthy. However whenever the value is
less, it means task\r\nmanager is running fine from a recurring task
perspective. Changing the\r\nif condition allows the next step in the
code to log which then
logs\r\n`assumedRequiredThroughputPerMinutePerKibana` as the real reason
why\r\ntask manager is
unhealthy.\r\n\r\n---------\r\n\r\nCo-authored-by: kibanamachine
<[email protected]>","sha":"9fa8ec71529faa31a44fe2eac0902fb0d4b98797","branchLabelMapping":{"^v9.0.0$":"main","^v8.18.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","Feature:Task
Manager","Team:ResponseOps","v9.0.0","backport:version","v8.17.0","v8.16.1"],"title":"Fix
bug in capacity warning logs when
assumedAverageRecurringRequiredThroughputPerMinutePerKibana was
sufficient","number":200578,"url":"https://github.com/elastic/kibana/pull/200578","mergeCommit":{"message":"Fix
bug in capacity warning logs when
assumedAverageRecurringRequiredThroughputPerMinutePerKibana was
sufficient (#200578)\n\nIn this PR, I'm fixing a bug where the task
manager unhealthy logs were\r\nshowing the wrong reason. Because the if
condition was
checking\r\n`assumedAverageRecurringRequiredThroughputPerMinutePerKibana`
to be less\r\nthan `capacityPerMinutePerKibana`, it would log this as
the reason task\r\nmanager is healthy. However whenever the value is
less, it means task\r\nmanager is running fine from a recurring task
perspective. Changing the\r\nif condition allows the next step in the
code to log which then
logs\r\n`assumedRequiredThroughputPerMinutePerKibana` as the real reason
why\r\ntask manager is
unhealthy.\r\n\r\n---------\r\n\r\nCo-authored-by: kibanamachine
<[email protected]>","sha":"9fa8ec71529faa31a44fe2eac0902fb0d4b98797"}},"sourceBranch":"main","suggestedTargetBranches":["8.17","8.16"],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/200578","number":200578,"mergeCommit":{"message":"Fix
bug in capacity warning logs when
assumedAverageRecurringRequiredThroughputPerMinutePerKibana was
sufficient (#200578)\n\nIn this PR, I'm fixing a bug where the task
manager unhealthy logs were\r\nshowing the wrong reason. Because the if
condition was
checking\r\n`assumedAverageRecurringRequiredThroughputPerMinutePerKibana`
to be less\r\nthan `capacityPerMinutePerKibana`, it would log this as
the reason task\r\nmanager is healthy. However whenever the value is
less, it means task\r\nmanager is running fine from a recurring task
perspective. Changing the\r\nif condition allows the next step in the
code to log which then
logs\r\n`assumedRequiredThroughputPerMinutePerKibana` as the real reason
why\r\ntask manager is
unhealthy.\r\n\r\n---------\r\n\r\nCo-authored-by: kibanamachine
<[email protected]>","sha":"9fa8ec71529faa31a44fe2eac0902fb0d4b98797"}},{"branch":"8.17","label":"v8.17.0","branchLabelMappingKey":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"},{"branch":"8.16","label":"v8.16.1","branchLabelMappingKey":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"}]}]
BACKPORT-->

Co-authored-by: Mike Côté <[email protected]>
kibanamachine added a commit that referenced this pull request Nov 22, 2024
…equiredThroughputPerMinutePerKibana was sufficient (#200578) (#201445)

# Backport

This will backport the following commits from `main` to `8.17`:
- [Fix bug in capacity warning logs when
assumedAverageRecurringRequiredThroughputPerMinutePerKibana was
sufficient (#200578)](#200578)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Mike
Côté","email":"[email protected]"},"sourceCommit":{"committedDate":"2024-11-22T17:33:43Z","message":"Fix
bug in capacity warning logs when
assumedAverageRecurringRequiredThroughputPerMinutePerKibana was
sufficient (#200578)\n\nIn this PR, I'm fixing a bug where the task
manager unhealthy logs were\r\nshowing the wrong reason. Because the if
condition was
checking\r\n`assumedAverageRecurringRequiredThroughputPerMinutePerKibana`
to be less\r\nthan `capacityPerMinutePerKibana`, it would log this as
the reason task\r\nmanager is healthy. However whenever the value is
less, it means task\r\nmanager is running fine from a recurring task
perspective. Changing the\r\nif condition allows the next step in the
code to log which then
logs\r\n`assumedRequiredThroughputPerMinutePerKibana` as the real reason
why\r\ntask manager is
unhealthy.\r\n\r\n---------\r\n\r\nCo-authored-by: kibanamachine
<[email protected]>","sha":"9fa8ec71529faa31a44fe2eac0902fb0d4b98797","branchLabelMapping":{"^v9.0.0$":"main","^v8.18.0$":"8.x","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","Feature:Task
Manager","Team:ResponseOps","v9.0.0","backport:version","v8.17.0","v8.16.1"],"title":"Fix
bug in capacity warning logs when
assumedAverageRecurringRequiredThroughputPerMinutePerKibana was
sufficient","number":200578,"url":"https://github.com/elastic/kibana/pull/200578","mergeCommit":{"message":"Fix
bug in capacity warning logs when
assumedAverageRecurringRequiredThroughputPerMinutePerKibana was
sufficient (#200578)\n\nIn this PR, I'm fixing a bug where the task
manager unhealthy logs were\r\nshowing the wrong reason. Because the if
condition was
checking\r\n`assumedAverageRecurringRequiredThroughputPerMinutePerKibana`
to be less\r\nthan `capacityPerMinutePerKibana`, it would log this as
the reason task\r\nmanager is healthy. However whenever the value is
less, it means task\r\nmanager is running fine from a recurring task
perspective. Changing the\r\nif condition allows the next step in the
code to log which then
logs\r\n`assumedRequiredThroughputPerMinutePerKibana` as the real reason
why\r\ntask manager is
unhealthy.\r\n\r\n---------\r\n\r\nCo-authored-by: kibanamachine
<[email protected]>","sha":"9fa8ec71529faa31a44fe2eac0902fb0d4b98797"}},"sourceBranch":"main","suggestedTargetBranches":["8.17","8.16"],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/200578","number":200578,"mergeCommit":{"message":"Fix
bug in capacity warning logs when
assumedAverageRecurringRequiredThroughputPerMinutePerKibana was
sufficient (#200578)\n\nIn this PR, I'm fixing a bug where the task
manager unhealthy logs were\r\nshowing the wrong reason. Because the if
condition was
checking\r\n`assumedAverageRecurringRequiredThroughputPerMinutePerKibana`
to be less\r\nthan `capacityPerMinutePerKibana`, it would log this as
the reason task\r\nmanager is healthy. However whenever the value is
less, it means task\r\nmanager is running fine from a recurring task
perspective. Changing the\r\nif condition allows the next step in the
code to log which then
logs\r\n`assumedRequiredThroughputPerMinutePerKibana` as the real reason
why\r\ntask manager is
unhealthy.\r\n\r\n---------\r\n\r\nCo-authored-by: kibanamachine
<[email protected]>","sha":"9fa8ec71529faa31a44fe2eac0902fb0d4b98797"}},{"branch":"8.17","label":"v8.17.0","branchLabelMappingKey":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"},{"branch":"8.16","label":"v8.16.1","branchLabelMappingKey":"^v(\\d+).(\\d+).\\d+$","isSourceBranch":false,"state":"NOT_CREATED"}]}]
BACKPORT-->

Co-authored-by: Mike Côté <[email protected]>
paulinashakirova pushed a commit to paulinashakirova/kibana that referenced this pull request Nov 26, 2024
…ThroughputPerMinutePerKibana was sufficient (elastic#200578)

In this PR, I'm fixing a bug where the task manager unhealthy logs were
showing the wrong reason. Because the if condition was checking
`assumedAverageRecurringRequiredThroughputPerMinutePerKibana` to be less
than `capacityPerMinutePerKibana`, it would log this as the reason task
manager is healthy. However whenever the value is less, it means task
manager is running fine from a recurring task perspective. Changing the
if condition allows the next step in the code to log which then logs
`assumedRequiredThroughputPerMinutePerKibana` as the real reason why
task manager is unhealthy.

---------

Co-authored-by: kibanamachine <[email protected]>
CAWilson94 pushed a commit to CAWilson94/kibana that referenced this pull request Dec 12, 2024
…ThroughputPerMinutePerKibana was sufficient (elastic#200578)

In this PR, I'm fixing a bug where the task manager unhealthy logs were
showing the wrong reason. Because the if condition was checking
`assumedAverageRecurringRequiredThroughputPerMinutePerKibana` to be less
than `capacityPerMinutePerKibana`, it would log this as the reason task
manager is healthy. However whenever the value is less, it means task
manager is running fine from a recurring task perspective. Changing the
if condition allows the next step in the code to log which then logs
`assumedRequiredThroughputPerMinutePerKibana` as the real reason why
task manager is unhealthy.

---------

Co-authored-by: kibanamachine <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:version Backport to applied version labels Feature:Task Manager release_note:skip Skip the PR/issue when compiling release notes Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v8.16.1 v8.16.2 v8.17.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants