[ML] Transforms: uses health information for alerting rule #152561

Merged

Conversation

@darnautov darnautov commented Mar 2, 2023

Summary

Resolves #144158

Replaces the "Errors in transform messages" check with an "Unhealthy transform" check that relies on the health information from the transform stats.

Alerting context has been extended with health_status and issues properties for both checks.

The "Errors in transform messages" check will still work for existing rules if it has been set explicitly.

Example of the alert message:

[farequote transform health rule] farequote transform health check result:
Transforms fq_response_times_continuous, tr_fail_02 are unhealthy.
  Transform ID: fq_response_times_continuous
  Transform state: started
  Transform health status: yellow
  Issue: Transform indexer failed
  Issue count: 2
  Issue details: Failed to execute phase [query], ; org.elasticsearch.action.search.SearchPhaseExecutionException: Search rejected due to missing shards [[farequote-2019][0]]. Consider using 'allow_partial_search_results' setting to bypass this error.
  First occurrence: Mar 6, 2023 @ 09:43:16.532
  Node name: node-0
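
As a rough illustration of how the extended alerting context could map onto the message above, here is a minimal TypeScript sketch. Only `health_status` and `issues` are named in this PR; every other field name, the `UnhealthyTransformContext` interface, and the `formatUnhealthyTransform` helper are hypothetical and are not taken from the Kibana code.

```typescript
// Hypothetical shapes for illustration only; not the plugin's actual types.
type HealthStatus = 'green' | 'unknown' | 'yellow' | 'red';

interface TransformIssue {
  issue: string;
  count: number;
  details?: string;
  first_occurrence?: string;
}

interface UnhealthyTransformContext {
  transform_id: string;
  transform_state: string;
  health_status: HealthStatus; // property named in the PR summary
  issues: TransformIssue[]; // property named in the PR summary
  node_name?: string;
}

// Builds the per-transform part of the alert message, one field per line.
function formatUnhealthyTransform(ctx: UnhealthyTransformContext): string {
  const lines: string[] = [
    `Transform ID: ${ctx.transform_id}`,
    `Transform state: ${ctx.transform_state}`,
    `Transform health status: ${ctx.health_status}`,
  ];
  for (const issue of ctx.issues) {
    lines.push(`Issue: ${issue.issue}`);
    lines.push(`Issue count: ${issue.count}`);
    if (issue.details !== undefined) lines.push(`Issue details: ${issue.details}`);
    if (issue.first_occurrence !== undefined) {
      lines.push(`First occurrence: ${issue.first_occurrence}`);
    }
  }
  if (ctx.node_name !== undefined) lines.push(`Node name: ${ctx.node_name}`);
  return lines.join('\n');
}

// Values copied from the example alert message; issue details omitted for brevity.
const exampleContext: UnhealthyTransformContext = {
  transform_id: 'fq_response_times_continuous',
  transform_state: 'started',
  health_status: 'yellow',
  issues: [
    {
      issue: 'Transform indexer failed',
      count: 2,
      first_occurrence: 'Mar 6, 2023 @ 09:43:16.532',
    },
  ],
  node_name: 'node-0',
};

const exampleMessage = formatUnhealthyTransform(exampleContext);
console.log(exampleMessage);
```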

@darnautov darnautov added the release_note:enhancement, :ml, Feature:Transforms, Feature:Alerting/RuleTypes, Team:ML, and v8.8.0 labels Mar 2, 2023
@darnautov darnautov self-assigned this Mar 2, 2023
@darnautov darnautov requested a review from a team as a code owner March 2, 2023 12:15
@elasticmachine

Pinging @elastic/ml-ui (:ml)

@darnautov darnautov requested a review from szabosteve March 6, 2023 11:14

@szabosteve szabosteve left a comment

UI text LGTM. (Left one note on Pete's comment.)

x-pack/plugins/transform/common/constants.ts (outdated, resolved)
@darnautov darnautov requested a review from peteharverson March 6, 2023 16:01
@darnautov

@elasticmachine merge upstream

import { PLUGIN, TRANSFORM_RULE_TYPE } from '../../../../common/constants';
import { transformHealthRuleParams, TransformHealthRuleParams } from './schema';
import { transformHealthServiceProvider } from './transform_health_service';

export interface BaseResponse {
export interface TransformHealth {
status: 'green' | 'unknown' | 'yellow' | 'red';
Contributor

In x-pack/plugins/transform/common/constants.ts we have export type TransformHealth, which you could reuse here. To avoid a naming clash we could consolidate this a bit further. x-pack/plugins/transform/common/types/transform_stats.ts defines the full transform stats, of which health is a part. You could move your interface to that file and make it part of TransformStats, where it's currently defined inline.

Contributor Author

Updated in d1b6e8b
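
The consolidation discussed in this thread could look roughly like the sketch below: one constant as the source of truth for health values, a union type derived from it, and a stats interface that references that type instead of redefining it. The contents of `TRANSFORM_HEALTH`, the `TransformStatsSketch` interface, and the `isUnhealthy` helper are assumptions made for illustration; the actual definitions in constants.ts and transform_stats.ts may differ.

```typescript
// Assumed single source of truth for the health values (illustrative only).
const TRANSFORM_HEALTH = {
  green: 'green',
  unknown: 'unknown',
  yellow: 'yellow',
  red: 'red',
} as const;

// Union type derived from the constant: 'green' | 'unknown' | 'yellow' | 'red'.
type TransformHealth = (typeof TRANSFORM_HEALTH)[keyof typeof TRANSFORM_HEALTH];

// Hypothetical, heavily reduced stand-in for the full transform stats type.
interface TransformStatsSketch {
  id: string;
  state: string;
  health?: { status: TransformHealth };
}

// Treats any reported status other than green as unhealthy.
// Assumption: stats without health information are not flagged.
function isUnhealthy(stats: TransformStatsSketch): boolean {
  return stats.health !== undefined && stats.health.status !== TRANSFORM_HEALTH.green;
}

const yellowTransform: TransformStatsSketch = {
  id: 'fq_response_times_continuous',
  state: 'started',
  health: { status: 'yellow' },
};
console.log(isUnhealthy(yellowTransform));
```

Deriving the union from a constant keeps the list of statuses in one place, which is one way to avoid the naming clash mentioned above.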

/**
* TODO update types in the es client
*/
type TransformGetTransformStats = TransformGetTransformStatsTransformStats & {
Contributor

Maybe this can also be combined/cleaned up with what we have in x-pack/plugins/transform/common/types/transform_stats.ts.

Contributor Author

Updated in d1b6e8b

await esClient.transform.getTransformStats({
transform_id: transformIds.join(','),
})
).transforms as TransformGetTransformStats[];
Contributor

getTransformStats itself has a try/catch block and might also return IHttpFetchError, so I'm not sure we can assume that .transforms will always be available.

Contributor

Sorry, I misread the code. It's wrapped the other way around, so the try/catch is on the outer level when the stats are requested via an API call. In this case there appears to be no try/catch at all. Should we add one?

Contributor Author

If we are unable to fetch transform stats, the rule can't be executed. It's ok to throw an error because it'll end up in the Rules UI.

Contributor

Sounds good, thanks for clarifying!
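
The behaviour agreed on in this thread, letting a failed stats request throw rather than catching it, can be sketched as follows. The `StatsClient` interface and the two stub clients are hypothetical stand-ins for the Elasticsearch client so that the example runs on its own; they are not the real client API.

```typescript
// Hypothetical minimal client shape; a stand-in, not the real Elasticsearch client.
interface StatsClient {
  getTransformStats(params: {
    transform_id: string;
  }): Promise<{ transforms: Array<{ id: string }> }>;
}

// No try/catch on purpose: if the request fails, the returned promise rejects
// and the caller (here, the rule executor) is left to surface the error.
async function fetchTransformStats(
  client: StatsClient,
  transformIds: string[]
): Promise<Array<{ id: string }>> {
  const response = await client.getTransformStats({
    transform_id: transformIds.join(','),
  });
  return response.transforms;
}

// Stub that echoes the requested IDs back as stats entries.
const okClient: StatsClient = {
  getTransformStats: async ({ transform_id }) => ({
    transforms: transform_id.split(',').map((id) => ({ id })),
  }),
};

// Stub that simulates a failed stats request.
const failingClient: StatsClient = {
  getTransformStats: async () => {
    throw new Error('stats unavailable');
  },
};

fetchTransformStats(okClient, ['tr_a', 'tr_b']).then((stats) => {
  console.log(stats.map((s) => s.id).join(','));
});

fetchTransformStats(failingClient, ['tr_a']).catch((error: Error) => {
  // In the rule, this error would propagate instead of being handled here.
  console.log(`rule execution failed: ${error.message}`);
});
```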

…api-update' into ml-144158-transform-health-rule-api-update
@darnautov darnautov requested a review from walterra March 7, 2023 10:47

@peteharverson peteharverson left a comment

Tested and looks good, apart from a question I have around the behavior of mapping the enabled state of the previous error message check to the new unhealthy transform check.


@walterra walterra left a comment

Tested and latest changes LGTM (pending Pete's question).


@peteharverson peteharverson left a comment

Latest changes for the enabled state of rules created with the previous config LGTM.

description: i18n.translate(
'xpack.transform.alertTypes.transformHealth.healthCheckDescription',
{
defaultMessage: 'Get alerts if a transform health status is not green.',
Contributor

I was wondering whether this should be consistent with the messaging in the transforms list, using "Healthy" in place of "green", but "Get alerts if a transform health status is not healthy." reads a bit odd.

@darnautov darnautov enabled auto-merge (squash) March 7, 2023 14:26
@kibana-ci

💚 Build Succeeded

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

| id | before | after | diff |
| --- | --- | --- | --- |
| transform | 371.0KB | 371.1KB | +102.0B |

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

| id | before | after | diff |
| --- | --- | --- | --- |
| transform | 16.8KB | 17.7KB | +892.0B |

Unknown metric groups

ESLint disabled line counts

| id | before | after | diff |
| --- | --- | --- | --- |
| securitySolution | 428 | 430 | +2 |
| transform | 31 | 32 | +1 |
| total | | | +3 |

Total ESLint disabled count

| id | before | after | diff |
| --- | --- | --- | --- |
| securitySolution | 506 | 508 | +2 |
| transform | 34 | 35 | +1 |
| total | | | +3 |

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @darnautov

@darnautov darnautov merged commit 56a7851 into elastic:main Mar 7, 2023
@kibanamachine kibanamachine added the backport:skip label Mar 7, 2023
bmorelli25 pushed a commit to bmorelli25/kibana that referenced this pull request Mar 10, 2023
@szabosteve szabosteve changed the title from "[ML] Transforms: use health information for alerting rule" to "[ML] Transforms: uses health information for alerting rule" Apr 21, 2023
Labels
backport:skip, Feature:Alerting/RuleTypes, Feature:Transforms, :ml, release_note:enhancement, Team:ML, v8.8.0
Development

Successfully merging this pull request may close these issues.

[ML] Transforms: use health information for alerting
7 participants