Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] Prevent duplicate notifications about the same anomaly result #91485

Merged
merged 6 commits into from
Feb 17, 2021

Conversation

darnautov
Copy link
Contributor

@darnautov darnautov commented Feb 16, 2021

Summary

Related issue #88940.

The alerting framework allows configuring when to get notified. By default, it's "Only on status change", which means that in the case of the ML Anomaly detection alert schedules Anomaly score matched the condition action on each execution multiple times in a row, the user will be notified only on the first status change, so receiving duplicates won't be an issue.
But setting it to "Every time alert is active" might result in multiple notifications for the same anomaly, depending on the check interval and the result bucket span. This PR adds a check to the ML alert executor for the existing alert instance with anomaly_score_match action group in .kibana-event-log-* index that helps to avoid it.

The alert instance key generated to check for duplicates is of the form

Buckets alerts: jobId_highestRecordTimestamp
Influencer alerts: jobId_highestRecordTimestamp_influencerFieldName_influencerFieldValue
Record alerts: jobId_highestRecordTimestamp_detectorIndex_function_entityFieldName_entityFieldValue
e.g. for a record alert instance: ecommerce_high_sum_total_sales_1613584800000_0_high_sum_customer_full_name.keyword_Rabbia Al Powell

Also, another scenario is possible when the check interval is significantly bigger than the result bucket span we look back during the alert condition execution. In that case, we risk missing the anomaly, hence using previousStartedAt time from the previous execution helps to detect the check gap and use this interval as a time range for querying anomalies.

How to test

  • Create the Anomaly detection alert and set Notify to Every time alert is active.
  • Set a frequent check interval so the alert executor check the same bucket multiple times

You should get notified based on the selected action only once for a particular anomaly result.

Checklist

@elasticmachine
Copy link
Contributor

Pinging @elastic/ml-ui (:ml)

@alvarezmelissa87
Copy link
Contributor

Code LGTM aside from Pete's comment. Will test when PR is updated 👌

} else if (source.result_type === ANOMALY_RESULT_TYPE.RECORD) {
const fieldName = getEntityFieldName(source);
const fieldValue = getEntityFieldValue(source);
alertInstanceKey += `_${source.detector_index}_${source.function}_${fieldName}_${fieldValue}`;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

source.detector_index is undefined in the key my test generated. Is this available in the source ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

forgot to include it into the source, fixed in 61764c5

Copy link
Contributor

@peteharverson peteharverson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested latest edits and LGTM

@darnautov darnautov enabled auto-merge (squash) February 17, 2021 17:17
@darnautov darnautov added the auto-backport Deprecated - use backport:version if exact versions are needed label Feb 17, 2021
Copy link
Contributor

@alvarezmelissa87 alvarezmelissa87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM ⚡

@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Metrics [docs]

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
ml 68.1KB 68.0KB -32.0B

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Feb 17, 2021
…lastic#91485)

* [ML] check kibana even logs for existing alert instance

* [ML] create alert instance key, add check for alert id

* [ML] use anomaly_utils, check interval gap

* [ML] add detector index

* [ML] fix unit test

* [ML] include detector_index into source
@kibanamachine
Copy link
Contributor

💚 Backport successful

7.x / #91720

Successful backport PRs will be merged automatically after passing CI.

kibanamachine added a commit that referenced this pull request Feb 17, 2021
…91485) (#91720)

* [ML] check kibana even logs for existing alert instance

* [ML] create alert instance key, add check for alert id

* [ML] use anomaly_utils, check interval gap

* [ML] add detector index

* [ML] fix unit test

* [ML] include detector_index into source

Co-authored-by: Dima Arnautov <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Deprecated - use backport:version if exact versions are needed Feature:Alerting Feature:Anomaly Detection ML anomaly detection :ml release_note:enhancement v7.12.0 v8.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants