[Alerting] Add "Group By" support to ES Query alert type #89481

Closed
ymao1 opened this issue Jan 27, 2021 · 4 comments · Fixed by #144689
Labels: enhancement, estimate:medium, Feature:Alerting/RuleTypes, Team:ResponseOps

Comments

@ymao1
Contributor

ymao1 commented Jan 27, 2021

This PR introduced a basic ES query alert type that allows users to specify a query and a threshold condition for the number of matches against that query. We would like to enhance this alert type by adding the ability to group by a field within the index and then threshold against the hits within each group.

@ymao1 ymao1 added Feature:Alerting Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) labels Jan 27, 2021
@elasticmachine
Contributor

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

@pmuellr
Member

pmuellr commented Jan 29, 2021

For index threshold, we do grouping via a top-N terms aggregation. Would we do the same here? IIRC, it's a fairly straightforward implementation, and it seems pretty simple for customers to understand. Not sure if we'd need to do more than that, though. Opening up the possibility of other bucket aggregations seems super interesting, and kind of in line with this alert's "lower-level access" to ES bits.

We'd need to discuss what the instance IDs would be. If it's a top-N terms agg, it's simple: just the term. For other aggs, we'd need to check how useful the grouping keys would be.
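
As a rough illustration of the idea above, here is a minimal sketch of a top-N terms aggregation body and of using each bucket key as the instance ID. The function names, the `groupAgg` aggregation name, and the `TermsBucket` interface are hypothetical and chosen only for this example; they are not taken from the Kibana codebase.

```typescript
// Hypothetical shape of one bucket in a terms aggregation response.
interface TermsBucket {
  key: string | number;
  doc_count: number;
}

// Build the `aggs` portion of a search body that groups by the top N terms.
// `groupAgg` is an arbitrary aggregation name picked for this sketch.
function buildTopNTermsAgg(termField: string, termSize: number) {
  return {
    groupAgg: {
      terms: { field: termField, size: termSize },
    },
  };
}

// With a terms aggregation, the bucket key (the term itself) can serve
// directly as the alert instance ID.
function instanceIdsFromBuckets(buckets: TermsBucket[]): string[] {
  return buckets.map((bucket) => String(bucket.key));
}
```

Other bucket aggregations (for example, ones with composite or range keys) would need their own mapping from key to ID, which is the open question raised above.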

@ymao1
Contributor Author

ymao1 commented Feb 2, 2021

I think a top-N terms aggregation like the Index Threshold would make the most sense since the simple ES query implementation is comparing hit count to a threshold condition. We could extend that to comparing the count of each bucket to a threshold condition. We could review other aggregation types to see if they would fit within this paradigm. We should also think about what "hits" we'd be returning if an aggregation is selected. Would we want to return the top N hits within each bucket as well?
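
To make the per-bucket comparison concrete, the following is a small sketch, assuming each terms bucket carries a `doc_count` and, optionally, a top hits sub-aggregation result. The names `GroupBucket`, `topHitsAgg`, and `groupsMeetingThreshold` are made up for illustration, and only four comparators are covered.

```typescript
type Comparator = '>' | '>=' | '<' | '<=';

// Hypothetical bucket shape: a terms bucket with an optional top hits
// sub-aggregation stored under the (assumed) name `topHitsAgg`.
interface GroupBucket {
  key: string;
  doc_count: number;
  topHitsAgg?: { hits: { hits: Array<{ _id: string }> } };
}

const comparators: Record<Comparator, (value: number, threshold: number) => boolean> = {
  '>': (value, threshold) => value > threshold,
  '>=': (value, threshold) => value >= threshold,
  '<': (value, threshold) => value < threshold,
  '<=': (value, threshold) => value <= threshold,
};

// Apply the threshold condition to the document count of each bucket and
// return only the groups that meet it, along with their top hits (if any).
function groupsMeetingThreshold(
  buckets: GroupBucket[],
  comparator: Comparator,
  threshold: number
) {
  return buckets
    .filter((bucket) => comparators[comparator](bucket.doc_count, threshold))
    .map((bucket) => ({
      group: bucket.key,
      count: bucket.doc_count,
      hits: bucket.topHitsAgg?.hits.hits ?? [],
    }));
}
```

Returning the top hits per matching bucket is one possible answer to the question of which "hits" to expose when an aggregation is selected.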

@ferryversteeg

+1
We are interested in this feature too!

@gmmorris gmmorris added the Feature:Alerting/RuleTypes Issues related to specific Alerting Rules Types label Jul 1, 2021
@gmmorris gmmorris added the loe:large Large Level of Effort label Jul 14, 2021
@gmmorris gmmorris added enhancement New value added to drive a business result estimate:medium Medium Estimated Level of Effort and removed Feature:Alerting labels Aug 13, 2021
@gmmorris gmmorris removed the loe:large Large Level of Effort label Sep 2, 2021
@kobelb kobelb added the needs-team Issues missing a team label label Jan 31, 2022
@botelastic botelastic bot removed the needs-team Issues missing a team label label Jan 31, 2022
@ersin-erdal ersin-erdal self-assigned this Oct 11, 2022
@ersin-erdal ersin-erdal removed their assignment Oct 14, 2022
@ymao1 ymao1 self-assigned this Nov 2, 2022
@ymao1 ymao1 moved this from Todo to In Progress in AppEx: ResponseOps - Execution & Connectors Nov 2, 2022
@ymao1 ymao1 moved this from In Progress to In Review in AppEx: ResponseOps - Execution & Connectors Nov 22, 2022
Repository owner moved this from In Review to Done in AppEx: ResponseOps - Execution & Connectors Dec 15, 2022
ymao1 added a commit that referenced this issue Dec 15, 2022
#144689)

Resolves #89481

## Summary

Adds group by options to the ES query rule type, for both DSL and KQL
queries. These are the same limited group by options that are offered in
the index threshold rule type, so I used the same UI components and rule
parameter names. I moved some aggregation building code to `common` so
it could be reused. All existing ES query rules are migrated to be
`count over all` rules.
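
A minimal sketch of what such a migration could look like, assuming the index threshold parameter names `aggType` and `groupBy` are the ones reused. The function name `migrateToCountOverAll` is hypothetical, and the actual migration in the PR may differ.

```typescript
// Add `count over all` defaults to an existing rule's parameters.
// Defaults are spread first so any values already present are preserved.
function migrateToCountOverAll<T extends Record<string, unknown>>(
  params: T
): T & { aggType: string; groupBy: string } {
  return { aggType: 'count', groupBy: 'all', ...params };
}
```

For example, parameters that only contain a threshold and comparator would come out with `aggType: 'count'` and `groupBy: 'all'` added and everything else unchanged.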

## To Verify

* Create the following types of rules and verify they work as expected.
Verify for both DSL query and KQL query:
  * `count over all` rule - this should run the same as before: it
  counts the number of documents that match the query and applies the
  threshold condition to that value. `{{context.hits}}` is all the
  documents that match the query if the threshold condition is met.
  * `<metric> over all` rule - this calculates the specified aggregation
  metric and applies the threshold condition to the aggregated metric
  (for example, `avg event.duration`). `{{context.hits}}` is all the
  documents that match the query if the threshold condition is met.
  * `count over top N terms` - this applies a terms aggregation to the
  query and applies the threshold condition to each term bucket (for
  example, `count over top 10 event.action` applies the threshold
  condition to the count of documents within each `event.action`
  bucket). `{{context.hits}}` is the result of the top hits aggregation
  within each term bucket if the threshold condition is met for that
  bucket.
  * `<metric> over top N terms` - this applies a terms aggregation and a
  metric sub-aggregation to the query and applies the threshold
  condition to the metric value within each term bucket (for example,
  `avg event.duration over top 10 event.action` applies the threshold
  condition to the average value of `event.duration` within each
  `event.action` bucket). `{{context.hits}}` is the result of the top
  hits aggregation within each term bucket if the threshold condition is
  met for that bucket.
* Verify the migration by creating a DSL and KQL query in an older
version of Kibana and then upgrading to this PR. The rules should
continue running successfully.
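
The four rule shapes listed above can be sketched as a single function that maps group by parameters to an Elasticsearch `aggs` body. This is an illustrative sketch only: the parameter names follow the index threshold rule type (`aggType`, `aggField`, `groupBy`, `termField`, `termSize`) on the assumption that they were reused as described, while `buildAggs` and the aggregation names `groupAgg`, `metricAgg`, and `topHitsAgg` are hypothetical.

```typescript
interface GroupByParams {
  aggType: 'count' | 'avg' | 'sum' | 'min' | 'max';
  aggField?: string; // required for every aggType except 'count'
  groupBy: 'all' | 'top';
  termField?: string; // required when groupBy is 'top'
  termSize?: number; // required when groupBy is 'top'
}

function buildAggs(params: GroupByParams): Record<string, unknown> {
  // `count` needs no metric aggregation: the hit count or the bucket
  // doc_count is compared to the threshold directly.
  const metricAgg: Record<string, unknown> =
    params.aggType === 'count'
      ? {}
      : { metricAgg: { [params.aggType]: { field: params.aggField } } };

  // `over all`: at most a single top-level metric aggregation.
  if (params.groupBy === 'all') {
    return metricAgg;
  }

  // `over top N terms`: a terms aggregation, with the optional metric and
  // a top hits aggregation nested inside each term bucket.
  return {
    groupAgg: {
      terms: { field: params.termField, size: params.termSize },
      aggs: { ...metricAgg, topHitsAgg: { top_hits: { size: 1 } } },
    },
  };
}
```

For instance, `avg event.duration over top 10 event.action` would correspond to `buildAggs({ aggType: 'avg', aggField: 'event.duration', groupBy: 'top', termField: 'event.action', termSize: 10 })`. The top hits size of 1 is an arbitrary choice for the sketch.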


### Checklist

- [x] [Documentation](https://www.elastic.co/guide/en/kibana/master/development-documentation.html) was added for features that require explanation or tutorials
- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios

Co-authored-by: Kibana Machine <[email protected]>
Co-authored-by: Lisa Cawley <[email protected]>