[Alerts] Add ability to control the action group UI list based on configured alert params #89898

Open
Zacqary opened this issue Feb 1, 2021 · 5 comments
Labels
discuss enhancement New value added to drive a business result estimate:needs-research Estimated as too large and requires research to break down into workable issues Feature:Alerting/RuleActions Issues related to the Actions attached to Rules on the Alerting Framework Feature:Alerting NeededFor:logs-metrics-ui Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams)

Comments

@Zacqary
Contributor

Zacqary commented Feb 1, 2021

The alert action menu will always display the same list of action types for a given alert type, regardless of whether they're all relevant to the way the alert's configured.

Simple Solution

screen_shot_2021-01-29_at_3 33 13_pm

In the example above, we've got an alert type that has an optional "Warning" severity level. It's hidden behind a toggle, like this:

Screen Shot 2021-02-01 at 11 31 44 AM

If the user hasn't enabled the Warning level, we probably want to leave the Warning action group out of the dropdown to reduce confusion.

Screen Shot 2021-02-01 at 11 27 39 AM

Advanced Solution

For a more advanced customization, we might even want to change the displayed name of certain action groups depending on the configured alert params. As of 7.11, the user can choose between Fired and Recovered:

Screen Shot 2021-02-01 at 11 21 51 AM

I like this because Run when Fired sounds a lot better than Run when Alert. But then this doesn't look great either:

Screen Shot 2021-02-01 at 11 24 35 AM

I think the best we could do with enumerating multiple severity options is something like this:

Screen Shot 2021-02-01 at 11 29 36 AM

But then that's jarring if you aren't using the Warning severity level. Even if we don't remove the Warning option from the dropdown, if the user is looking at a UI like:

Screen Shot 2021-02-01 at 11 31 44 AM

then mentally they'll perceive:

Screen Shot 2021-02-01 at 11 29 19 AM

And that's super weird.

So my ideal solution for this situation is to be able to display:

(when the Warning threshold isn't enabled):

  • Fired
  • Recovered

(when the Warning threshold is enabled):

  • Critical (which corresponds to the same action group ID as Fired, we're just changing the display text)
  • Warning
  • Recovered
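
A minimal sketch of the behavior described in the two lists above, assuming a hypothetical `warningEnabled` param and hypothetical group IDs (`fired`, `warning`, `recovered`). None of these names are taken from the alerting framework; they only illustrate how the displayed list could be derived from the configured params while the underlying action group ID for Fired/Critical stays the same.

```typescript
// Hypothetical types for illustration; not the alerting framework's real API.
interface ActionGroup {
  id: string;   // stable action group ID stored with the action
  name: string; // display text shown in the dropdown
}

interface ThresholdParams {
  warningEnabled: boolean; // assumed flag for the optional Warning threshold
}

// Derive the action groups to display from the configured alert params.
function getVisibleActionGroups(params: ThresholdParams): ActionGroup[] {
  if (!params.warningEnabled) {
    return [
      { id: "fired", name: "Fired" },
      { id: "recovered", name: "Recovered" },
    ];
  }
  return [
    // Same action group ID as "Fired"; only the display text changes.
    { id: "fired", name: "Critical" },
    { id: "warning", name: "Warning" },
    { id: "recovered", name: "Recovered" },
  ];
}
```

Because only the display name changes, actions already attached to the `fired` group would not need to be migrated when the user toggles the Warning threshold.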
@Zacqary Zacqary added enhancement New value added to drive a business result Feature:Alerting Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) labels Feb 1, 2021
@elasticmachine
Contributor

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

@pmuellr
Member

pmuellr commented Feb 1, 2021

I'm thinking we don't have a way to do this kind of validation within the server-side bits of alerting - we have alert param validation, but that doesn't pass in the actions. So one question is, do we need this sort of validation server-side? Seems like it would be nice to have. If the UI is going to prevent some combinations of params / actions, the server should too.

I mention server-side specifically because it seems like, regardless of whether we have server-side validation, we want better control over which action groups are available depending on alert params: e.g., restricting the list of available action groups populated in lists, in this case. That seems like a UI-side sort of function that probably wouldn't have an analog on the server side.
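
One way the server-side check discussed above could look, as a sketch only. It assumes a hypothetical hook (`AllowedGroupsFn`) through which a rule type declares the action group IDs its params allow; the existing param validation does not receive the actions, so this signature is an assumption, not the framework's API.

```typescript
// Hypothetical shapes for illustration; not the alerting framework's real types.
interface RuleAction {
  id: string;    // identifier of the connector the action runs
  group: string; // action group ID the action is attached to
}

// Assumed hook: a rule type declares which action group IDs its params allow.
type AllowedGroupsFn<P> = (params: P) => string[];

// Returns one error message per action whose group the params do not allow.
function validateActionGroups<P>(
  params: P,
  actions: RuleAction[],
  allowedGroups: AllowedGroupsFn<P>
): string[] {
  const allowed = new Set(allowedGroups(params));
  return actions
    .filter((action) => !allowed.has(action.group))
    .map(
      (action) =>
        `Action "${action.id}" uses unavailable action group "${action.group}"`
    );
}
```

The same `allowedGroups` function could drive both the UI dropdown and the server check, which would keep the two from disagreeing about which combinations are valid.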

@Zacqary
Contributor Author

Zacqary commented Feb 2, 2021

This might ultimately constitute a separate discuss issue, but another problem this is making me consider is the discoverability of action groups.

Saving an alert without configuring required action groups

Let's say the user configures an alert with multiple severity thresholds. They now have to remember to create an action for each of those severity levels. They might miss that if the severity levels are hidden beneath a dropdown in the actions menu, which is itself already hidden behind a button click for email, server log, Slack message, etc.

One possible solution to that is to make the "You haven't created any actions" warning message configurable. Right now it only warns you if you click Save without configuring any actions at all, but we could also let an alert type warn the user when they haven't configured a certain recommended set of action groups.

That could work for the severity threshold situation described above, but it's more complex for other use cases.
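
A rough sketch of the configurable warning for the severity threshold case, assuming the alert type can supply a list of recommended action group IDs. The function names and the message wording are hypothetical.

```typescript
// Hypothetical shape: only the action group ID matters for this check.
interface ConfiguredAction {
  group: string;
}

// Recommended action groups that have no action configured yet.
function findMissingActionGroups(
  recommendedGroupIds: string[],
  actions: ConfiguredAction[]
): string[] {
  const configured = new Set(actions.map((action) => action.group));
  return recommendedGroupIds.filter((id) => !configured.has(id));
}

// Build the text shown on Save, or null when nothing is missing.
function buildSaveWarning(missing: string[]): string | null {
  return missing.length === 0
    ? null
    : `No actions configured for: ${missing.join(", ")}`;
}
```

With recommended groups `["fired", "warning"]` and a single action on `fired`, `findMissingActionGroups` returns `["warning"]`, so the Save warning would call out only the Warning level.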

Not knowing an action group exists in the first place

We also have a No Data state on our alerts which currently piggybacks on the Fired action group. It's enabled like this:

106328499-8e06ae00-6245-11eb-8353-dbd10fe20b1c

We should move this to a No Data action group so that it's compatible with the Notify on Status Change setting, but from a UI perspective, that now makes the checkbox redundant.

Except for the fact that if we hide the No Data capability within the Action Group dropdown, that feature is now significantly less discoverable.

@pmuellr
Member

pmuellr commented Feb 2, 2021

Ya, I think it may be time to re-look at our existing UI structure re: multiple action groups. See also the following relevant issues:

cc: @mdefazio @mikecote

@pmuellr
Member

pmuellr commented Mar 4, 2021

Another type of rule-type validation of general rule params comes from the anomaly detection rule type - for that one, ML jobs are already configured with an interval of some sort, which should influence the rule interval itself. Specifically, there's no sense making it smaller than the interval in the ML job.

In this case, we'd want to somehow allow the rule to override the interval, based on some other rule-specific application data. In the particular case of anomaly detection rules, each rule can deal with multiple ML jobs, so presumably we'd collect the minimum interval of those jobs, and use it to set the minimum interval in the rule itself.

Not sure if this would just be a UI thing, or something lower-level, in the rule processing itself. But in any case, it needs to be "live", based on existing ES resources, and not just "static".
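
The minimum-interval idea in this comment could be sketched as follows, assuming the intervals of the relevant ML jobs have already been fetched live and converted to seconds. Both function names are hypothetical, and how the job intervals are retrieved is left out.

```typescript
// Smallest interval among the rule's ML jobs, or undefined with no jobs.
function minimumJobIntervalSeconds(
  jobIntervalsSeconds: number[]
): number | undefined {
  if (jobIntervalsSeconds.length === 0) {
    return undefined;
  }
  return Math.min(...jobIntervalsSeconds);
}

// Raise the requested rule interval to the jobs' minimum when it is smaller.
function clampRuleIntervalSeconds(
  requestedSeconds: number,
  jobIntervalsSeconds: number[]
): number {
  const minimum = minimumJobIntervalSeconds(jobIntervalsSeconds);
  return minimum === undefined
    ? requestedSeconds
    : Math.max(requestedSeconds, minimum);
}
```

Whether this runs in the UI, in rule processing, or both is the open question raised above; the calculation itself is the same either way.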

@gmmorris gmmorris added Feature:Alerting/RuleActions Issues related to the Actions attached to Rules on the Alerting Framework NeededFor:logs-metrics-ui labels Jul 1, 2021
@gmmorris gmmorris added the loe:needs-research This issue requires some research before it can be worked on or estimated label Jul 14, 2021
@gmmorris gmmorris added the estimate:needs-research Estimated as too large and requires research to break down into workable issues label Aug 18, 2021
@gmmorris gmmorris removed the loe:needs-research This issue requires some research before it can be worked on or estimated label Sep 2, 2021
@kobelb kobelb added the needs-team Issues missing a team label label Jan 31, 2022
@botelastic botelastic bot removed the needs-team Issues missing a team label label Jan 31, 2022