[Alerting] Add a required, programmatic message to actions #64349

Zacqary · 2020-04-23T19:42:35Z

Summary

Alert executors should be able to send whatever message they want when firing an action. The user-defined message should be appended to the executor's programmatic message, and the user can use this to provide additional context. This is because the information that we need to convey in an alert is often complex, dynamic, and requires product design in order to be effective.

Context

From discussions on implementing #64080, the Metrics team has realized we need to be able to have more control over what messages get sent to users. Right now the message field relies entirely on the user to configure a useful message with all relevant information, and not to delete anything that's required.

This becomes especially precarious in a case like the Logs alerts (#62806), which have a default message of:

{{context.matchingDocuments}} log entries have matched the following conditions: {{context.conditions}}

which becomes something like:

24 log entries have matched the following conditions: message matches ASL Sender Statistics

context.conditions is a highly dynamic value, and deleting it would make the alert message effectively useless.
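To illustrate how fragile this is, here is a minimal sketch assuming a simplified `{{context.x}}` substitution rather than the alerting plugin's real template engine. `renderTemplate` and the `Context` type are hypothetical names used only for this example.

```typescript
// Hypothetical, simplified stand-in for mustache-style variable substitution.
type Context = Record<string, string | number>;

function renderTemplate(template: string, context: Context): string {
  // Replace each {{context.name}} with its value, or an empty string if it is missing.
  return template.replace(
    /\{\{\s*context\.(\w+)\s*\}\}/g,
    (_match: string, name: string) => (name in context ? String(context[name]) : '')
  );
}

const logContext: Context = {
  matchingDocuments: 24,
  conditions: 'message matches ASL Sender Statistics',
};

// The default message from the Logs alert.
const defaultMessage =
  '{{context.matchingDocuments}} log entries have matched the following conditions: {{context.conditions}}';

// What a user might leave behind after trimming the message.
const editedMessage = '{{context.matchingDocuments}} log entries have matched';

const fullText = renderTemplate(defaultMessage, logContext);
const editedText = renderTemplate(editedMessage, logContext);
```

With the default template the recipient learns which conditions matched; with the edited one, nothing in the framework restores that information.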

Because of the complexity of potential alert states, conditions, and configurations, we're exploring using something even more dynamic than context.conditions in metric alerts. Perhaps removing all context variables and just writing a single context.message that formats all relevant information:

The alternative would quickly get too advanced and out of hand:

(Note the condition0 naming convention, which we already use in the 7.7 release. Users have to manually add references to condition1, condition2, etc. every time they add additional conditions, and that's aggravating and error-prone. And you may notice I already made a syntax error in my pseudocode)

We can implement the context.message approach with the alerting plugin today. The problem is, what happens if the user deletes context.message from their alert?

We don't want to rely on the user just realizing that they shouldn't do that.

Under this change, the purpose of the user-defined message would no longer be to manually format and present the data coming from the alert. Instead, it would provide additional context relevant to whatever the user is using alerting for: e.g. instructions for the on-call person who's getting this alert about how to respond to it.


elasticmachine · 2020-04-23T19:42:36Z

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

pmuellr · 2020-04-27T18:16:50Z

Trying to boil down the requirements here - seems like there's a desire for two messages - one coming from the alert, which may be non-trivial (contain lists of things) - and one that could be set in the action params when editing the alert, specific to the usage of that alert. The customer would see both - presumably the one from the alert, followed by the one set in the action params - in an email/slack message, separated by a blank line.
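A minimal sketch of that two-message behavior, assuming the alert's message is required and the action-params message is optional. `composeMessage` is a hypothetical helper, not an existing alerting API.

```typescript
// Hypothetical helper: the alert's programmatic message always comes first,
// followed by the user's message (if any), separated by a blank line.
function composeMessage(alertMessage: string, userMessage?: string): string {
  return [alertMessage, userMessage ?? '']
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .join('\n\n');
}

const withUserNote = composeMessage(
  '24 log entries have matched the following conditions: message matches ASL Sender Statistics',
  'Ping the on-call engineer if this fires twice in an hour.'
);

// The user can leave their message empty without losing the alert's message.
const withoutUserNote = composeMessage('CPU usage is above threshold', '   ');
```

Because the alert's message is passed separately from the user's text, the user cannot delete it by editing the action params.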

I've been kind of thinking about something like this in reference to figuring out how to have a notification that would include the result of another action. E.g., a theoretical GitHub issue action that would create an issue. You'd like to get the issue number / URL from that action, and add it as another part of the message. Maybe at the bottom?

At some point we need to look into better Slack messaging, which means using their "blocks" stuff. Perhaps we can settle on a generic shape that looks similar, and for messaging systems that don't have "blocks" like this, we just do the best we can - eg, join the blocks with a blank line between them.
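One way such a generic shape could look, as a sketch only. `MessageBlock`, `toPlainText`, and `toSlackSections` are hypothetical names. The Slack output assumes Block Kit "section" blocks with `mrkdwn` text, and nothing here calls Slack itself.

```typescript
// Hypothetical generic block: one piece of a notification and where it came from.
interface MessageBlock {
  source: 'alert' | 'user' | 'actionResult';
  text: string;
}

// Fallback for messaging systems without blocks: join with a blank line between them.
function toPlainText(blocks: MessageBlock[]): string {
  return blocks
    .map((block) => block.text.trim())
    .filter((text) => text.length > 0)
    .join('\n\n');
}

// Assumed mapping to Slack Block Kit "section" blocks; empty blocks are skipped.
function toSlackSections(blocks: MessageBlock[]) {
  return blocks
    .filter((block) => block.text.trim().length > 0)
    .map((block) => ({
      type: 'section' as const,
      text: { type: 'mrkdwn' as const, text: block.text.trim() },
    }));
}

const exampleBlocks: MessageBlock[] = [
  { source: 'alert', text: 'CPU usage is above threshold on host-1' },
  { source: 'user', text: 'Check the runbook before restarting anything.' },
  { source: 'actionResult', text: 'A tracking issue was created by a previous action.' },
];

const plainText = toPlainText(exampleBlocks);
const slackSections = toSlackSections(exampleBlocks);
```

The result of another action, such as the GitHub issue example above, would then be one more block appended at the bottom.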

The other thing to think about, as these messages get more complex, is the formatting supported by the various actions. Today we have plain text for most services, but Slack messages can use THEIR version of markdown-like markup, and for email we expect the message to be a more typical version of markdown - and there are differences. How should an alert render a message so that it can be consumed by either? Should it create a "slack" version of a message, and a "markdown" version? Is markdown good enough to use in plain text situations as well? A simple hack is to expose context variables like message_slack and message_markdown, and then let the action executor figure out which of the message* variables to use. Or expose all of them, let the customer decide.
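A rough sketch of that "simple hack", assuming the alert always provides a plain-text `message` and optionally the flavored variants. The `ActionKind` values and `pickMessage` are illustrative only and do not correspond to real action type ids.

```typescript
// Hypothetical context shape: one required plain-text message plus optional variants.
interface MessageVariants {
  message: string;
  message_slack?: string;
  message_markdown?: string;
}

// Illustrative action kinds, not actual Kibana action type ids.
type ActionKind = 'slack' | 'email' | 'serverLog';

// The action executor picks the variant it can render, falling back to plain text.
function pickMessage(kind: ActionKind, variants: MessageVariants): string {
  switch (kind) {
    case 'slack':
      return variants.message_slack ?? variants.message;
    case 'email':
      return variants.message_markdown ?? variants.message;
    default:
      return variants.message;
  }
}

const variants: MessageVariants = {
  message: 'CPU usage is above threshold',
  message_slack: '*CPU usage* is above threshold',
};

const slackText = pickMessage('slack', variants); // uses the Slack variant
const emailText = pickMessage('email', variants); // no markdown variant, falls back
const logText = pickMessage('serverLog', variants); // always plain text
```

The fallback to `message` means an alert type only has to supply the richer variants when it has something format-specific to say.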

Zacqary · 2020-04-27T21:50:09Z

> seems like there's a desire for two messages - one coming from the alert, which may be non-trivial (contain lists of things) - and one that could be set in the action params when editing the alert, specific to the usage of that alert. The customer would see both - presumably the one from the alert, followed by the one set in the action params - in an email/slack message, separated by a blank line.

Yep, that's about what I was thinking.

As for action types that are more complex than plain text, I feel like that makes having an opinionated message from the alert even more important. Slack blocks, especially, feel like they could benefit from specific product design choices. For Metrics, going by what Datadog does (which is admittedly where most of my alerting opinions come from), we might want to include a thumbnail of a graph, a different color depending on how far the metric has crossed the threshold, links to the metric explorer, and several other things that would be difficult to build a user-facing UI to customize.

That level of complexity could benefit emails too, if we want to start sending rich HTML.

IMO there's a large subset of action types which are basically, in some way, shape, or form, "send an alert message." Whether it's a server log, an email, a Slack message, a PagerDuty message, we can cover most bases with:

- Let the user edit a plain text Title and a Message with reasonable defaults and enable some {{context variables}}
  - Title covers email subject, Slack block heading, etc.
- Have the alert type handle styling, formatting, rich features, and non-trivial information.
  - For server logs, this just means generating a text string explaining what happened in the alert
  - For Slack messages and emails, we can decide to convey some of this information with graphics instead of the same text string

On the other hand, there are some action types that don't fit the bill of "send an alert message," like creating a GitHub issue in response to an alert. That's something a little more complicated that I don't have a frame of reference for.

Zacqary added enhancement New value added to drive a business result Feature:Alerting Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) labels Apr 23, 2020

mikecote mentioned this issue Jun 2, 2020

Dependencies on Kibana Alerting #67992

Open

59 tasks

gmmorris added NeededFor:logs-metrics-ui Project:ImproveAlertingManagementUX Alerting team project for improving the management experience of alerting. Feature:Alerting/RuleActions Issues related to the Actions attached to Rules on the Alerting Framework labels Jun 30, 2021

gmmorris added the loe:needs-research This issue requires some research before it can be worked on or estimated label Jul 14, 2021

gmmorris added the estimate:needs-research Estimated as too large and requires research to break down into workable issues label Aug 18, 2021

gmmorris removed the loe:needs-research This issue requires some research before it can be worked on or estimated label Sep 2, 2021

mikecote added this to AppEx: ResponseOps - Rules & Alerts Management Jan 6, 2022

kobelb added the needs-team Issues missing a team label label Jan 31, 2022

botelastic bot removed the needs-team Issues missing a team label label Jan 31, 2022

XavierM assigned Zacqary Mar 10, 2022
