Pre-configured Alerts #59813

Open
peterschretlen opened this issue Mar 10, 2020 · 16 comments
Labels
discuss enhancement New value added to drive a business result estimate:needs-research Estimated as too large and requires research to break down into workable issues Feature:Alerting/RulesFramework Issues related to the Alerting Rules Framework Feature:Alerting NeededFor:Detections and Resp NeededFor:Monitoring Project:ImproveAlertingGettingStartedUX Alerting team project for making the experience of getting started with alerting easier. R&D Research and development ticket (not meant to produce code, but to make a decision) Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams)

Comments

@peterschretlen
Contributor

This is the alerting equivalent of #58914

There are scenarios where we want to pre-configure the alerts that will run in the system:

  • built-in alerts, which would include out-of-the-box cloud alerts and stack monitoring alerts.
  • templated deployments that provide a standardized "base set" of alerts in each cluster

Unlike connectors, alerts are built on top of task-management (allowing them to be distributed), which means using Kibana saved objects, spaces, and the security model. This creates some additional obstacles/considerations for managing them outside of a deployment:

  • Which space do these alerts live in?
  • What actions do you attach to a pre-configured alert? (Perhaps it requires pre-configured connectors.)
  • Who is the owner, and how should we handle authentication and authorization outside the deployment?
  • How do we make pre-configured alert setup idempotent (i.e. restarting the server should not create a second, duplicate set of alerts)
  • Are these alerts mutable, partly-mutable or completely read only? (For example you might not be allowed to edit the alert settings, but you may want to disable it, mute it, or change the actions it is connected to).
  • An alert list may become a large configuration, and something like kibana.yml may not be appropriate for such a large config. An alternate config, a bulk API, or an equivalent to a Terraform provider built on existing APIs might be a better fit.
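
To make the idempotency question above more concrete, here is a minimal TypeScript sketch of one possible approach: give each pre-configured alert a stable ID in its definition and create it only when no alert with that ID exists. Everything in it (`PreconfiguredAlert`, `InMemoryAlertStore`, `ensurePreconfiguredAlerts`) is hypothetical and invented for illustration. It is not the Kibana alerting API, and the in-memory map only stands in for saved-object storage.

```typescript
// Hypothetical shape of a pre-configured alert definition (an assumption,
// not a Kibana type). The stable `id` is what makes setup idempotent.
interface PreconfiguredAlert {
  id: string;
  name: string;
  alertTypeId: string;
  params: Record<string, unknown>;
}

// In-memory stand-in for saved-object storage, for illustration only.
class InMemoryAlertStore {
  private readonly alerts = new Map<string, PreconfiguredAlert>();

  get(id: string): PreconfiguredAlert | undefined {
    return this.alerts.get(id);
  }

  set(alert: PreconfiguredAlert): void {
    this.alerts.set(alert.id, alert);
  }

  count(): number {
    return this.alerts.size;
  }
}

type SetupResult = "created" | "updated" | "unchanged";

// Reconcile the store with the configured definitions. Running this twice
// with the same definitions never creates a duplicate alert.
function ensurePreconfiguredAlerts(
  store: InMemoryAlertStore,
  definitions: PreconfiguredAlert[]
): Map<string, SetupResult> {
  const results = new Map<string, SetupResult>();
  for (const def of definitions) {
    const existing = store.get(def.id);
    if (existing === undefined) {
      store.set({ ...def });
      results.set(def.id, "created");
    } else if (JSON.stringify(existing) !== JSON.stringify(def)) {
      // Naive, key-order-sensitive comparison; good enough for a sketch.
      store.set({ ...def });
      results.set(def.id, "updated");
    } else {
      results.set(def.id, "unchanged");
    }
  }
  return results;
}
```

With this shape, a server restart simply re-runs `ensurePreconfiguredAlerts` with the same definitions and reports `"unchanged"`. It does not answer the space or ownership questions, which would still need a decision.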

Some similar issues came up when discussing alerting for stack monitoring, see #45571

@peterschretlen peterschretlen added Feature:Alerting Feature:Actions Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) labels Mar 10, 2020
@elasticmachine
Contributor

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

@pmuellr
Member

pmuellr commented Mar 12, 2020

Which space do these alerts live in?
Who is the owner, and how should we handle authentication and authorization outside the deployment?

These are the hardest questions, I think. Maybe these should be space-agnostic, which would simplify some parts of this but probably complicate the implementation, as we'd need a separate saved object type for these.

For the preconfigured actions, the initial thought was that a user would still have to "enable" them somehow. Such a gesture would be perfect here: a "one button click" enablement would give us both the user and the space.

What actions do you attach to a pre-configured alert (perhaps it requires pre-configured connectors)

Seems like only pre-configured connectors, to start with anyway. Otherwise the alert would need to be mutable somehow, and I think we'd want to prevent that as much as possible.

Are these alerts mutable, partly-mutable or completely read only? (For example you might not be allowed to edit the alert settings, but you may want to disable it, mute it, or change the actions it is connected to).

Ya, great question. Seems like you'd certainly want to disable/mute without rebooting Kibana :-) but changing actions I could see as reboot/reconfiguration-required.
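
The "partly-mutable" idea could be expressed as a small guard that allows runtime state changes on a pre-configured alert but rejects edits to its definition. This is only a sketch: the operation names, `AlertRef`, and `assertOperationAllowed` are hypothetical, and including `enable` and `unmute` in the allowed set is an assumption that goes beyond the disable/mute cases mentioned above.

```typescript
// Hypothetical operation names; not taken from the Kibana alerting API.
type AlertOperation =
  | "enable"
  | "disable"
  | "mute"
  | "unmute"
  | "updateParams"
  | "updateActions"
  | "delete";

// Operations assumed to be safe at runtime for pre-configured alerts.
// Anything else would require changing the configuration and restarting.
const RUNTIME_ALLOWED: ReadonlySet<AlertOperation> = new Set<AlertOperation>([
  "enable",
  "disable",
  "mute",
  "unmute",
]);

interface AlertRef {
  id: string;
  isPreconfigured: boolean;
}

// Throws when the operation would modify the definition of a
// pre-configured alert; user-created alerts are unrestricted here.
function assertOperationAllowed(alert: AlertRef, op: AlertOperation): void {
  if (alert.isPreconfigured && !RUNTIME_ALLOWED.has(op)) {
    throw new Error(
      `Operation "${op}" is not allowed on pre-configured alert "${alert.id}"; change the configuration instead.`
    );
  }
}
```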

@chrisronline
Contributor

Is there any update on this thinking? The Stack Monitoring team would really like this kind of functionality to natively exist in the alerting plugin. I know you folks made efforts on allowing preconfigured actions so I'm not sure if that's a good sign of movement on this ask too.

@mikecote
Contributor

mikecote commented Jun 2, 2020

@chrisronline the thinking is still the same. Now that preconfigured actions are complete, the way is paved for preconfigured alerts, though still on a 7.10+ timeline. Is the stack monitoring team planning to use the current alerting APIs in the meantime, or is it blocked until this is complete?

@pmuellr
Member

pmuellr commented Jun 5, 2020

Pre-configured alerts still seem hard, especially associating a user.

I wonder if there is another way to do this, besides the way it was solved with actions (which don't need a user associated).

We also have this issue: #67382 regarding packaged alerts - perhaps that's part of the solution?

I guess we should collect some actual use cases - there's A LOT of discussion in issue #45571 - maybe we could boil some of those down and add them here?

@chrisronline
Contributor

Stack monitoring, like other plugins I imagine, just wants the ability to create out-of-the-box alerts that are created and enabled without the user needing to do anything. Because of this, there is no real configuration necessary, except possibly some optional way to configure existing connectors. The stack monitoring team plans to just enable the server log connector by default and ask the user to add more if they want.

Is it not possible to have a way to create alerts using the system user, aka the user configured through elasticsearch.username? And allow these to show up in the default space? For stack monitoring, I don't think we plan to even let a user delete these (perhaps disable/deactivate) so I don't think user ownership really matters for us.

@mikecote
Contributor

mikecote commented Jun 17, 2020

This issue #33496 could help with the implementation of this.

@pmuellr
Member

pmuellr commented Jul 23, 2020

I think we can close this, can't find the first PR that implemented this, but we've been fixing the holes as we go.

woops, ya, I was thinking preconfigured actions - this one is different :-)

@chrisronline
Contributor

Are those for preconfigured actions? I don't see another issue for preconfigured alerts

@mikecote
Contributor

I think @pmuellr meant #58914 which is the closed issue for pre-configured connectors. This one is still the main issue to track pre-configured alerts that is blocking stack monitoring so yeah it'll stay open.

@mikecote mikecote added the enhancement New value added to drive a business result label Aug 19, 2020
@mikecote mikecote added the R&D Research and development ticket (not meant to produce code, but to make a decision) label Aug 26, 2020
@mikecote
Contributor

mikecote commented Sep 9, 2020

Another note that could use some APIs without a user in context: #75875.

@mikecote
Contributor

cc @arisonl

@YulNaumenko
Contributor

Based on the conversation

Just to add to the conversation, Stack Monitoring also has the concept of certain alerts being only available in gold/platinum; however, we are currently creating these alerts "out of the box" (since this concept doesn't exist yet, we are creating these when the user first visits the Stack Monitoring UI) and I'm worried about applying rules for alert creation for a data point that can change over time. For example, if they are on the Basic license when they first visit the Stack Monitoring UI, then upgrade to Gold and don't revisit the Stack Monitoring UI, any gold-gated alerts will not exist. This might be something to consider when implementing #59813 cc @mikecote

@chrisronline good point. I think for pre-configured alerts, we should bypass the license check at creation and handle it differently (e.g. at execution time). There's ongoing discussion about how we should handle expired licenses and the existing alerts. It seemed disabling the alert was the best option, but maybe not? Thinking about your use case, you'd just want the alerts to "resume". This may be the same expectation with other alerts.

We should find a specific way of handling the license check for pre-configured alerts.

@pmuellr
Member

pmuellr commented Dec 16, 2020

If we allowed the alert to be created, even if it wasn't valid given the current license, then those would be in the same state as alerts that "expired" due to a downgrade in the license. It would of course be nice to let the customer know ahead of time, "we're going to create these alerts, but they're only valid for license XYZ+ and you're at WXY". Then, when they upgrade the license, all of a sudden alerts that weren't running will start running.

Perhaps that's too much of a surprise factor though.

One thought for product: could we turn these "alerts that can't run because you're not at license XYZ" into CTAs, basically ads for functionality customers will get if they upgrade?
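
A minimal sketch of the "create regardless of license, check at execution time" idea discussed above, in TypeScript. The three-level license ordering is a simplification, and all names (`LicensedAlert`, `decideExecution`, and so on) are hypothetical rather than part of any existing Kibana API.

```typescript
// Simplified license ordering, assumed for illustration only.
const LICENSE_ORDER = ["basic", "gold", "platinum"] as const;
type LicenseLevel = (typeof LICENSE_ORDER)[number];

// Hypothetical alert shape carrying its minimum required license.
interface LicensedAlert {
  id: string;
  minimumLicense: LicenseLevel;
}

type ExecutionDecision = { run: true } | { run: false; reason: string };

function licenseSatisfies(current: LicenseLevel, required: LicenseLevel): boolean {
  return LICENSE_ORDER.indexOf(current) >= LICENSE_ORDER.indexOf(required);
}

// Called on every scheduled execution rather than at creation time, so an
// alert that was skipped under "basic" starts running once the license is
// upgraded, without anyone having to re-create it.
function decideExecution(alert: LicensedAlert, currentLicense: LicenseLevel): ExecutionDecision {
  if (licenseSatisfies(currentLicense, alert.minimumLicense)) {
    return { run: true };
  }
  return {
    run: false,
    reason: `Alert "${alert.id}" requires license "${alert.minimumLicense}" or higher; current license is "${currentLicense}".`,
  };
}
```

The `reason` string on a skipped execution is the kind of information a UI could surface as the upgrade prompt suggested above.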

@mikecote
Contributor

Moving from 7.12 - Candidates to 7.x - Candidates.

@mikecote
Contributor

mikecote commented Feb 4, 2021

Moving from 7.x - Candidates to 8.x - Candidates (Backlog) after the latest 7.x planning session.

@gmmorris gmmorris added NeededFor:Detections and Resp NeededFor:Monitoring Project:ImproveAlertingGettingStartedUX Alerting team project for making the experience of getting started with alerting easier. Feature:Alerting/RulesFramework Issues related to the Alerting Rules Framework and removed Feature:Actions labels Jun 30, 2021
@gmmorris gmmorris added the loe:needs-research This issue requires some research before it can be worked on or estimated label Jul 14, 2021
@gmmorris gmmorris added the estimate:needs-research Estimated as too large and requires research to break down into workable issues label Aug 18, 2021
@gmmorris gmmorris removed the loe:needs-research This issue requires some research before it can be worked on or estimated label Sep 2, 2021
@kobelb kobelb added the needs-team Issues missing a team label label Jan 31, 2022
@botelastic botelastic bot removed the needs-team Issues missing a team label label Jan 31, 2022
Projects
No open projects
Development

No branches or pull requests

8 participants