
Ability to alert when there's been no data for x amount of time #67296

Closed
mikecote opened this issue May 25, 2020 · 4 comments
Labels
estimate:needs-research - Estimated as too large and requires research to break down into workable issues
Feature:Alerting/RulesFramework - Issues related to the Alerting Rules Framework
Feature:Alerting
insight - Issues related to user insight into platform operations and resilience
Team:ResponseOps - Label for the ResponseOps team (formerly the Cases and Alerting teams)

Comments

@mikecote
Contributor

mikecote commented May 25, 2020

There have been conversations about the ability to fire specific alert actions when an alert hasn't seen data for a certain period of time.

There are questions around whether this is an alert-level and/or alert-instance-level thing. We will eventually have an alert-level status of "no data".

The "no data for a certain period of time" portion seems to be a common use case. Users can always set the duration to 0 to fire actions immediately.

Regarding the above, @pmuellr also mentioned the idea of adding a debounce feature for the alert actions.
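Purely as illustration of the setting described above, a minimal sketch of what an alert-level "no data duration" could look like; the NoDataAlertParams shape and the noDataDurationMs name are assumptions, not existing Kibana types:

// Hypothetical shape (not an existing Kibana type): an alert-level setting
// for how long "no data" must persist before the alert's actions fire.
// A value of 0 would mean "fire actions immediately on the first gap".
interface NoDataAlertParams {
  noDataDurationMs: number;
}

// Example: only fire no-data actions after 15 minutes without data.
const params: NoDataAlertParams = { noDataDurationMs: 15 * 60 * 1000 };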

@mikecote mikecote added the Feature:Alerting and Team:ResponseOps labels May 25, 2020
@elasticmachine
Contributor

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

@pmuellr
Member

pmuellr commented May 26, 2020

Ya, my thinking was that if we have a "status" of no-data, then we should also allow a customer to set an alert when the alert is in that status.

I think the original thinking was that this would be at the alert level, and not the alert instance level. For alerts that do some kind of query and get instance ids from the data in the query, alert level seems like it makes sense. It would certainly make sense for index threshold. There may be some alerts in the future that have instance ids which are more "fixed", or not determined by the query, and are more well-known in advance. A no-data status at the alert instance level might make sense there.

Right now, we don't have a way for an alert executor to inform the framework about a no-data condition at the alert level. Seems like we'd have to add a new function to the services we pass into the executor:

export interface AlertServices extends Services {
  alertInstanceFactory: (id: string) => AlertInstance;
}

Presumably, if that function was called, no other actions should be scheduled for that turn, before or after the function is called.
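A rough sketch of what that addition could look like, extending the snippet above; the reportNoData name is a hypothetical placeholder, not an actual framework API:

// Hypothetical extension (not actual Kibana code): the executor could call
// reportNoData() to tell the framework the whole alert saw no data this turn,
// instead of scheduling actions through alertInstanceFactory().
export interface AlertServices extends Services {
  alertInstanceFactory: (id: string) => AlertInstance;
  reportNoData: () => void; // assumption: marks the alert-level "no data" status
}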

@pmuellr
Member

pmuellr commented May 26, 2020

The "no data for certain period of time" portion seems to be a common use case. They can always set it to 0 to fire actions immediately.

More knobs and dials! :-) The existing throttle should work fine with this alert, but it can't handle "ignoring" the condition if it only lasted for a short time frame - the alert would always fire at least once. The same goes for debounce - though presumably debounce would fire later than with a throttle.

Throttle / debounce don't feel right to me; a separate "no-data for XXX duration" setting does kinda feel right, it just adds more complexity all around.
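To make the distinction concrete, a minimal sketch (not Kibana code) contrasting the two behaviors for a hypothetical per-check callback, where fire() stands in for scheduling the alert's actions:

// Throttle: fires on the first no-data check, then suppresses repeats until
// intervalMs has elapsed - so even a short gap fires at least once.
function makeThrottled(intervalMs: number, fire: () => void) {
  let lastFired = -Infinity;
  return (now: number) => {
    if (now - lastFired >= intervalMs) {
      lastFired = now;
      fire();
    }
  };
}

// Debounce: waits until the condition has persisted for intervalMs before
// firing at all, so short gaps are ignored entirely - but it fires later.
function makeDebounced(intervalMs: number, fire: () => void) {
  let noDataSince: number | null = null;
  let fired = false;
  return (now: number, hasData: boolean) => {
    if (hasData) {
      noDataSince = null; // data came back within the window: gap is ignored
      fired = false;
      return;
    }
    if (noDataSince === null) noDataSince = now;
    if (!fired && now - noDataSince >= intervalMs) {
      fired = true;
      fire();
    }
  };
}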

@gmmorris gmmorris added the Feature:Alerting/RulesFramework label Jul 1, 2021
@gmmorris gmmorris added the loe:needs-research label Jul 14, 2021
@gmmorris gmmorris added the insight and estimate:needs-research labels Aug 13, 2021
@gmmorris gmmorris removed the loe:needs-research label Sep 2, 2021
@kobelb kobelb added the needs-team label Jan 31, 2022
@botelastic botelastic bot removed the needs-team label Jan 31, 2022
@XavierM
Contributor

XavierM commented Mar 3, 2022

We will fix this issue with #115973.
