How to send recovered alerts according to the original alert state #160984

Open
maryam-saeidi opened this issue Jun 30, 2023 · 3 comments
Labels
discuss Feature:Alerting Team: Actionable Observability - DEPRECATED For Observability Alerting and SLOs use "Team:obs-ux-management", for AIops "Team:obs-knowledge" Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v8.10.0

maryam-saeidi commented Jun 30, 2023

📝 Summary

In metric threshold and new threshold rules, we have two/three types of actions that can be generated:

As shown below, we have settings to control no data behavior for the metric threshold rule, but we are now removing this from the new threshold rule.
[Image: no data settings of the metric threshold rule]

❓Question

Suppose that we have different actions for alert/warning/no data. In that case, how can we send the recovered message to the same group that received the original alert?

[Images: "Different action states" and "An example scenario"]

Previously, we had a field called originalAlertState in action context with the following logic:

const translateActionGroupToAlertState = (
  actionGroupId: string | undefined
): string | undefined => {
  if (actionGroupId === FIRED_ACTIONS.id) {
    return stateToAlertMessage[AlertStates.ALERT];
  }
  if (actionGroupId === NO_DATA_ACTIONS.id) {
    return stateToAlertMessage[AlertStates.NO_DATA];
  }
};

We don't want to save this information in AAD (alerts as data) for the new rule, but we are wondering how this case can be covered once conditional actions are introduced.
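
The mapping above can be extended to show how a recovered notification might be routed by the state the alert was last in. This is a minimal sketch only: the action group ids, the WARNING branch, the `recoveryTargets` table, and `pickRecoveryTarget` are hypothetical names introduced for illustration, not part of the rule's actual code.

```typescript
// Hypothetical stand-ins for the rule's action groups; the ids are assumptions.
const FIRED_ACTIONS = { id: 'example.fired' } as const;
const WARNING_ACTIONS = { id: 'example.warning' } as const;
const NO_DATA_ACTIONS = { id: 'example.nodata' } as const;

type OriginalState = 'ALERT' | 'WARNING' | 'NO DATA';

// Map the action group an alert last fired in to a state label.
function toOriginalState(actionGroupId: string | undefined): OriginalState | undefined {
  switch (actionGroupId) {
    case FIRED_ACTIONS.id:
      return 'ALERT';
    case WARNING_ACTIONS.id:
      return 'WARNING';
    case NO_DATA_ACTIONS.id:
      return 'NO DATA';
    default:
      return undefined;
  }
}

// Hypothetical routing table: where the recovery message goes for each original state.
const recoveryTargets: Record<OriginalState, string> = {
  ALERT: 'pager',
  WARNING: 'email',
  'NO DATA': 'email',
};

// Pick the recovery target from the previous action group, or fall back to a default.
function pickRecoveryTarget(previousActionGroupId: string | undefined, fallback: string): string {
  const state = toOriginalState(previousActionGroupId);
  return state === undefined ? fallback : recoveryTargets[state];
}
```

With this shape, a recovered alert that last fired in the warning group would go to the same target as the warning action, without the rule having to persist the original state itself; whether conditional actions can express this is the open question.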

Use-cases

  1. Separate warning recovered message
  2. Separate Recovered action conditions for Warning and Alert (145418) -> This issue was previously solved by adding an action context variable; check the related PR for more info.
@maryam-saeidi maryam-saeidi added discuss Team: Actionable Observability - DEPRECATED For Observability Alerting and SLOs use "Team:obs-ux-management", for AIops "Team:obs-knowledge" v8.10.0 labels Jun 30, 2023
@maryam-saeidi maryam-saeidi self-assigned this Jun 30, 2023
@elasticmachine
Contributor

Pinging @elastic/actionable-observability (Team: Actionable Observability)

@maryam-saeidi maryam-saeidi added the Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) label Jun 30, 2023
@elasticmachine
Contributor

Pinging @elastic/response-ops (Team:ResponseOps)

@maryam-saeidi maryam-saeidi changed the title [AO] How to send recovered alerts according to the original alert state How to send recovered alerts according to the original alert state Jun 30, 2023
@maryam-saeidi
Member Author

There were three topics that we discussed:

  1. How to send a recovered alert to the same action group
    ---> warning/critical (Metric threshold) or low/medium/high/critical (SLO)
    Based on @mikecote's input, this topic is more complicated than it first appears: a warning alert can turn into a critical alert, so the question is whether we want to send multiple recovered notifications or just one when the alert recovers.
    @shanisagiv1 will check this case to see how we want to handle it in conditional actions (Here is the use-case)
  2. How to handle no data?
    @kobelb suggested changing the rule state to a warning instead of firing an alert.
    @XavierM thinks it is not an alert and should be handled only by a notification instead of an alert.
    Based on @simianhacker's input, users should be notified when there is no data for the alert, and with the current features, we can only set a no data alert.
    I think we need to handle this case for all the rules (no strong opinion on whether to use an alert or a notification).
    Decision: I will wait for @shanisagiv1's input about when we might have a way to notify the user if a rule is in an error/warning state, and then AO will decide how to handle no data in the new metric threshold rule.
  3. Missing groups no data
    It's on @katrin-freihofner's radar; she proposed a new rule for this use case, and it is still under refinement.

Please let me know if something was captured incorrectly or if I am missing something.
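
To make the open question in topic 1 concrete, here is a minimal sketch of the two options for an alert that moved from warning to critical before recovering. Nothing here reflects a decision or existing code: the severity order, the `RecoveryPolicy` names, and `recoveryGroups` are assumptions made for illustration.

```typescript
// Assumed severity order, lowest to highest (metric threshold example).
const SEVERITY_ORDER = ['warning', 'critical'] as const;
type Severity = (typeof SEVERITY_ORDER)[number];

// Hypothetical policies: notify only the highest group reached, or every group reached.
type RecoveryPolicy = 'highest-only' | 'every-group';

// Given the action groups an alert passed through, return the groups
// that should receive a recovered notification under the chosen policy.
function recoveryGroups(history: readonly Severity[], policy: RecoveryPolicy): Severity[] {
  // Keep each severity that appears in the history, in ascending order.
  const seen = SEVERITY_ORDER.filter((s) => history.indexOf(s) !== -1);
  if (seen.length === 0) {
    return [];
  }
  return policy === 'every-group' ? [...seen] : [seen[seen.length - 1]];
}
```

For a history of warning followed by critical, `highest-only` yields a single recovered notification for the critical group, while `every-group` yields one for each group that was notified.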
