[Uptime] Allow logical AND in Monitor Status rule #391

justinkambic · 2021-10-21T14:26:37Z

Is your feature request related to a problem? Please describe.
We received a request from a user to be able to receive alerts only when their service is down in all locations. Today, if we configure an alert with multiple locations and any one of them goes down, the rule will become active.

Describe the solution you'd like
We should add an option that allows users to make it so a rule only activates when all of the specified locations are simultaneously unavailable.

Describe alternatives you've considered
N/A

Additional context
There may be some workaround possible with custom query logic, but if we want users to be able to do this it will be worthwhile to add the functionality as part of the standard UI flow.

sanjaruzic · 2021-11-19T13:33:42Z

Adding a specific customer use case:

The customer has several heartbeat instances monitoring the same URLs from different locations.
Currently it is not possible to group the alerts coming from different heartbeat instances for the same URL.

The customer wants to get an alert if 1 instance is not reporting.
The SQL equivalent would be
select count(distinct(observer.geo.name)) from 'heartbeat-7*' < X
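
For illustration only, the distinct-count check above could be sketched in Python roughly as follows. The nested dict layout of each document and the helper names `distinct_locations` and `too_few_locations` are assumptions made for this sketch, not part of Heartbeat or Kibana; `expected` plays the role of X.

```python
from typing import Any, Iterable, Mapping, Set


def distinct_locations(docs: Iterable[Mapping[str, Any]]) -> Set[str]:
    """Collect the distinct observer.geo.name values seen in the documents.

    Assumes each document is a nested dict (hypothetical layout) such as
    {"observer": {"geo": {"name": "az-1"}}}; documents without the field are skipped.
    """
    names: Set[str] = set()
    for doc in docs:
        name = doc.get("observer", {}).get("geo", {}).get("name")
        if name is not None:
            names.add(name)
    return names


def too_few_locations(docs: Iterable[Mapping[str, Any]], expected: int) -> bool:
    """Return True when fewer than `expected` distinct locations are reporting."""
    return len(distinct_locations(docs)) < expected
```

With three expected instances and documents from only two distinct locations, `too_few_locations(docs, 3)` would return True, which corresponds to the "1 instance is not reporting" case.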

justinkambic · 2021-11-19T16:14:14Z

We're going to evaluate putting this request onto our roadmap. Earliest possible target would be 8.1, but at this point we haven't committed to working on it at all.

If it does get added you'll be able to track the board it's on (projects link on the sidebar) and the issue's target version label.

paulb-elastic · 2021-12-13T14:40:53Z

@justinkambic the main description (alert when all are down), seems different to #391 (comment) suggesting an alert if one location is down. Can you clarify?

justinkambic · 2021-12-13T15:38:57Z

the main description (alert when all are down), seems different to #391 (comment) suggesting an alert if one location is down. Can you clarify?

Per the original forum request:

Having Heartbeat deployed to multiple hosts, I would like to be alerted only when a monitor (e.g. ICMP probe on "example.org") fails on all of them.

Today we will trigger an alert for a rule when any location is down. This is the canonical case, and today we're not accounting for the more esoteric choice of wanting to know only when all are down.
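
To make the difference concrete, here is a minimal sketch of the two behaviors. The function name `rule_is_active`, the `require_all` flag, and the `"up"`/`"down"` status strings are hypothetical and chosen only for this example; this is not how the Uptime rule is implemented.

```python
from typing import Mapping


def rule_is_active(status_by_location: Mapping[str, str], require_all: bool = False) -> bool:
    """Decide whether a monitor status rule should fire (illustrative only).

    status_by_location maps a location name to "up" or "down".
    require_all=False mirrors the behavior described above (any location down).
    require_all=True is the requested logical-AND behavior (all locations down).
    """
    down_flags = [status == "down" for status in status_by_location.values()]
    if not down_flags:
        # Assumption for this sketch: with no data at all, the rule stays inactive.
        return False
    return all(down_flags) if require_all else any(down_flags)
```

For example, with `{"us-east": "down", "eu-west": "up"}` the default mode reports the rule as active, while `require_all=True` does not.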

paulb-elastic · 2021-12-14T10:58:39Z

Pinging @andrewvc re @justinkambic's comment

paulb-elastic · 2022-01-10T15:05:47Z

@drewpost to find out more about the why behind this, for example whether it is to handle the unreliability of ICMP, which could perhaps be addressed with retry capabilities.

huemac · 2022-01-10T23:36:01Z

Hi @paulb-elastic
This is useful when we have Heartbeat deployed to multiple availability zones (e.g. in Azure) and we only want to be alerted if an endpoint is reported down in ALL the AZs.
It may be acceptable for the monitored application to be inaccessible from some AZs. But if all Heartbeat deployments are reporting the endpoint as "Down", then there is a real issue that needs to be acted on.

kevinnoel-be · 2023-03-01T10:11:43Z

We have the same use case here. We have Heartbeat instances deployed in multiple AZs and we need to know when the monitor is down from every observer's point of view.
Also, the current Uptime monitor status check only allows conditions like "Matching monitors are down > X times within last Y minutes", which is not very useful in this setup because we may have a variable number of Heartbeat instances (i.e. at least one, expected two).
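
One way to picture a check that copes with a variable number of observers is to group the most recent result per observer for each monitor and flag the monitor only when every observer that reported sees it as down. The sketch below assumes a flat record shape with the hypothetical keys `monitor_id`, `observer`, `timestamp`, and `status`; it is not based on the actual Uptime rule code.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Mapping, Tuple


def monitors_down_everywhere(checks: Iterable[Mapping[str, Any]]) -> List[str]:
    """Return the IDs of monitors whose latest check is "down" on every reporting observer.

    Each check is assumed (hypothetically) to look like:
    {"monitor_id": "m1", "observer": "az-1", "timestamp": 10, "status": "down"}
    """
    # Keep only the most recent check per (monitor, observer) pair.
    latest: Dict[Tuple[str, str], Mapping[str, Any]] = {}
    for check in checks:
        key = (check["monitor_id"], check["observer"])
        if key not in latest or check["timestamp"] > latest[key]["timestamp"]:
            latest[key] = check

    # Collect the latest statuses per monitor, however many observers reported.
    statuses: Dict[str, List[str]] = defaultdict(list)
    for (monitor_id, _observer), check in latest.items():
        statuses[monitor_id].append(check["status"])

    return sorted(m for m, seen in statuses.items() if all(s == "down" for s in seen))
```

Because the condition is evaluated over whichever observers actually reported, it behaves the same with one instance or two, rather than depending on a fixed "> X times" count.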

paulb-elastic · 2023-12-21T13:11:25Z

Captured in elastic/kibana#153571

justinkambic added the enhancement label Oct 21, 2021

paulb-elastic closed this as not planned Dec 21, 2023
