Ruler does not consistently restore `for` state #6465

rajagopalanand · 2024-12-29T17:06:44Z

Description

Currently, the Prometheus rule manager restores the `for` state of rule groups only after a restart. This is fine for Prometheus. In Cortex, however, a rule group can move from one ruler instance (r1) to another (r2) due to resharding. If r2 is already evaluating rule groups for that tenant, its manager does not restore the `for` state for the newly assigned group, and alerts end up in an incorrect state. For example, an alert can go from FIRING back to PENDING.

To Reproduce

1. Create rules for a tenant with shard size > 1. For ease of testing, all the ruler instances were running rules for the tenant.
2. Wait for the alerting rule to go into FIRING.
3. Restart the instance that was evaluating the alerting rule. This assumes the ruler takes long enough to restart that another ruler gets a chance to evaluate the alerting rule at least once.
4. The alerting rule goes to PENDING.
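
The sequence above can be modeled with a small toy simulation. This is only a sketch in Python, not Prometheus or Cortex code: the `Ruler` class, the `restore_on_new_group` flag, the alert names, and the 5-minute hold are all hypothetical, and the `stored` dict merely stands in for the persisted `ALERTS_FOR_STATE` series that Prometheus uses to remember when an alert became active. It assumes a manager restores that timestamp only on its first evaluation after starting, which is the behavior described in this issue.

```python
from dataclasses import dataclass, field
from typing import Dict, List

HOLD = 300  # hypothetical `for: 5m`, in seconds


@dataclass
class Ruler:
    """Toy ruler that keeps each alert's activation time in memory only."""
    name: str
    restore_on_new_group: bool = False  # hypothetical flag modeling a fix
    started: bool = False               # True after the first evaluation
    active_at: Dict[str, float] = field(default_factory=dict)

    def evaluate(self, alert: str, now: float, stored: Dict[str, float]) -> str:
        if alert not in self.active_at:
            # Assumption: state is restored only right after a (re)start,
            # unless the hypothetical flag also restores for new groups.
            may_restore = (not self.started) or self.restore_on_new_group
            if may_restore and alert in stored:
                self.active_at[alert] = stored[alert]
            else:
                self.active_at[alert] = now  # the `for` timer starts over
        self.started = True
        stored[alert] = self.active_at[alert]  # persist the activation time
        elapsed = now - self.active_at[alert]
        return "FIRING" if elapsed >= HOLD else "PENDING"


def simulate(restore_on_new_group: bool) -> List[str]:
    stored: Dict[str, float] = {}
    r1 = Ruler("r1", restore_on_new_group)
    r2 = Ruler("r2", restore_on_new_group)
    # r2 is already evaluating another rule for the same tenant.
    r2.evaluate("OtherAlert", 0, stored)
    # r1 evaluates the alert until it fires.
    states = [r1.evaluate("HighLatency", t, stored) for t in (0, 60, 300, 360)]
    # r1 restarts; resharding hands the rule group to r2.
    states.append(r2.evaluate("HighLatency", 420, stored))
    return states


if __name__ == "__main__":
    print("without restore:", simulate(False))
    print("with restore:   ", simulate(True))
```

In this model the last state is PENDING when the already-running r2 does not restore the stored activation time, and FIRING when it does, which matches the reported and the expected behavior respectively.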

Expected behavior

The alerting rule should stay in the FIRING state.

Additional Context

There is an open PR against Prometheus that addresses this issue. Until that PR is approved, it is difficult to fix this in Cortex.


yeya24 · 2025-03-03T03:43:07Z

The upstream Prometheus PR has been merged. We can bring in the fix by upgrading Prometheus to the latest release, v3.2.x.
Help wanted.

rajagopalanand changed the title ~~Ruler do not consistently restore for state~~ Ruler does not consistently restore for state Dec 29, 2024

dosubot bot added the component/rules Bits & bobs todo with rules and alerts: the ruler, config service etc. label Dec 29, 2024

yeya24 added the help wanted label Mar 3, 2025
