I'm running Reaper in sidecar mode and trying to monitor it via Prometheus metrics. I've created a dashboard showing repair progress along with the time since the last scheduled repair. Repair progress works fine, but the time since the last repair shows a correct value for only one schedule; the others are stuck at the epoch.
The Reaper web UI shows that all schedules have run within the last 7 days and that the schedules are still active.
When looking at the prometheusMetrics endpoint, however, the exported values are wrong.
This is not a prometheus-exporter issue, though: the Dropwizard report shows the same problem.
Looking into it, I found that milliseconds since the epoch is the fallback value for a repair schedule when no repairs from that schedule have completed.
Digging a bit deeper revealed that the last_run field of repair_schedule_v1 is null for all but one entry. That makes millisSinceLastRepairForSchedule fall back to returning the time since the epoch.
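To illustrate the fallback described above, here is a minimal sketch of how such a gauge value could end up at "time since the epoch". This is not the actual Reaper code: the class `LastRepairGaugeSketch`, the helper `millisSinceLastRepair`, and its parameters are hypothetical names. The sketch assumes that a null last_run leaves the gauge with no known repair end time.

```java
import java.time.Duration;
import java.time.Instant;

public class LastRepairGaugeSketch {

    // Hypothetical helper, not Reaper's implementation.
    // lastRepairEnd is assumed to be null when the schedule's last_run is null.
    static long millisSinceLastRepair(Instant lastRepairEnd, Instant now) {
        // Fallback: with no known completed repair, measure from the epoch,
        // which yields a value equal to the current epoch timestamp in millis.
        Instant reference = (lastRepairEnd != null) ? lastRepairEnd : Instant.EPOCH;
        return Duration.between(reference, now).toMillis();
    }

    public static void main(String[] args) {
        Instant now = Instant.now();

        // A schedule whose last repair finished two days ago reports two days in millis.
        System.out.println(millisSinceLastRepair(now.minus(Duration.ofDays(2)), now));

        // A schedule with a null last_run reports the full time since 1970-01-01.
        System.out.println(millisSinceLastRepair(null, now));
    }
}
```

On a dashboard, the second case looks like a schedule that has not been repaired for decades, which matches what I see for all but one schedule.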
This metric also does not work with multiple hosts (for a 3-node cluster), which may explain why you have only one KS metric. Alternatively, you can use 7days - io_cassandrareaper_service_RepairRunner_millisSinceLastRepair :)
cassandra-reaper/src/server/src/main/java/io/cassandrareaper/service/RepairScheduleService.java
Line 190 in 0661688
┆Issue is synchronized with this Jira Story by Unito
┆Issue Number: REAP-18