Expire gauges if no updates #1

Merged: kj800x merged 2 commits into master from kj/watchdog on Jan 18, 2024

kj800x (Owner) commented Jan 18, 2024

I noticed that our data got "stuck" on a value and flatlined. On further debugging, it seems the Tempest stopped sending UDP messages, but my alerting didn't fire because metrics were still being published, just with stale values. This is because a gauge holds onto its previous value indefinitely and never expires it (which makes sense in some situations, but not others).

image

This PR wraps the prom-client Gauge instances to keep track of a "last seen" time. If a gauge hasn't been updated within a given period, the wrapper automatically removes it. Once the gauge is set again, the wrapper resets the "last seen" time and publishes the gauge's value for another timeout period. This doesn't solve the underlying issue of the Tempest failing to publish UDP messages, but it should improve the rest of the ingestion / alerting pipeline by correctly clearing those metrics when we haven't gotten an update.
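For readers who want a concrete picture of the "last seen" idea, here is a minimal sketch. It is not the code from this PR: the `GaugeLike` interface, the `ExpiringGauge` class, and the `expireIfStale` method are hypothetical names chosen for illustration, and the sketch assumes the wrapped metric can be told to stop exporting its value through a `remove()` callback. How that maps onto prom-client's own methods is left open here.

```typescript
// Minimal shape of the metric being wrapped. Both methods are assumptions
// for this sketch, not a statement of how the PR calls into prom-client.
interface GaugeLike {
  set(value: number): void;
  remove(): void; // hypothetical: stop exporting this gauge's value
}

class ExpiringGauge {
  private readonly gauge: GaugeLike;
  private readonly timeoutMs: number;
  private readonly now: () => number;
  // null means "nothing currently exported" (never set, or already expired).
  private lastSeenMs: number | null = null;

  constructor(gauge: GaugeLike, timeoutMs: number, now: () => number = Date.now) {
    this.gauge = gauge;
    this.timeoutMs = timeoutMs;
    this.now = now;
  }

  // Record the value and restart the timeout window.
  set(value: number): void {
    this.lastSeenMs = this.now();
    this.gauge.set(value);
  }

  // Intended to be called periodically (for example from a timer).
  // Returns true only when this call removed the gauge.
  expireIfStale(): boolean {
    if (this.lastSeenMs === null) return false;
    if (this.now() - this.lastSeenMs < this.timeoutMs) return false;
    this.gauge.remove();
    this.lastSeenMs = null;
    return true;
  }
}
```

The clock is injected so the expiry logic can be exercised without waiting for real time to pass. A later `set` call simply starts a new timeout window, which matches the behaviour described above.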

Other metrics libraries have this feature built in, such as Rust's metrics (see example usage here). npm's prom-client doesn't seem to support it, so I may file a feature request to see whether it is feasible. The feature should be possible to add in a backwards-compatible way (by default, don't expire metrics).

kj800x merged commit 7d9f1a7 into master on Jan 18, 2024
1 check passed
kj800x deleted the kj/watchdog branch on January 18, 2024 15:08
kj800x (Owner, Author) commented Jan 18, 2024

Filed a feature request here: siimon/prom-client#607
