Add option to have /healthz probes to be tied to database(s) connectivity #143

alsin · 2024-11-27T15:55:20Z

It could be useful to let K8s health probing fail sql-exporter when its databases are unreachable: /healthz would check that a connection to every configured DB can be established, and fail the probe otherwise.

alsin · 2025-01-09T09:16:17Z

Hello, is there any chance this feature request could be merged? We'd very much appreciate it for our setup. Thank you.

dewey · 2025-01-09T10:50:19Z

Hey, I'm not sure it's a good idea to kill the service if one database can't connect. That way your metrics would drop out for all databases just because one database has a connection issue or is being restarted.

A better way to do that would be to write an alert based on the absence of a metric, or an outdated timestamp if you export one through a metric.

alsin · 2025-01-09T12:54:16Z

Hi @dewey, thanks for the quick reply. The point of this behavior is that our database passwords are rotated regularly. When the respective secret changes, the sql_exporter container suddenly loses the ability to connect to the database, and with this feature enabled K8s (which we use for container provisioning) restarts it automatically.

I can imagine a situation where sql_exporter only has jobs configured for nightly metrics scraping. Such metrics would be outdated during the day, and nobody would notice that something is wrong with sql_exporter, as it doesn't need to connect to the DB until the night hours.

dewey · 2025-01-09T13:10:20Z

Thanks for describing your use case @alsin, I understand your reasoning! I still don't think it's a good fit to merge: it would work well for your use case, but for others that could be very unexpected.

I'd maybe look into triggering a restart of the service through other means in that case.

alsin · 2025-01-09T14:29:49Z

"...but for others that could be very unexpected."

That's why my PR proposes optional behavior that is off by default. To use it, one has to enable it intentionally by setting the db.connectivity-as-healthz flag to true. Otherwise the healthcheck endpoint works just as it did before.
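
For illustration, the opt-in behavior described in this comment could look roughly like the Go sketch below. It is not the code from the PR. Only the flag name db.connectivity-as-healthz comes from the thread; the names pinger, stubDB, healthzStatus, newHealthzHandler and registerHealthzFlag, the 2-second ping timeout and the choice of 503 as the failure status are assumptions made for this example.

```go
package main

import (
	"context"
	"database/sql"
	"errors"
	"flag"
	"net/http"
	"time"
)

// pinger is the part of *sql.DB the probe needs, so stubs can stand in for it.
type pinger interface {
	PingContext(ctx context.Context) error
}

// Compile-time check that a real connection pool satisfies pinger.
var _ pinger = (*sql.DB)(nil)

// stubDB is a hypothetical stand-in for one configured connection.
type stubDB struct{ up bool }

func (s stubDB) PingContext(context.Context) error {
	if s.up {
		return nil
	}
	return errors.New("connection refused")
}

// healthzStatus returns the status code /healthz should report.
// With checkDBs false (the default) it is always 200, as before.
// With checkDBs true it is 503 as soon as one connection fails to ping.
func healthzStatus(checkDBs bool, dbs []pinger) int {
	if !checkDBs {
		return http.StatusOK
	}
	// Assumed timeout so a hanging database cannot block the probe forever.
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	for _, db := range dbs {
		if err := db.PingContext(ctx); err != nil {
			return http.StatusServiceUnavailable
		}
	}
	return http.StatusOK
}

// newHealthzHandler wires the parsed flag value into an http.Handler.
func newHealthzHandler(checkDBs *bool, dbs []pinger) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(healthzStatus(*checkDBs, dbs))
	})
}

// registerHealthzFlag defines the opt-in flag on fs; it defaults to false.
func registerHealthzFlag(fs *flag.FlagSet) *bool {
	return fs.Bool("db.connectivity-as-healthz", false,
		"fail /healthz when a configured database is unreachable")
}
```

With the flag left at its default, healthzStatus never touches the databases, so existing deployments would see no change in probe behavior.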

alsin · 2025-01-10T14:56:33Z

As an alternative, we could require all databases to be unreachable and only then kill the service.
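
A minimal Go sketch of this alternative, under stated assumptions: the probe fails only when every configured database is unreachable, so a single broken connection does not take down metrics for the others. The names pinger, stubDB, healthzStatusAllDown and pingTimeout are hypothetical, as is treating an empty configuration as healthy; none of this is taken from the PR.

```go
package main

import (
	"context"
	"errors"
	"net/http"
	"time"
)

// pinger is the part of *sql.DB the probe needs; *sql.DB has this method.
type pinger interface {
	PingContext(ctx context.Context) error
}

// stubDB is a hypothetical stand-in for one configured connection.
type stubDB struct{ up bool }

func (s stubDB) PingContext(context.Context) error {
	if s.up {
		return nil
	}
	return errors.New("connection refused")
}

// pingTimeout is an assumed upper bound for one whole probe run.
const pingTimeout = 2 * time.Second

// healthzStatusAllDown returns 503 only if there is at least one
// configured database and none of them answers a ping.
func healthzStatusAllDown(dbs []pinger) int {
	if len(dbs) == 0 {
		return http.StatusOK // nothing configured, nothing to fail on
	}
	ctx, cancel := context.WithTimeout(context.Background(), pingTimeout)
	defer cancel()
	for _, db := range dbs {
		if db.PingContext(ctx) == nil {
			return http.StatusOK // one reachable database is enough
		}
	}
	return http.StatusServiceUnavailable
}
```

The pings run sequentially and share one deadline, which keeps the sketch short; with many slow databases, concurrent pings with a per-database timeout would likely be a better fit.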

dewey · 2025-01-16T15:31:33Z

I think that would make more sense. The code would also need some formatting; I think a gofmt run is missing, as a lot of new lines are being introduced.

Alexander Sinuskin added 5 commits November 27, 2024 16:47

Add option to have /healthz probes to be tied to database(s) connectivity

012d80d

Just a NL

7e848fb

Debug-log found connections

7c01c9d

better debug logging

34bfd7a

Fix the /healthz probe handler logic

144521f

alsin marked this pull request as ready for review December 4, 2024 12:47

Revert the NL addition

dc297c7
