Support Healthchecks.io monitoring #933

Open
stin7 opened this issue Aug 11, 2024 · 7 comments

Comments

@stin7

stin7 commented Aug 11, 2024

Summary

Add a --healthchecks_url (similar to Borgmatic) param or a more generic ping_url_on_success param

Context

I use healthchecks for monitoring important processes. I would like to integrate icloudpd into that system. The simplest way would be a param that accepts a URL that icloudpd will ping on successful download.

Perhaps there are other ways to handle this as well.

@AndreyNikiforov
Collaborator

What information are you using for monitoring and what actions are you taking from it?

We probably need some kind of "is it working" metric and an alert on it. A derivative of that would be some kind of reliability score.

There may also be a need for a velocity-type metric as a guide for optimizing and adjusting behavior, something like "age of downloaded bytes"...

Architecture-wise, I am leaning towards a pull mechanism for metrics (icloudpd exposes metrics on an HTTP endpoint and the monitoring/alerting service pulls the data, like Prometheus). Note that I am looking at icloudpd as a service that keeps my iCloud collection synchronized with local storage, not as a batch script that I run periodically.
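
For illustration, the pull approach described above could look roughly like the following. This is a minimal sketch, not anything icloudpd ships: the metric name, the `STATE` dictionary, the port, and `make_server` are all assumptions made up for the example. It only shows the shape of a `/metrics` endpoint in the Prometheus text exposition format.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical shared state; a real service would update this after each sync.
STATE = {"last_success": 0.0}


def render_metrics(last_success: float) -> str:
    """Render one gauge in the Prometheus text exposition format."""
    name = "icloudpd_last_successful_sync_timestamp_seconds"  # assumed name
    return (
        f"# HELP {name} Unix time of the last successful sync.\n"
        f"# TYPE {name} gauge\n"
        f"{name} {last_success}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only /metrics is served; everything else is a 404.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(STATE["last_success"]).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port: int = 9123) -> HTTPServer:
    """Build (but do not start) the server; the port is an arbitrary choice."""
    return HTTPServer(("127.0.0.1", port), MetricsHandler)
```

Calling `make_server().serve_forever()` in a background thread would expose the endpoint; the alerting rule (for example "last success older than N hours") would then live on the monitoring side.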

@stin7
Author

stin7 commented Aug 12, 2024

> What information are you using for monitoring and what actions are you taking from it?

If a process/service doesn't successfully ping, then I get an alert about the process/service from healthchecks to go figure out what happened and get it back to green. (Healthchecks Intro: https://healthchecks.io/docs/ )

So for this project, it would be good to know that for some reason (most likely need to reauth, but it could be anything) my icloud photos aren't being backed up anymore and I should get it back online.

@AndreyNikiforov
Collaborator

> If a process/service doesn't successfully ping, then I get an alert about the process/service from healthchecks to go figure out what happened and get it back to green. (Healthchecks Intro: https://healthchecks.io/docs/ )

The service is performing periodic iCloud checks. I assume that pinging icloudpd to check if it is still running would be of little value. We would probably need to know whether the [last] expected check was performed. There is also a distinction between the reasons why the expectation was not met: if a password was needed but was not provided by the user, then icloudpd was technically healthy.

> So for this project, it would be good to know that for some reason (most likely need to reauth, but it could be anything) my icloud photos aren't being backed up anymore and I should get it back online.

Yes, if the expected check was not performed, then the user needs to be notified/alerted to correct the issue. Kind of a watchdog. It should probably be implemented on the monitoring/alerting side, so that if the service is not running at all, we still notify the user.

Thanks for helping brainstorm the issue. I need to dig into healthchecks.io to learn more and come up with a solution for icloudpd.

@stin7
Author

stin7 commented Aug 12, 2024

Thanks. Just to clarify one thing, healthchecks.io acts as a "dead man's switch". On healthchecks you specify how long it should wait for a successful ping from a service before sending an alert to you.

So the change in icloudpd would be simple: at the end of a sync, run "curl '<user-provided healthchecks URL>'".
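
As a rough illustration of that idea in Python (icloudpd's own language), a success ping could be as small as the sketch below. The `ping` helper and its `opener` parameter are hypothetical names for this example only; nothing here is an existing icloudpd option, and the URL is a placeholder. Failures are swallowed on purpose so that a monitoring outage would not break the sync itself.

```python
import urllib.request


def ping(url: str, timeout: float = 10.0, opener=urllib.request.urlopen) -> bool:
    """Send a GET to `url`; return True on a 2xx response, False otherwise.

    `opener` is injectable so the function can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


# Hypothetical call site, after a sync finished without errors:
#     ping("https://hc-ping.com/your-uuid-here")
```

Because the check is a dead man's switch, only the success path needs to call `ping`; a missing ping is what triggers the alert.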

@AndreyNikiforov
Collaborator

You can use the --notification-script parameter to record in the health service that a password needs to be entered.

@stin7
Author

stin7 commented Aug 13, 2024

Thanks, I missed that option. That seems good for when icloudpd knows there is an issue, so I'll set that up to curl the /fail endpoint on healthchecks.
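
A notification script along those lines might look like this sketch. It assumes the Healthchecks convention of appending `/fail` to the check's ping URL; the helper names, the placeholder URL, and the idea of sending a short message in the request body are assumptions for illustration, and how icloudpd invokes the script (arguments, environment) is not covered here.

```python
import urllib.request


def fail_url(ping_url: str) -> str:
    """Derive the failure endpoint from a base ping URL."""
    return ping_url.rstrip("/") + "/fail"


def report_failure(ping_url: str, message: str = "", timeout: float = 10.0,
                   opener=urllib.request.urlopen) -> bool:
    """POST `message` to the /fail endpoint; return True on a 2xx response."""
    request = urllib.request.Request(
        fail_url(ping_url),
        data=message.encode("utf-8"),  # optional diagnostic text
        method="POST",
    )
    try:
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # Never let a monitoring failure raise out of the script.
        return False


# Hypothetical use inside the script passed to --notification-script:
#     report_failure("https://hc-ping.com/your-uuid-here", "password needed")
```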

Perhaps there could be a new --success-script option to ping healthchecks to catch when the service goes down for any reason

@isundaylee

stumbled on this issue while looking for a way to integrate this with prometheus to set up alerts for "last icloudpd update > x days ago". i think if we had --success-script, it would allow integrating into prometheus (by having the script write a node_exporter textfile to be picked up by prometheus), as well as other monitoring solutions.
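
to make the textfile idea concrete, here is a minimal sketch of what such a success script could do. note that --success-script is only a proposal in this thread, and the metric name, file name, and function name below are made up for the example. the file is written to a temp file and then renamed so node_exporter never reads a half-written file.

```python
import os
import tempfile
import time


def write_success_metric(directory, now=None):
    """Atomically write a last-success gauge as <directory>/icloudpd.prom."""
    timestamp = int(time.time() if now is None else now)
    name = "icloudpd_last_success_timestamp_seconds"  # assumed metric name
    content = (
        f"# HELP {name} Unix time of the last successful icloudpd run.\n"
        f"# TYPE {name} gauge\n"
        f"{name} {timestamp}\n"
    )
    target = os.path.join(directory, "icloudpd.prom")
    # Temp file in the same directory so os.replace is a same-filesystem rename.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(content)
        # mkstemp creates the file as 0600; make it readable by node_exporter.
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, target)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return target
```

the directory would be whatever node_exporter's textfile collector is configured to read, and an alert such as "time() minus this gauge is greater than x days" could then be defined in prometheus.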

i can also potentially take a look at implementing this if people think it's a reasonable approach.
