be able to persist failed probe history longer #350
Comments
The history is only intended for low volume recent debugging. If you want something more medium to long term, you should enable debug logging. |
I'm getting ~800 kB/minute from debug logging on a pretty small instance (36 targets, 5s scrape interval). I think a log level that only shows basic info for failed probes would be desirable: I want to know which step failed (name resolution, connection, response, content) and maybe which IP was resolved, but I don't need much more. Would you consider a PR that changes results logging in such a way? |
Info-level logging is only for blackbox-exporter level issues, which a probe failure is not, while debug logging is for everything so I'm not seeing the scope for that. |
I still feel that failed probe records need to persist longer. This is not about logging but about the recent probes HTML page, which is super helpful for debugging what went wrong. |
You can always change history.limit |
When probing a large number of URLs, the history is quickly overwritten. |
I'd like to see this feature added. I'm running into the exact issue that @QingsongYao is describing. One of our probes has occasional failures that I would like to debug. The last instance of this was 1.5 hours ago. We currently run about 200 probes a minute, so to keep this history we'd need to keep the last 18,000 scrapes, of which 17,995 are uninteresting. Much as Kubernetes CronJobs have separate history limits for successful and failed jobs, I think the blackbox-exporter should support something similar. |
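To make the request concrete, here is a minimal sketch of a history with separate limits for successes and failures. It is not the blackbox-exporter's actual code: `Result`, `History`, `NewHistory`, and both limit parameters are hypothetical names chosen for illustration. The `main` function replays the numbers above (200 probes a minute for 90 minutes, 18,000 in total) with a single early failure.

```go
package main

import "fmt"

// Result is a hypothetical record of one probe run.
type Result struct {
	ID      int
	Target  string
	Success bool
}

// ring keeps at most limit results and drops the oldest one first.
type ring struct {
	limit int
	items []Result
}

func (r *ring) add(res Result) {
	if r.limit <= 0 {
		return // a non-positive limit keeps nothing
	}
	if len(r.items) >= r.limit {
		// Shift left by one to evict the oldest entry, then reuse the last slot.
		copy(r.items, r.items[1:])
		r.items[len(r.items)-1] = res
		return
	}
	r.items = append(r.items, res)
}

// History stores successes and failures in separate bounded buffers,
// so a flood of successes cannot evict a rare failure.
type History struct {
	successes ring
	failures  ring
}

// NewHistory builds a History; both limits are assumptions for this sketch.
func NewHistory(successLimit, failureLimit int) *History {
	return &History{
		successes: ring{limit: successLimit},
		failures:  ring{limit: failureLimit},
	}
}

// Add files the result under the buffer that matches its outcome.
func (h *History) Add(res Result) {
	if res.Success {
		h.successes.add(res)
	} else {
		h.failures.add(res)
	}
}

// Failures returns a copy of the retained failed results, oldest first.
func (h *History) Failures() []Result {
	return append([]Result(nil), h.failures.items...)
}

// Successes returns a copy of the retained successful results, oldest first.
func (h *History) Successes() []Result {
	return append([]Result(nil), h.successes.items...)
}

func main() {
	const probesPerMinute, minutes = 200, 90
	h := NewHistory(100, 10)
	for i := 1; i <= probesPerMinute*minutes; i++ {
		// Only the very first probe fails in this simulation.
		h.Add(Result{ID: i, Target: "example", Success: i != 1})
	}
	fmt.Println("retained successes:", len(h.Successes()))
	fmt.Println("retained failures:", len(h.Failures()))
}
```

With a single shared limit of 100, the failure from probe 1 would have been evicted long before probe 18,000. With the split buffers it is still retained, while memory use stays bounded by the sum of the two limits.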
Feel free to send a PR |
Sometimes we only care about the failed probe logs. However, because there are many successful probe logs, the failed ones are removed once the history.limit value is reached.