Slack handler sometimes hangs #15
Comments
+1
What do you mean by 'extension'? I'd happily merge PRs that deal better with failures.
I have seen similar issues. Mainly, I would like the handler to keep retrying gracefully and, once it finally does reach Slack, tell me how many times it failed.
@xyntrix Did you ever happen to start your PR?
If anyone wants to create this as an extension, there is a new https://github.com/sensu-extensions org, and I would be more than happy to set up a repo and merge a sane PR.
Hi. I have just created a pull request for the multi-channel version of the handler which includes, among other things, a retry and timeout strategy for contacting the Slack API. Have a look at https://github.com/sensu-plugins/sensu-plugins-slack/pull/82/files#diff-0f8d1e04e6833eb356bb3b51869ef271R266
…lack webhook

In certain scenarios the Slack webhook delivery might fail for several reasons:
- network issues
- rate limit exceeded
- internal server errors on the Slack API side

In those cases the call to the webhook might fail and our message might not get delivered, or worse, the call can leave our handler hanging for too long.

This commit implements a customizable retry strategy that tries to deliver the message to the webhook several times, with a timeout on each attempt. It also implements a sleep time between retries. All three settings can be customized in the JSON config of the handler, with defaults of 5 retries, 5-second sleeps in between, and a 10-second timeout for each try.

This should incidentally solve issue sensu-plugins#15
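For readers who want a concrete picture of the retry strategy described in that commit message, here is a minimal Ruby sketch. It is not the code from the pull request: the method name `deliver_with_retries` and its keyword arguments are hypothetical, and only the default values (5 retries, 5-second sleep, 10-second timeout) are taken from the commit message. It also returns the number of failed attempts, which is one way to report how often delivery failed before succeeding.

```ruby
require 'timeout'

# Hypothetical helper, not the handler's actual implementation.
# Runs the given block up to `retries` times. Each attempt is aborted
# after `timeout` seconds, and the helper sleeps `sleep_time` seconds
# after a failed attempt before trying again.
# Returns [block_result, failed_attempts]; re-raises the last error
# if every attempt fails.
def deliver_with_retries(retries: 5, timeout: 10, sleep_time: 5)
  failures = 0
  begin
    # The block receives the 1-based attempt number.
    result = Timeout.timeout(timeout) { yield(failures + 1) }
    [result, failures]
  rescue StandardError => e
    # Timeout::Error is a StandardError, so timeouts are retried too.
    failures += 1
    raise e if failures >= retries
    sleep(sleep_time)
    retry
  end
end
```

In a handler, the block would wrap the HTTP POST to the webhook, and the returned failure count could be appended to the message or logged. `Timeout.timeout` can interrupt the block at any point, so setting `open_timeout` and `read_timeout` on `Net::HTTP` may be a safer way to bound each attempt.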
The default Slack handler sometimes hangs if the network fails, leading to high load on the Sensu server.
I will be submitting a PR that rewrites this as an extension, with a few protections around failure conditions.