Tweak numbers related to retries, and message delivery guarantee #106

Merged
merged 3 commits into from
Dec 27, 2018

JeanMertz
Contributor

I've seen situations where our cluster returns errors and doesn't recover immediately, but does after a few seconds. By default, the retry logic retries after 100ms, which is too short for the cluster to recover.

With this change:

  • a processor will retry 5 times
  • there are 15 seconds between each retry, giving the broker a total of 1 minute and 15 seconds to recover before the processor exits with an error
  • the processor now waits for an Ack from all brokers, not just the leader, before marking a publish action as a success.

All of these values can be configured, but the defaults favour stability and correctness over speed. In general, though, I don't think these changes will have a big impact on performance, because the producer already batches messages (10.000 by default) before delivering them to the brokers.

This improves message delivery guarantees. The producer already has
`batch.num.messages` set to 10.000, so it won't wait for acks on each
message. If more performance is needed, at the cost of reliability, this
can be tweaked using `KafkaRequireLeaderAck`.
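
As a rough illustration of the trade-off in this commit message, the sketch below maps a boolean onto librdkafka-style producer properties. It assumes the producer is configured through `request.required.acks` (where `-1` waits for all in-sync replicas and `1` waits for the leader only) alongside `batch.num.messages`. The function `producerConfig` and its parameter are hypothetical stand-ins for `KafkaRequireLeaderAck`, whose actual type and wiring in this library may differ.

```go
package main

import "fmt"

// producerConfig is a hypothetical helper, not part of this library.
// leaderAckOnly stands in for the KafkaRequireLeaderAck option.
func producerConfig(leaderAckOnly bool) map[string]interface{} {
	acks := -1 // wait for all in-sync replicas: slower, more reliable
	if leaderAckOnly {
		acks = 1 // wait for the partition leader only: faster, less reliable
	}
	return map[string]interface{}{
		"request.required.acks": acks,
		"batch.num.messages":    10000, // acks are awaited per batch, not per message
	}
}

func main() {
	fmt.Println(producerConfig(false)["request.required.acks"])
	fmt.Println(producerConfig(true)["request.required.acks"])
}
```
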
@blendle-hubbit

Hello @JeanMertz, Hubbit here, Your Friendly Blendle Bot.

I've marked this Pull Request as stale as it hasn't been updated in a while 😴.

Here are some actions you can take to recover from this enormous stigma I just
placed on your Pull Request reputation:

  • Merge this Pull Request if you think it is ready to be rolled out to
    production.

  • @-mention the people that you think can move this Pull Request forward.

  • Close this Pull Request if it is no longer valid. Remember, you can always
    re-open a Pull Request at a later point in time.

Together we can improve our code quality and development flow.

Happy Hackin'!

Defaults to 15 seconds, to give the broker time to fix itself when an
error is returned.
@JeanMertz JeanMertz merged commit 1e17c97 into master Dec 27, 2018
@JeanMertz JeanMertz deleted the retry-config branch December 27, 2018 13:44