Deadlock when sarama commits an out-of-range offset #121
Comments
Strange, I thought we were covered for that, see https://github.com/bsm/sarama-cluster/blob/master/partitions.go#L24 |
The same error is actually sent at different times. Here we test errors for newPartition, but the error I'm talking about is actually handled here: https://github.com/bsm/sarama-cluster/blob/master/partitions.go#L55 (we log the error but the partitionConsumer is never closed) |
Interesting. That error would only be logged when Return.Errors is set to
true. I'll see what I can do about it.
On 10 Apr 2017 4:22 pm, "Gaspard Douady" wrote:
The same error is actually sent at different times. Here we test errors for newPartition, but the error I'm talking about is actually handled here: https://github.com/bsm/sarama-cluster/blob/master/partitions.go#L55 (we log the error but the partitionConsumer is never closed)
|
Hey, sorry, but I cannot recreate the problem, could you please do me a favour and:
Thanks |
I will close this one for now, please re-open if you can find a way to re-create it consistently |
I am reopening this one as I saw similar behavior. I was able to log the error message as well by listening on the consumer.Errors() channel and got this one: |
I'm currently using sarama-cluster and I have an issue where my consumer does not consume events fast enough and falls behind to the point of trying to commit offsets that Kafka has already deleted. The current behavior of sarama-cluster is to close the consumer on the affected partitions and deadlock.
In sarama's handleResponse function, the choice was made to kill the child partitionConsumer on an ErrOffsetOutOfRange error and send the error upward.
sarama-cluster in turn kills its own partitionConsumer and sends the same error upward.
As of right now, the user only has the information "ErrOffsetOutOfRange", which could originate from several places in the code, and the other partitionConsumers keep going as if nothing happened.
It feels like in this case sarama-cluster could either