Account for possible offset skew #1037
Ready for review. @eapache, please take a look.
// Compacted records can lie about their offset delta, but that can be
// detected comparing the last offset delta from the record batch with
// actual offset delta of the last record. The difference (skew) should be
// applied to all records in the batch.
I'm not sure this is true? Isn't the calculated skew only guaranteed to be true for the final record in the batch? Depending on where the records got compacted out the skew could be arbitrarily distributed in the batch.
You are right. This is not correct. The problem is most likely with the v0.11 producer. Please take a look at #1040 for more details. I am closing this PR.
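The reviewer's objection can be illustrated with a small Go sketch. It is not code from this PR: `applyUniformSkew` and the sample deltas are hypothetical, and the batch assumes (purely for illustration) that the surviving records report dense deltas after compaction. Under that assumption, a skew measured at the last record fixes the last record but not necessarily the earlier ones.

```go
package main

import "fmt"

// applyUniformSkew is a hypothetical helper. It computes
// skew = lastOffsetDelta - (reported delta of the final record)
// and adds that same skew to every reported delta.
func applyUniformSkew(reported []int64, lastOffsetDelta int64) []int64 {
	if len(reported) == 0 {
		return nil
	}
	skew := lastOffsetDelta - reported[len(reported)-1]
	corrected := make([]int64, len(reported))
	for i, d := range reported {
		corrected[i] = d + skew
	}
	return corrected
}

func main() {
	// Assumed scenario: the surviving records really sit at deltas 0, 2 and 5,
	// but report 0, 1 and 2. The batch header still says LastOffsetDelta = 5.
	trueDeltas := []int64{0, 2, 5}
	reported := []int64{0, 1, 2}

	corrected := applyUniformSkew(reported, 5)
	for i := range corrected {
		fmt.Printf("record %d: true=%d corrected=%d match=%t\n",
			i, trueDeltas[i], corrected[i], trueDeltas[i] == corrected[i])
	}
}
```

In this made-up batch only the final record ends up at its true delta, which is the case the reviewer describes: where the gaps fall inside the batch cannot be recovered from a single skew value.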
@@ -552,12 +562,6 @@ func (child *partitionConsumer) parseRecords(batch *RecordBatch) ([]*ConsumerMes
	if incomplete {
		return nil, ErrIncompleteResponse
	}

	child.offset = batch.FirstOffset + int64(batch.LastOffsetDelta) + 1
	if child.offset <= originalOffset {
Perhaps the condition needs to change, but don't we still need to account for some case where the offset doesn't advance? Or is that handled by the `incomplete` logic now?
edit: the `incomplete` logic looks like it might be wrong now anyway, I'm not sure
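For reference, the check being discussed can be sketched in isolation. This is a minimal sketch, not the library's implementation: `batchHeader`, `nextOffset`, and `errOffsetNotAdvanced` are hypothetical names, and only the arithmetic `FirstOffset + LastOffsetDelta + 1` and the `<=` comparison are taken from the diff above.

```go
package main

import (
	"errors"
	"fmt"
)

// errOffsetNotAdvanced is a hypothetical sentinel error for this sketch.
var errOffsetNotAdvanced = errors.New("consumer offset did not advance")

// batchHeader is a hypothetical stand-in holding only the two batch fields
// used in the diff; the field types are assumptions.
type batchHeader struct {
	FirstOffset     int64
	LastOffsetDelta int32
}

// nextOffset returns the offset to fetch after consuming the batch, or an
// error if the batch would not move the consumer past its current offset.
func nextOffset(current int64, b batchHeader) (int64, error) {
	next := b.FirstOffset + int64(b.LastOffsetDelta) + 1
	if next <= current {
		return current, errOffsetNotAdvanced
	}
	return next, nil
}

func main() {
	b := batchHeader{FirstOffset: 100, LastOffsetDelta: 5}

	next, err := nextOffset(100, b)
	fmt.Println(next, err)

	// A consumer already past the end of the batch makes no progress.
	next, err = nextOffset(106, b)
	fmt.Println(next, err)
}
```

Keeping a guard of this shape means a batch that ends at or before the current offset surfaces as an error instead of being fetched again indefinitely.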
In our testing with Kafka v0.11.0.0 - v1.0.0 we discovered that compacted records lie about their offsets and `LastOffsetDelta` can be used to correct that error. According to the Kafka protocol document, that is indeed the intended use for this field: