Account for possible offset skew #1037

Closed

@horkhe horkhe commented Feb 7, 2018

In our testing with Kafka v0.11.0.0 through v1.0.0, we found that records in compacted batches report incorrect offsets, and that LastOffsetDelta can be used to correct the error. According to the Kafka protocol document, that is indeed the intended use of this field:

LastOffsetDelta: The offset of the last message in the RecordBatch. This is used by the broker to ensure correct behavior even when Records within a batch are compacted out.

horkhe commented Feb 8, 2018

Ready for review. @eapache , please take a look.

// Compacted records can lie about their offset delta, but that can be
// detected comparing the last offset delta from the record batch with
// actual offset delta of the last record. The difference (skew) should be
// applied to all records in the batch.

I'm not sure this is true? Isn't the calculated skew only guaranteed to be true for the final record in the batch? Depending on where the records got compacted out the skew could be arbitrarily distributed in the batch.
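
To make the concern concrete, here is a minimal sketch of the uniform-skew correction described in the code comment, run against a made-up batch. It is not Sarama code: the `record` type, the `applyUniformSkew` helper, and the offsets are all hypothetical, and the scenario (surviving records reporting dense deltas after compaction) is an assumption for illustration, not a confirmed broker behavior.

```go
package main

import "fmt"

// record is a hypothetical stand-in for a decoded record; it is not
// Sarama's Record type.
type record struct {
	OffsetDelta int64 // delta as read from the wire
}

// applyUniformSkew follows the approach from the code comment under review:
// derive the skew from the last record and add it to every record.
func applyUniformSkew(firstOffset, lastOffsetDelta int64, recs []record) []int64 {
	if len(recs) == 0 {
		return nil
	}
	skew := lastOffsetDelta - recs[len(recs)-1].OffsetDelta
	offsets := make([]int64, len(recs))
	for i, r := range recs {
		offsets[i] = firstOffset + r.OffsetDelta + skew
	}
	return offsets
}

func main() {
	// Assumed scenario: the batch originally held offsets 100..104, the
	// records at 101 and 102 were compacted out, and the survivors report
	// dense deltas 0, 1, 2 instead of 0, 3, 4.
	trueOffsets := []int64{100, 103, 104}
	reported := []record{{0}, {1}, {2}}

	got := applyUniformSkew(100, 4, reported)
	for i := range got {
		fmt.Printf("record %d: corrected=%d true=%d\n", i, got[i], trueOffsets[i])
	}
}
```

In this scenario the skew is 2, so the last record comes out right (104) but the first is reported as 102 instead of 100. The correction is only guaranteed for the final record; where the gap falls inside the batch cannot be recovered from LastOffsetDelta alone.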

You are right, this is not correct. The problem is most likely with the v0.11 producer. Please take a look at #1040 for more details. I am closing this PR.

@@ -552,12 +562,6 @@ func (child *partitionConsumer) parseRecords(batch *RecordBatch) ([]*ConsumerMes
	if incomplete {
		return nil, ErrIncompleteResponse
	}

	child.offset = batch.FirstOffset + int64(batch.LastOffsetDelta) + 1
	if child.offset <= originalOffset {

Perhaps the condition needs to change, but don't we still need to account for some case where the offset doesn't advance? Or is that handled by the incomplete logic now?

edit: the incomplete logic looks like it might be wrong now anyway, I'm not sure
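
For reference, the arithmetic in the removed lines can be sketched in isolation. This is a standalone illustration rather than the consumer's actual code path: `batchHeader`, `nextOffset`, and `errOffsetNotAdvanced` are hypothetical names, and the guard simply mirrors the `child.offset <= originalOffset` check from the hunk.

```go
package main

import (
	"errors"
	"fmt"
)

// batchHeader is a hypothetical subset of a record batch header.
type batchHeader struct {
	FirstOffset     int64
	LastOffsetDelta int32
}

// errOffsetNotAdvanced is a hypothetical sentinel error for this sketch.
var errOffsetNotAdvanced = errors.New("offset did not advance")

// nextOffset computes the offset to fetch after a batch from its header
// fields alone, and reports an error if that would not move the consumer
// past its current offset.
func nextOffset(current int64, b batchHeader) (int64, error) {
	next := b.FirstOffset + int64(b.LastOffsetDelta) + 1
	if next <= current {
		return current, errOffsetNotAdvanced
	}
	return next, nil
}

func main() {
	b := batchHeader{FirstOffset: 100, LastOffsetDelta: 4}

	next, err := nextOffset(100, b)
	fmt.Println(next, err) // 105 <nil>

	_, err = nextOffset(105, b)
	fmt.Println(err) // offset did not advance
}
```

Because the result depends only on the batch header, it stays the same no matter which records inside the batch were compacted out. Whether the incomplete-response check alone covers the non-advancing case is the open question here.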
