Fixed removing in flight write operations when processing stable offset updates #8409
Signed-off-by: Michal Maslanka <[email protected]>
- _inflight.erase(_inflight.begin(), it);
+ _inflight.erase(_inflight.begin(), std::next(it));
Fixed the problem by also erasing the entry for the in-flight update
currently being processed.
this looks correct, but despite the commit message claiming this is a problem, it doesn't seem problematic (e.g. it would have conservative semantics, presumably). perhaps stating why/what problem is fixed would be useful for others looking at git-blame in the future.
Of course, I will include a detailed description, but briefly: the problem occurs when truncation happens. The hard flush from the segment appender should flush all the in-flight writes, but it does not, as there is always one entry that is not deleted. Because of truncation, this one in-flight entry has a physical offset larger than all subsequent ones, but its logical offset is smaller. This allows the reader to read past the stable offset.
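The lingering entry described above can be sketched with a small model. This is only an illustration under stated assumptions: `inflight_tracker`, `start_write`, `finish_write`, and `replay` are hypothetical names, `std::map` stands in for `absl::btree_map`, and keying the map by physical end position with the logical offset as the value is an assumption about the layout, not something this PR confirms. The numbers are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>

// Hypothetical model of in-flight write tracking.
// Assumption: key = physical end position of the write in the file,
// value = last logical offset covered by that write.
struct inflight_tracker {
    std::map<std::size_t, std::int64_t> inflight;
    bool erase_current = true; // true models the fixed erase, false the old one

    void start_write(std::size_t physical_end, std::int64_t logical) {
        inflight[physical_end] = logical;
    }

    void finish_write(std::size_t physical_end) {
        auto it = inflight.find(physical_end);
        if (it == inflight.end()) {
            return;
        }
        // Old code erased [begin, it); the fix erases [begin, next(it)).
        auto last = erase_current ? std::next(it) : it;
        inflight.erase(inflight.begin(), last);
    }
};

// Replays: one write completes, the file is truncated (modelled only by the
// smaller physical positions that follow), then two more writes complete.
inline std::map<std::size_t, std::int64_t> replay(bool erase_current) {
    inflight_tracker t;
    t.erase_current = erase_current;
    t.start_write(4096, 100);
    t.finish_write(4096);
    t.start_write(1024, 101); // after truncation: smaller physical position
    t.finish_write(1024);
    t.start_write(2048, 102);
    t.finish_write(2048);
    return t.inflight;
}
```

In this model, `replay(false)` still holds the entry keyed by 4096 after every write has completed, because later completions only erase keys smaller than their own. `replay(true)` returns an empty map. Whether such a leftover entry actually lets a reader go past the stable offset depends on code not shown here.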
This allows the reader to read past the stable offset.
this last part is unclear to me. if we are not removing an in-flight request which otherwise could be removed, then wouldn't the effect of that be that a reader couldn't read as far as would otherwise be safe?
I think the problem is a little more complex and the issue still exists; however, the EOF returned from the reader input stream is handled differently.
When a batch exceeds the size of a single in-flight write, there are multiple in-flight operations for the same offset, but the first one already updates the stable offset <-- this is incorrect.
This PR doesn't include the actual fix, as it does not postpone updating the stable offset until the last in-flight operation pending for that offset finishes. I need to add this.
Removing the in-flight operation changes how the input stream is created: instead of hitting the file EOF, we hit the stream's internal logical limit, which causes a short read.
very interesting, thanks. gonna look at the reproducer
So I looked at this once again: the behavior looks correct. The EOF errors we saw are normal behavior when the reader reaches the end of the file.
When the batch parser returns an error indicating that a batch read was unsuccessful, the probe should be updated.
Signed-off-by: Michal Maslanka <[email protected]>
The `absl::btree_map::erase` function erases all elements in the range [start, end), where end is exclusive. In the current implementation, the entry referenced by the iterator used to update the stable offset was not removed, because only the elements preceding it were erased. Fixed the problem by also erasing the entry for the in-flight update currently being processed.
Signed-off-by: Michal Maslanka <[email protected]>
…ate/write Signed-off-by: Michal Maslanka <[email protected]>
8d6b539 to 0a2815d
The `absl::btree_map::erase` function erases all elements in the range [start, end), where end is exclusive. In the current implementation, the entry referenced by the iterator used to update the stable offset was not removed, because only the elements preceding it were erased.
Fixed the problem by also erasing the entry for the in-flight update currently being processed.
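The half-open range semantics can be checked in isolation. The sketch below uses `std::map` as a stand-in, since its iterator-range `erase` is also end-exclusive; `inflight_map`, `erase_before`, and `erase_through` are hypothetical names for illustration, and the key and value types are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>

// Hypothetical stand-in for the in-flight map; key and value types are assumptions.
using inflight_map = std::map<std::size_t, long>;

// Old behaviour: erase [begin, it). The entry at `it` itself survives.
// Takes the map by value and returns how many entries remain.
inline std::size_t erase_before(inflight_map m, std::size_t key) {
    auto it = m.find(key);
    if (it == m.end()) {
        return m.size();
    }
    m.erase(m.begin(), it);
    return m.size();
}

// Fixed behaviour: erase [begin, next(it)), which also removes the entry at `it`.
inline std::size_t erase_through(inflight_map m, std::size_t key) {
    auto it = m.find(key);
    if (it == m.end()) {
        return m.size();
    }
    m.erase(m.begin(), std::next(it));
    return m.size();
}
```

For a map with keys 10, 20, and 30, `erase_before(m, 20)` leaves two entries (20 and 30), while `erase_through(m, 20)` leaves only the entry for 30.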
Fixes: #8091
Backports Required
UX Changes
Release Notes
Bug Fixes