range deletion performance improvements + cleanup #5
This makes it easier to refer to a range positioning mode outside of the scope of a RangeDelAggregator without polluting the global scope with the individual enum constants.
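As a rough illustration of the scoping point in that commit message, a scoped enumeration keeps its constants out of the enclosing namespace while still letting callers name the type without going through the aggregator class. The enum name, its constants, and the `ModeName` helper below are assumptions made for this sketch and are not taken from the patch.

```cpp
#include <string>

// Hypothetical positioning modes; the real constants in the patch may differ.
// Because this is an `enum class`, the constants must be qualified
// (PositioningMode::kBinarySearch) and do not leak into the global scope.
enum class PositioningMode {
  kFullScan,
  kForwardTraversal,
  kBackwardTraversal,
  kBinarySearch,
};

// Code outside the aggregator can accept the mode directly.
inline std::string ModeName(PositioningMode mode) {
  switch (mode) {
    case PositioningMode::kFullScan:          return "full-scan";
    case PositioningMode::kForwardTraversal:  return "forward-traversal";
    case PositioningMode::kBackwardTraversal: return "backward-traversal";
    case PositioningMode::kBinarySearch:      return "binary-search";
  }
  return "unknown";
}
```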
The uncollapsed representation was previously only tested by the integration tests in db_range_del_test.
The range deletion aggregator supports "uncollapsed" and "collapsed" tombstone maps. Uncollapsed tombstone maps are naturally represented as a std::multiset, while collapsed tombstone maps are naturally represented as a std::map. The previous implementation used a std::multimap that was general enough to store either the uncollapsed or the collapsed representation. Confusingly, its keys and values had different meanings depending on which representation was being used.

Extract a TombstoneMap interface which has two concrete implementations, uncollapsed and collapsed. This allows the implementations to use their natural representation, and keeps the code for manipulating that representation within one class. In the process, simplify and comment the CollapsedTombstoneMap::AddTombstone method.

This refactor exposed a bug in the ObsoleteTombstoneCleanup test, which was installing tombstones like [dr1, dr1) which cover no keys due to the exclusivity of the upper bound.
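A minimal sketch of the split described above, assuming integer keys and a single query method. Only the names `TombstoneMap`, `CollapsedTombstoneMap`, and `AddTombstone` come from the commit message. `UncollapsedTombstoneMap`, `Tombstone`, `MaxCoveringSeq`, and `SeqAt` are invented for illustration, and the collapsing logic here is a simplified stand-in, not the patch's implementation (for instance, it does not merge adjacent entries with equal sequence numbers).

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <map>
#include <set>

// Hypothetical tombstone: deletes keys in [start, end) at sequence number seq.
struct Tombstone {
  int start;
  int end;  // exclusive upper bound
  uint64_t seq;
};

class TombstoneMap {
 public:
  virtual ~TombstoneMap() = default;
  virtual void AddTombstone(const Tombstone& t) = 0;
  // Largest sequence number of any tombstone covering key, or 0 if none.
  virtual uint64_t MaxCoveringSeq(int key) const = 0;
};

// Uncollapsed: keep every tombstone as-is, ordered by start key.
class UncollapsedTombstoneMap : public TombstoneMap {
 public:
  void AddTombstone(const Tombstone& t) override { rep_.insert(t); }
  uint64_t MaxCoveringSeq(int key) const override {
    uint64_t best = 0;
    for (const Tombstone& t : rep_) {  // full scan, for simplicity
      if (t.start <= key && key < t.end) best = std::max(best, t.seq);
    }
    return best;
  }

 private:
  struct ByStart {
    bool operator()(const Tombstone& a, const Tombstone& b) const {
      return a.start < b.start;
    }
  };
  std::multiset<Tombstone, ByStart> rep_;
};

// Collapsed: non-overlapping intervals. An entry (k, s) means keys from k up
// to the next entry's key are covered at sequence number s (0 = not covered).
class CollapsedTombstoneMap : public TombstoneMap {
 public:
  void AddTombstone(const Tombstone& t) override {
    if (t.start >= t.end) return;  // e.g. [dr1, dr1) covers no keys
    // Add boundaries at both ends without changing what is currently covered.
    rep_.emplace(t.end, SeqAt(t.end));
    rep_.emplace(t.start, SeqAt(t.start));
    // Raise the sequence number of every interval inside [start, end).
    auto last = rep_.find(t.end);
    for (auto it = rep_.find(t.start); it != last; ++it) {
      it->second = std::max(it->second, t.seq);
    }
  }
  uint64_t MaxCoveringSeq(int key) const override { return SeqAt(key); }

 private:
  uint64_t SeqAt(int key) const {
    auto it = rep_.upper_bound(key);
    return it == rep_.begin() ? uint64_t{0} : std::prev(it)->second;
  }
  std::map<int, uint64_t> rep_;
};
```

Both classes answer the same query, but the collapsed form does it with one ordered lookup, while the uncollapsed form keeps the original tombstones intact. A tombstone whose start equals its end is a no-op in either representation, which is the property the ObsoleteTombstoneCleanup test had overlooked.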
Implement a merging iterator over stripes in a RangeDelAggregator. This resolves a TODO about keeping table-modifying code out of the RangeDelAggregator. It also paves the way to splitting output files containing tombstones during compactions when they span too many keys in the level below the compaction output level.
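The general idea of a merging iterator over stripes can be sketched as a k-way merge driven by a min-heap. This sketch assumes each stripe can be modeled as a vector of tombstones already sorted by start key. The types `RangeTombstone` and `Stripe`, the class `MergingStripeIterator`, and the helper `MergedStarts` are hypothetical and do not reflect the actual iterator added in the patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

// Hypothetical tombstone covering [start, end) at sequence number seq.
struct RangeTombstone {
  int start;
  int end;
  uint64_t seq;
};

// Assumption: a stripe is modeled as tombstones sorted by start key.
using Stripe = std::vector<RangeTombstone>;

// Yields tombstones from all stripes in ascending start-key order.
// The stripes must outlive the iterator.
class MergingStripeIterator {
  struct Cursor {
    size_t stripe;  // which stripe this cursor reads from
    size_t pos;     // index of the current tombstone within that stripe
  };
  // Orders the heap so the cursor with the smallest start key is on top.
  struct ByStartDescending {
    const std::vector<Stripe>* stripes;
    bool operator()(const Cursor& a, const Cursor& b) const {
      return (*stripes)[a.stripe][a.pos].start >
             (*stripes)[b.stripe][b.pos].start;
    }
  };

 public:
  explicit MergingStripeIterator(const std::vector<Stripe>* stripes)
      : stripes_(stripes), heap_(ByStartDescending{stripes}) {
    for (size_t i = 0; i < stripes_->size(); ++i) {
      if (!(*stripes_)[i].empty()) heap_.push(Cursor{i, 0});
    }
  }

  bool Valid() const { return !heap_.empty(); }

  const RangeTombstone& Current() const {
    const Cursor& c = heap_.top();
    return (*stripes_)[c.stripe][c.pos];
  }

  void Next() {
    Cursor c = heap_.top();
    heap_.pop();
    if (++c.pos < (*stripes_)[c.stripe].size()) heap_.push(c);
  }

 private:
  const std::vector<Stripe>* stripes_;
  std::priority_queue<Cursor, std::vector<Cursor>, ByStartDescending> heap_;
};

// Convenience helper: collect the start keys in merged order.
inline std::vector<int> MergedStarts(const std::vector<Stripe>& stripes) {
  std::vector<int> out;
  for (MergingStripeIterator it(&stripes); it.Valid(); it.Next()) {
    out.push_back(it.Current().start);
  }
  return out;
}
```

With an iterator of this shape, the code that writes tombstones into an output table can consume them in key order from outside the aggregator, which is the separation the commit message describes.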
Did anything significant change from the upstream PR (which I already reviewed)?
Reviewed 6 of 6 files at r1, 1 of 1 files at r2.
Reviewable status: 2 of 10 files reviewed, all discussions resolved (waiting on @petermattis and @bdarnell)
Nope. I replaced an `it++` with a `++it`. Everything else was just resolving rebase conflicts.
Reviewable status: 2 of 10 files reviewed, all discussions resolved (waiting on @petermattis and @bdarnell)
The PR title mentions performance improvements, but it's not clear from the individual commit messages where the performance improvement could come from (it's laying the groundwork for truncating the tombstones, but not doing it yet). Is it just that the new split implementation is more efficient than the hybrid multimap version?
Reviewed 6 of 6 files at r1, 1 of 1 files at r2, 4 of 4 files at r3, 6 of 6 files at r4.
Reviewable status: complete! all files reviewed, all discussions resolved
Yeah, the cleanup in "Split collapsed and uncollapsed tombstone map representations" (bd83fd0) refactors the collapsed AddTombstone function in a way that resolves a performance bug. The bug is more precisely explained in facebook#3992. I think I originally had a commit that more precisely described what was going on, but it got amended into bd83fd0 and is never coming back. I can make bd83fd0's commit message better, though.
This just went through a round of upstream review, so I'm going to hold off on merging this for another few days to see if it lands there. 🤞
@benesch Ack. Do you want me to take care of the merge conflicts here?
I'm happy to take care of them, but if you're itching to get this in, by all means go for it! The upstream patch has been updated to address some of Andrew's feedback, so it may be easier to redo the rebase from scratch.
Superseded by #8.
Upstream PR: facebook#4014. See individual commits for details.