Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

range deletion performance improvements + cleanup #5

Closed
wants to merge 4 commits into from

Conversation

benesch
Copy link

@benesch benesch commented Jul 10, 2018

Upstream PR: facebook#4014. See individual commits for details.


This change is Reviewable

benesch added 4 commits July 10, 2018 13:51
This makes it easier to refer to a range positioning mode outside of the
scope of a RangeDelAggregator without polluting the global scope with
the individual enum constants.
The uncollapsed representation was previously only tested by the
integration tests in db_range_del_test.
The range deletion aggregator supports "uncollapsed" and "collapsed"
tombstone maps. Uncollapsed tombstone maps are naturally represented as
a std::multiset, while collapsed tombstone maps are naturally
represented as a std::map.

The previous implementation used a std::multimap that was general enough
to store either the uncollapsed or the collapsed representation.
Confusingly, its keys and values had different meanings depending on
which representation was being used.

Extract a TombstoneMap interface which has two concrete implementations,
uncollapsed and collapsed. This allows the implementations to use their
natural representation, and keeps the code for manipulating that
representation within one class.

In the process, simplify and comment the CollapsedTombstoneMap::AddTombstone
method.

This refactor exposed a bug in the ObsoleteTombstoneCleanup test, which was
installing tombstones like [dr1, dr1) which cover no keys due to the
exclusivity of the upper bound.
Implement a merging iterator over stripes in a RangeDelAggregator. This
resolves a TODO about keeping table-modifying code out of the
RangeDelAggregator. It also paves the way to splitting output files
containing tombstones during compactions when they span too many keys in
the level below the compaction output level.
@benesch benesch requested review from bdarnell and petermattis July 10, 2018 18:01
Copy link

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Did anything significant change from the upstream PR (which I already reviewed)?

Reviewed 6 of 6 files at r1, 1 of 1 files at r2.
Reviewable status: 2 of 10 files reviewed, all discussions resolved (waiting on @petermattis and @bdarnell)

Copy link
Author

@benesch benesch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope. I replaced an it++ with a ++it. Everything else was just resolving rebase conflicts.

Reviewable status: 2 of 10 files reviewed, all discussions resolved (waiting on @petermattis and @bdarnell)

Copy link

@bdarnell bdarnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR title mentions performance improvements, but it's not clear from the individual commit messages where the performance improvement could come from (it's laying the groundwork for truncating the tombstones, but not doing it yet). Is it just that the new split implementation is more efficient than the hybrid multimap version?

Reviewed 6 of 6 files at r1, 1 of 1 files at r2, 4 of 4 files at r3, 6 of 6 files at r4.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved

@benesch
Copy link
Author

benesch commented Jul 10, 2018

Yeah, the cleanup in "Split collapsed and uncollapsed tombstone map representations" (bd83fd0) refactors the collapsed AddTombstone function in a way that resolves a performance bug. The bug is more precisely explained in facebook#3992.

I think I originally had a commit that more precisely described what was going on, but it got amended into bd83fd0 and is never coming back. I can make bd83fd0's commit message better, though.

@benesch
Copy link
Author

benesch commented Jul 12, 2018

This just went through a round of upstream review, so I'm going to hold off on merging this for another few days to see if it lands there. 🤞

@petermattis
Copy link

@benesch Ack. Do you want me to take care of the merge conflicts here?

@benesch
Copy link
Author

benesch commented Jul 12, 2018 via email

@petermattis
Copy link

Superseded by #8.

@petermattis petermattis deleted the 5.13.4.rdperf branch July 13, 2018 18:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants