-
Notifications
You must be signed in to change notification settings - Fork 471
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
db: support range keys in delete-only compactions #1750
Conversation
d2f8d24
to
5cce5f3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking good, thanks for doing this
Reviewed 1 of 6 files at r1, 8 of 8 files at r3, 4 of 5 files at r4, all commit messages.
Reviewable status: 8 of 9 files reviewed, 6 unresolved discussions (waiting on @itsbilal, @jbowens, and @nicktrav)
table_stats.go
line 313 at r3 (raw file):
defer iter.Close() var compactionHints []deleteCompactionHint // We iterate over the defragmented range tombstones, which ensures
nit: s/defragmented range tombstones/defragmented range tombstones and range key deletions/
table_stats.go
line 338 at r3 (raw file):
return nil, err } stats.RangeDeletionsBytesEstimate += estimate
here, and above in the level == numLevels - 1
conditional, I think we only want to populate RangeDeletionsBytesEstimate
if s
contains any keys of kind RANGEDEL
.
RangeDeletionsBytesEstimate
is the estimate of bytes within [start, end)
in point key blocks, or if tables are wholly contained, the entirety of the sstable. We want to avoid totaling this if there's only a RANGEKEYDEL
over the span.
We should also leave a comment that if s
only contains RANGEDEL
and the span wholly contains a table containing range keys, this estimate will be off, mistakenly including the size of the range key block. That's expected to be okay, because range keys are expected to be rare and not contribute significantly to sstable size.
table_stats.go
line 584 at r3 (raw file):
// corresponding to the largest and smallest sequence numbers encountered across // the range deletes and range keys deletes that comprised the merged spans. type tableRangedDeletionIter struct {
nit: do we need this type now? or could we return the merging iter as an opaque FragmentIterator
?
table_stats.go
line 589 at r3 (raw file):
// newTableRangeDeletionIter returns a new tableRangedDeletionIter. func newTableRangeDeletionIter(
nit: can we call this something that makes it a little clear that it's range deletions and range key deletions? newTableRangedDeletionIter
is better, maybe newCombinedDeletionKeyspanIter
or something is even better? The term 'range deletion' is just so ubiquitous referring specifically to RANGEDEL
that anything close to it is likely to be confused.
table_stats.go
line 674 at r3 (raw file):
// The separate iters for the range dels and range keys are wrapped in a // merging iter to join the keyspaces into a single keyspace. mIter := &keyspan.MergingIter{}
nit: now that we have MergingIter.AddLevel
, we can avoid the allocations of appending to the []FragmentIterator
slice by Init
-ing the MergingIter
higher up, and directly calling AddLevel
when we find a non-nil rangedel or range key iterator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 5 of 9 files reviewed, 1 unresolved discussion (waiting on @itsbilal and @jbowens)
table_stats.go
line 313 at r3 (raw file):
Previously, jbowens (Jackson Owens) wrote…
nit: s/defragmented range tombstones/defragmented range tombstones and range key deletions/
Done.
table_stats.go
line 338 at r3 (raw file):
Previously, jbowens (Jackson Owens) wrote…
here, and above in the
level == numLevels - 1
conditional, I think we only want to populateRangeDeletionsBytesEstimate
ifs
contains any keys of kindRANGEDEL
.
RangeDeletionsBytesEstimate
is the estimate of bytes within[start, end)
in point key blocks, or if tables are wholly contained, the entirety of the sstable. We want to avoid totaling this if there's only aRANGEKEYDEL
over the span.We should also leave a comment that if
s
only containsRANGEDEL
and the span wholly contains a table containing range keys, this estimate will be off, mistakenly including the size of the range key block. That's expected to be okay, because range keys are expected to be rare and not contribute significantly to sstable size.
Turns out there was a bug here that wasn't calculating stats for tables with just range keys. Fixed and added an additional test case.
table_stats.go
line 584 at r3 (raw file):
Previously, jbowens (Jackson Owens) wrote…
nit: do we need this type now? or could we return the merging iter as an opaque
FragmentIterator
?
Good call. The new
function now returns the interface.
table_stats.go
line 589 at r3 (raw file):
Previously, jbowens (Jackson Owens) wrote…
nit: can we call this something that makes it a little clear that it's range deletions and range key deletions?
newTableRangedDeletionIter
is better, maybenewCombinedDeletionKeyspanIter
or something is even better? The term 'range deletion' is just so ubiquitous referring specifically toRANGEDEL
that anything close to it is likely to be confused.
I think I'd just typo'd originally - intention was for newTableRangedDeletionIter
. That highlights the problem, so let's go with newCombinedDeletionKeyspanIter
.
table_stats.go
line 674 at r3 (raw file):
Previously, jbowens (Jackson Owens) wrote…
nit: now that we have
MergingIter.AddLevel
, we can avoid the allocations of appending to the[]FragmentIterator
slice byInit
-ing theMergingIter
higher up, and directly callingAddLevel
when we find a non-nil rangedel or range key iterator.
Nice! Updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 5 of 9 files reviewed, 4 unresolved discussions (waiting on @jbowens and @nicktrav)
table_stats.go
line 313 at r3 (raw file):
defer iter.Close() var compactionHints []deleteCompactionHint // We iterate over the defragmented range tombstones, which ensures
This would be range key deletions too, right? Not just range tombstones?
testdata/table_stats
line 264 at r3 (raw file):
# Ingest a table with range keys. # TODO(travers): Once range keys in batches are flushed to tables, this testcase
This should be the case now, right?
testdata/table_stats_deletion_iter
line 57 at r3 (raw file):
build a-z:{(#0,RANGEKEYDEL) (#0,RANGEKEYSET,@1,foo) (#0,RANGEKEYUNSET,@2)}
Maybe we should have a case where the RANGEKEYDEL has a lower sequence number than the others to confirm this is the Filter working and not the deletion shadowing the others?
4c656db
to
45aeb2c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rebased on master to pick up the recent compaction changes, and pushed a few more commits addressing feedback. I'm also doing a burn of the metamorphic tests.
I'll wait for one more LGTM before squashing, landing, and putting this one to bed ... finally.
Reviewable status: 1 of 9 files reviewed, 4 unresolved discussions (waiting on @itsbilal and @jbowens)
table_stats.go
line 313 at r3 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
This would be range key deletions too, right? Not just range tombstones?
Updated.
testdata/table_stats
line 264 at r3 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
This should be the case now, right?
Yeah - though I have a bunch of other test cases like this (that use the ingestion hack) that I'd like to clean up. I might address these in a follow-up, to avoid the extra noise.
testdata/table_stats_deletion_iter
line 57 at r3 (raw file):
Previously, itsbilal (Bilal Akhtar) wrote…
Maybe we should have a case where the RANGEKEYDEL has a lower sequence number than the others to confirm this is the Filter working and not the deletion shadowing the others?
Good call. Added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 7 files at r10, 2 of 5 files at r11, 1 of 2 files at r14, 2 of 2 files at r16, 1 of 1 files at r17, 1 of 1 files at r18, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @itsbilal and @nicktrav)
compaction.go
line 1849 at r18 (raw file):
} // compactionHintFromKeys returns a deleteCompactionHintType give a slice of
nit: s/give/given/
table_stats.go
line 477 at r18 (raw file):
// the RangeDeletionsBytesEstimate statistic and can't populate table stats // from just the properties. The table stats collector goroutine will populate // the stats.
nit: mind adjusting this comment to explicitly call out the need for the table stats collector to trigger delete-only compactions? It seems like there's two separate reasons we can't compute stats here:
- If a table has range deletions we can't calculate
RangeDeletionsBytesEstimate
without scanning the LSM, and we need the table stats collector to do that. - Additionally, if a table has range deletions or range key deletions, we need the table stats collector to run on the file so that it can look for opportunities for delete-only compactions.
Add a new `keyspan.FragmentIterator` implementation that iterates over the range deletion and range key blocks of a single table. The iterator returns merged spans from the table corresponding to "ranged deletions". A "ranged deletion" span contains keys that fall into one of three cases: - the span contains only range deletions, - the span contains only range key deletions, or - the span contains a mixture of both range deletions and range key deletions. Under the hood, the iterator first defragments the range deletion and range key deletion blocks separately, using the table's respective `keyspan.FragmentIterator`s. This defragmentation process also filters out non-deletion key types (i.e. range key SET and UNSETS). The two separate iterators are then combined into a `keyspan.MergingIter`. This new `tableRangedDeletionIter` will be used to construct spans necessary for calculating deletion hints on a table that takes range keys into account. Previously, only range deletion spans were considered. For table that do not contain range key deletes, the iterator returns the same set of spans that the existing logic would have returned.
Currently, when constructing delete-only compactions hints, only range delete spans are considered. With the introduction of range keys, the current hint construction logic is incorrect. Specifically, a hint constructed from a range delete may completely cover a table containing range keys. Currently, this table would be marked as eligible for deletion. However, the table should only be deleted if the range keys are also deleted. Make use of the `tableRangedDeletionIter` for construction of delete-only compaction hints. A new `deleteCompactionHintType` type is used to distinguish between hints generated by a range delete (which may not delete tables containing range keys), hints generated from a range key delete (which may not delete tables containing point keys), and hints generated from a span with both a range delete and range key delete (which may delete any covering table).
45aeb2c
to
80efae1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Squashed all the fixup commits. I'll address the test cases in a follow-up.
TFTRs!
Reviewable status: 4 of 9 files reviewed, 3 unresolved discussions (waiting on @itsbilal and @jbowens)
compaction.go
line 1849 at r18 (raw file):
Previously, jbowens (Jackson Owens) wrote…
nit: s/give/given/
Done.
table_stats.go
line 477 at r18 (raw file):
Previously, jbowens (Jackson Owens) wrote…
nit: mind adjusting this comment to explicitly call out the need for the table stats collector to trigger delete-only compactions? It seems like there's two separate reasons we can't compute stats here:
- If a table has range deletions we can't calculate
RangeDeletionsBytesEstimate
without scanning the LSM, and we need the table stats collector to do that.- Additionally, if a table has range deletions or range key deletions, we need the table stats collector to run on the file so that it can look for opportunities for delete-only compactions.
Done!
NOTE: This PR depends on #1759.Two commits in this patch - the first defines a new composite iterator, the second puts it to use.
This is a re-do of #1656, rebased on latest master to pick up various interfaces changes, and also squash all of the commits from that PR. All feedback from the original PR has been addressed.
TODOs:
db: add tableRangedDeletionIter for iteration of ranged deletion spans
Add a new
keyspan.FragmentIterator
implementation that iterates overthe range deletion and range key blocks of a single table. The iterator
returns merged spans from the table corresponding to "ranged deletions".
A "ranged deletion" span contains keys that fall into one of three
cases:
deletions.
Under the hood, the iterator first defragments the range deletion and
range key deletion blocks separately, using the table's respective
keyspan.FragmentIterator
s. This defragmentation process also filtersout non-deletion key types (i.e. range key SET and UNSETS). The two
separate iterators are then combined into a
keyspan.MergingIter
.This new
tableRangedDeletionIter
will be used to construct spansnecessary for calculating deletion hints on a table that takes range
keys into account. Previously, only range deletion spans were
considered.
For table that do not contain range key deletes, the iterator returns
the same set of spans that the existing logic would have returned.
db: consider range keys in delete-only compactions
Currently, when constructing delete-only compactions hints, only range
delete spans are considered. With the introduction of range keys, the
current hint construction logic is incorrect. Specifically, a hint
constructed from a range delete may completely cover a table containing
range keys. Currently, this table would be marked as eligible for
deletion. However, the table should only be deleted if the range keys
are also deleted.
Make use of the
tableRangedDeletionIter
for construction ofdelete-only compaction hints. A new
deleteCompactionHintType
type isused to distinguish between hints generated by a range delete (which may
not delete tables containing range keys), hints generated from a range
key delete (which may not delete tables containing point keys), and
hints generated from a span with both a range delete and range key
delete (which may delete any covering table).