-
Notifications
You must be signed in to change notification settings - Fork 475
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
*: track smallest / largest keys separately in manifest #1521
*: track smallest / largest keys separately in manifest #1521
Conversation
One concern I have with this approach is that the caller needs to remember to set all three types of bounds: points, ranges and combined bounds in the An alternative approach would introduce functions for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good. Left a few early comments
Do we need to be tracking the smallest / largest sequence numbers for point keys, range keys, and across both?
I think we can get away with the just the one {smallest,largest} pair for the entire table.
ingestUpdateSeqNum
- do we need to update sequence numbers for the separate point and range key bounds?
Yeah, we do. When the ingested table is read, both point and range keys will 'adopt' the sequence number decided at ingestion.
Are there sites other than compactions and ingest where we need to set the bounds?
I can't think of anything else. @itsbilal?
Reviewed 2 of 10 files at r1, all commit messages.
Reviewable status: 2 of 10 files reviewed, 7 unresolved discussions (waiting on @itsbilal and @nicktrav)
ingest.go, line 375 at r1 (raw file):
func ingestUpdateSeqNum(opts *Options, dirname string, seqNum uint64, meta []*fileMetadata) error { // TODO(travesrs): Do we need to update the seqnums on the point and range // smallest / largest keys? Or just the combined bounds for each table.
I think we'll need to update them all. Both point and range keys will adopt the sequence number that we ingest the sstable at.
ingest.go, line 382 at r1 (raw file):
// unintentionally extends the bounds of the table. if m.Largest.Trailer != InternalKeyRangeDeleteSentinel && m.Largest.Trailer != base.InternalKeyBoundaryRangeKey {
This can use m.Largest.IsExclusiveSentinel()
now that we have it, which will handle sentinels of RANGEKEYUNSET
and RANGEKEYDEL
kinds, once we have them.
internal/manifest/version.go, line 100 at r1 (raw file):
CreationTime int64 // SmallestPointKey and LargestPointKey are the inclusive bounds for the // internal point keys stored in the table.
Somehow we should clarify in this comment that these bounds are bounds of keys that affect the point keyspace: specifically, they include the bounds of RANGEDEL
s which alter point keys, not range keys.
internal/manifest/version.go, line 110 at r1 (raw file):
// in the table, across both point and range keys. These values can be // reconstructed from the respective point and range key fields. // TODO(travers): Should these fields be derived via functions?
They're so frequently accessed, I think we should calculate them once when the FileMetadata
is constructed. We should definitely add invariant checks that they are accurate though, by re-deriving the bounds from the {Smallest,Largest}{Point,Range}Key
fields.t
internal/manifest/version.go, line 116 at r1 (raw file):
// range keys. These values can be reconstructed from the respective point and // range key fields. // TODO(travers): Do we need separate fields for point / range key seqnums?
Good question. I don't think so. We primarily need these fields for providing an ordering of files in L0, in which case we care about the calculation across both point and range keys.
internal/manifest/version.go, line 117 at r1 (raw file):
// range key fields. // TODO(travers): Do we need separate fields for point / range key seqnums? // TODO(travers): Should these fields be derived via functions?
The seqnum fields can't be derived, because they require scanning all the contents of the table.
sstable/writer.go, line 100 at r1 (raw file):
return m.LargestRangeKey } if m.LargestRangeKey.UserKey == nil {
I'm not sure how we should represent an unset {smallest,range}{point,range} field. I think the nil UserKey
makes sense, but what about the trailer?
Maybe InternalKeyKindInvalid
kind with a 0
sequence number?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With functions for the derived Smallest
/ Largest
values looking like a no-go, I think this PR is close to its final form. I left a comment about making instantiation of FileMetadata
less brittle. It's likely we can clean that up in a follow-up.
I still want to add some more testing, so I'll keep this WIP for now. I'll ping again when it's good for a more official review.
Reviewable status: 2 of 10 files reviewed, 7 unresolved discussions (waiting on @itsbilal and @jbowens)
ingest.go, line 375 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
I think we'll need to update them all. Both point and range keys will adopt the sequence number that we ingest the sstable at.
Done. Clarified some commentary here.
ingest.go, line 382 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
This can use
m.Largest.IsExclusiveSentinel()
now that we have it, which will handle sentinels ofRANGEKEYUNSET
andRANGEKEYDEL
kinds, once we have them.
Nice. Done.
internal/manifest/version.go, line 100 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
Somehow we should clarify in this comment that these bounds are bounds of keys that affect the point keyspace: specifically, they include the bounds of
RANGEDEL
s which alter point keys, not range keys.
Good call. Done.
internal/manifest/version.go, line 110 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
They're so frequently accessed, I think we should calculate them once when the
FileMetadata
is constructed. We should definitely add invariant checks that they are accurate though, by re-deriving the bounds from the{Smallest,Largest}{Point,Range}Key
fields.t
Yeah, that's what I thought. There's a Validate
function a few lines down we can use.
internal/manifest/version.go, line 116 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
Good question. I don't think so. We primarily need these fields for providing an ordering of files in L0, in which case we care about the calculation across both point and range keys.
Cool. Dropped.
internal/manifest/version.go, line 117 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
The seqnum fields can't be derived, because they require scanning all the contents of the table.
👍
sstable/writer.go, line 100 at r1 (raw file):
Previously, jbowens (Jackson Owens) wrote…
I'm not sure how we should represent an unset {smallest,range}{point,range} field. I think the nil
UserKey
makes sense, but what about the trailer?Maybe
InternalKeyKindInvalid
kind with a0
sequence number?
This is one of the issues I have with how the code is structured currently. As the FileMetadata.{Smallest,Largest}
fields are exported, anyone can set them when constructing a new FileMetadata
. If you don't set any of these files, they have zero values, or nil
slice for the UserKey. There's a risk that new calls sites forget to set one or more of the pairs. The Validate
function would probably catch this, but I'm not 100% sure if that gives sufficient coverage.
I like this idea of having all of the Smallest / Largest fields initialized to this "unset" key (nil user key, InternalKeyKindInvalid
and seqnum zero), though it would require a rework to ensure that any time one creates a FileMetadata
, those fields are set to this non-zero "unset" key. Something like a New*
or Init
function for constructing FileMetadata
s.
That said, is it even valid to have a key with UserKey == nil
in the DB? In which case, isn't the current check sufficient for determining if that particular Smallest / Largest field is unset? It's brittle though, so I'm all for working out a better way, perhaps by way of a TODO && ticket.
b03de7b
to
e352d8e
Compare
Thanks for working on this - in-memory tracking of sst range key bounds should be pretty helpful in unblocking the levelIter, even though it won't survive a node restart just yet. And yes, I agree that a single sequence number largest/smallest pair for point and range keys should suffice.
That should be it. |
e352d8e
to
e0cc4d4
Compare
This should be good for another pass now. I've left a TODO to add the "sentinel" keys we talked about to indicate the absence of a bound. I'll tackle that separately. |
1c84b31
to
ba0ac8c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
Reviewed 3 of 10 files at r1, 6 of 7 files at r3, all commit messages.
Reviewable status: 10 of 11 files reviewed, 3 unresolved discussions (waiting on @itsbilal and @nicktrav)
internal/manifest/version.go, line 117 at r4 (raw file):
Largest InternalKey // Smallest and largest sequence numbers in the table, across both point and // range keys. These values can be reconstructed from the respective point and
This bit can be dropped since there are no respective point and range key fields, right?
Code quote:
// range keys. These values can be reconstructed from the respective point and
// range key fields.
internal/manifest/version.go, line 517 at r4 (raw file):
Smallest: smallest, Largest: largest, })
I guess at some point we'll need to update the encoding of Version
debug strings, and wherever else we print abbreviated FileMetadata
information to write the range key bounds too.
sstable/writer.go, line 100 at r1 (raw file):
Previously, nicktrav (Nick Travers) wrote…
This is one of the issues I have with how the code is structured currently. As the
FileMetadata.{Smallest,Largest}
fields are exported, anyone can set them when constructing a newFileMetadata
. If you don't set any of these files, they have zero values, ornil
slice for the UserKey. There's a risk that new calls sites forget to set one or more of the pairs. TheValidate
function would probably catch this, but I'm not 100% sure if that gives sufficient coverage.I like this idea of having all of the Smallest / Largest fields initialized to this "unset" key (nil user key,
InternalKeyKindInvalid
and seqnum zero), though it would require a rework to ensure that any time one creates aFileMetadata
, those fields are set to this non-zero "unset" key. Something like aNew*
orInit
function for constructingFileMetadata
s.That said, is it even valid to have a key with
UserKey == nil
in the DB? In which case, isn't the current check sufficient for determining if that particular Smallest / Largest field is unset? It's brittle though, so I'm all for working out a better way, perhaps by way of a TODO && ticket.
Yeah, I think the current check is sufficient. The zero trailer just bothers me because it'll be presented as a DEL#0
, a completely valid key-kind and may confusingly resemble a point tombstone that's lost its user key.
TODO is fine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 10 of 11 files reviewed, 4 unresolved discussions (waiting on @itsbilal and @nicktrav)
-- commits, line 2 at r4:
nit: s/in manifest/in *fileMetadata/
since we're not yet persisting these range keys in the manifest?
sstable/writer.go, line 100 at r1 (raw file): Previously, jbowens (Jackson Owens) wrote…
Done. |
Currently, only point keys are tracked in the manifest. With the addition of range keys, the bounds of an SSTable should be computed by considering the bounds of both the point keys and the range keys, and taking the smallest or largest across both types of key. Add four additional fields, `{Smallest,Largest}{Point,Range}Key`, to `manifest.FileMetadata` to separately track the point and range key bounds. The existing `Smallest` and `Largest` fields are used to track the bounds across _both_ point and range keys. Update the existing calls sites that set the smallest and largest keys to set all three types of bounds: point keys, range keys and combined.
ba0ac8c
to
6ee53e9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 9 of 11 files reviewed, 3 unresolved discussions (waiting on @itsbilal and @jbowens)
Previously, jbowens (Jackson Owens) wrote…
nit: s/in manifest/in *fileMetadata/
since we're not yet persisting these range keys in the manifest?
Done.
internal/manifest/version.go, line 117 at r4 (raw file):
Previously, jbowens (Jackson Owens) wrote…
This bit can be dropped since there are no respective point and range key fields, right?
Holdover from previous revision. Done.
internal/manifest/version.go, line 517 at r4 (raw file):
Previously, jbowens (Jackson Owens) wrote…
I guess at some point we'll need to update the encoding of
Version
debug strings, and wherever else we print abbreviatedFileMetadata
information to write the range key bounds too.
Yeah, need to think about this one a bit more as it will affect a large number of tests. Dropped in a TODO.
TFTR! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 10 files at r1, 3 of 7 files at r3, 1 of 1 files at r5.
Reviewable status: 10 of 11 files reviewed, 6 unresolved discussions
ingest.go, line 205 at r5 (raw file):
case !hasRanges: // Table has only point keys. Use the point key bounds. meta.Smallest = meta.SmallestPointKey.Clone()
why are we cloning again in this block? Aren't these all already stable keys?
ingest.go, line 388 at r5 (raw file):
} for _, m := range meta { // TODO(travers): We currently lack a means of determining whether one of
I'm not sure I follow this (I glanced at the PR comments too).
- "A table without point keys" means a table without any point keys and without any rangedels. Can't happen prior to range keys, yes?
- With range keys we will change the manifest format. Which means we can encode if something is absent and carry that over into FileMetadata. That should also fix this specific code issue, which by itself is not a fundamental lack of information -- we know what is unset when we called
ingestLoad
, and we just don't want to carry that extra baggage through since its nice to be able to work with purely a[]*fileMetadata
.
What am I missing?
internal/manifest/version.go, line 110 at r5 (raw file):
// internal range keys stored in the table. SmallestRangeKey InternalKey LargestRangeKey InternalKey
I want to make sure I understand the following:
- why do we need to track these bounds separately: I assume the answer is that since we have a completely separate iterator tree for both point and range keys and it is wasteful to not have the bounds of the respective kinds in the corresponding level iters. That is, this is a performance optimization.
- I don't see a change to
VersionEdit.{Encode,Decode}
to persist{Smallest,Largest}RangeKey
. Is that a later PR? - Related to the previous, what is the compatibility story with existing manifests? Presumably we want to interpret those as both point key and overall bounds.
- We are not actually going to store all 6 keys, yes? That seems wasteful. I am not clear on this based on either the PR description or the "can be reconstructed ..." comment below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 10 of 11 files reviewed, 6 unresolved discussions
ingest.go, line 205 at r5 (raw file):
Previously, sumeerbhola wrote…
why are we cloning again in this block? Aren't these all already stable keys?
Good catch, thank you. A previous implementation had this being done with the keys yielded by an iterator (similar to the existing loops above), which I understand to have shorter lifetimes (correct me if I'm wrong here).
ingest.go, line 388 at r5 (raw file):
I'll restate the problem as I see it currently:
We now have three pairs of keys in FileMetadata
for smallest / largest (point keys, range keys, and overall bounds). There's a wrinkle in the API in that Smallest/Largest
can be computed given the smallest / largest pairs for points and ranges. However, an "absent" bound currently looks like a real key (DEL#0
with an empty user key), so it shouldn't be used for comparisons. The TODO (which I put in a few places) is to come up with a better way (the TODO is applicable elsewhere).
With range keys we will change the manifest format. Which means we can encode if something is absent and carry that over into FileMetadata
If I'm understanding correctly, in the manifest we'd have both the point and range bounds, but also another field that enumerates what is present (points only, ranges only, or both). On the manifest read path we can "re-hydrate" the 'Smallest' / 'Largest' fields using the point and range pairs and this new field.
I like this approach. A slight variant on it may actually another solve an issue I have currently where we can't do a comparison operation on the keys at manifest read time (as the comparator name is also encoded into the manifest and the true comparator is populated higher up). If this enumeration can just indicate explicitly which fields constitute the overall bounds, we can avoid the comparison.
don't want to carry that extra baggage through since its nice to be able to work with purely a []*fileMetadata
I think I see what you're saying. This comparison work here (and elsewhere) is redundant, as during ingestLoad
we've already performed these comparisons, so it's pointless to do it again. I shall revert this part of the patch.
internal/manifest/version.go, line 110 at r5 (raw file):
why do we need to track these bounds separately: I assume the answer is that since we have a completely separate iterator tree for both point and range keys and it is wasteful to not have the bounds of the respective kinds in the corresponding level iters. That is, this is a performance optimization.
I believe this to be true, in the same way that #1517 adds separate methods to the table cache for range key iters. @jbowens can confirm.
I don't see a change to VersionEdit.{Encode,Decode} to persist {Smallest,Largest}RangeKey. Is that a later PR?
Correct. This PR is to unblock some other work that @itsbilal is starting that depends on having the point and range key bounds tracked separately in FileMetadata
. The version edit requires some more thought, as we're changing the format slightly and will be implemented in a follow-up.
Related to the previous, what is the compatibility story with existing manifests? Presumably we want to interpret those as both point key and overall bounds.
Still working on this part and how it will be persisted. We want to tread carefully given that it would most likely be a backwards incompatible change.
We are not actually going to store all 6 keys, yes? That seems wasteful. I am not clear on this based on either the PR description or the "can be reconstructed ..." comment below.
Correct. One approach would store two pairs: one for point key bounds (re-purposing the existing fields), and a pair for range keys.
The "can be reconstructed" comment refers to being able to compute what the Smallest
and Largest
using just the point and range key bounds.
Let me know if any of that warrants further explanation, either here, or in some commentary.
During ingest loads, the smallest and largest bounds for point and ranges are calculated. The point and range key bounds are cloned from an iterator. When setting the overall bounds for the table, the point and / or range key bounds can be reused for `Smallest` and `Largest`, as the former are stable. Remove the unnecessary key clones. Follow up from cockroachdb#1521.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 10 of 11 files reviewed, 6 unresolved discussions
ingest.go, line 388 at r5 (raw file):
If I'm understanding correctly, in the manifest we'd have both the point and range bounds, but also another field that enumerates what is present (points only, ranges only, or both). On the manifest read path we can "re-hydrate" the 'Smallest' / 'Largest' fields using the point and range pairs and this new field.
Yes, that's what I meant to say.
I shall revert this part of the patch.
I wasn't suggesting reverting this.
I was just saying that because of the preceding code path, and the forthcoming changes to the MANIFEST format, we can explicitly encode in FileMetadata what bounds are present or absent. I don't have an opinion on whether nil should be used for the absence indicator. Then you can have methods on FileMetadata like pointBoundsPresent(), rangeBoundsPresent(), and not have TODOs like this.
During ingest loads, the smallest and largest bounds for point and ranges are calculated. The point and range key bounds are cloned from an iterator. When setting the overall bounds for the table, the point and / or range key bounds can be reused for `Smallest` and `Largest`, as the former are stable. Remove the unnecessary key clones. Follow up from #1521.
Fix a test-only bug introduced in cockroachdb#1521 where the point key bounds were not appropriately set using `{Smallest,Largest}PointKey`. Instead the fields were referenced directly. Update test cases.
Fix a test-only bug introduced in cockroachdb#1521 where the point key bounds were not appropriately set using `{Smallest,Largest}PointKey`. Instead the fields were referenced directly. Update test cases.
Fix a test-only bug introduced in #1521 where the point key bounds were not appropriately set using `{Smallest,Largest}PointKey`. Instead the fields were referenced directly. Update test cases.
Currently, `FileMetadata` exposes three pairs of keys representing bounds for a table - one each for point keys, range keys and the overall bounds for a table. The pair for the overall bounds for the table are a function of the point and range pairs and could technically be derived, but instead these overall bounds are materialized on the `FileMetadata` struct directly for performance reasons (it is faster to dereference the struct fields than making a function call to perform one or more comparisons). With the addition of two pairs of bounds in cockroachdb#1521 for the point and range bounds, there is a risk that a caller forgets to set a pair of bounds, violating the invariant that the overall point and range bounds are internally consistent with the point and range key bounds. To mitigate this risk, introduce methods on `manifest.FileMetadata` that can be used to "extend" the point or range key bounds for a table. These methods will maintain the invariant that the overall bounds for the table are computed directly from the point and range bounds. The methods, `MaybeExtend{Point,Range}KeyBounds`, are intended for "optimistic" use - a table's point, range and / or overall bounds may not necessarily be extended by a call to the respective methods if the existing bounds are smaller or larger than the provided bounds. This confines the comparisons to these method calls, as opposed to having the caller perform the checks to determine when bounds should be updated. As these fields are set once and read multiple times, this is considered a reasonable compromise. Update all existing call sites that directly set the existing `FileMetadata` smallest / largest bound fields to instead use the new methods for extending bounds. The only remaining call sites that set fields directly are limited to the `manifest` package and one call site in `ingest.go` that needs to mutate sequence numbers. These calls sites that store values in the fields directly are explicitly commented to indicate why the fields are being set directly. One alternative considered was to unexport the smallest / largest fields and gate access behind methods. However, as mentioned above, for performance reasons it beneficial to keep these fields exported.
Currently, `FileMetadata` exposes three pairs of keys representing bounds for a table - one each for point keys, range keys and the overall bounds for a table. The pair for the overall bounds for the table are a function of the point and range pairs and could technically be derived, but instead these overall bounds are materialized on the `FileMetadata` struct directly for performance reasons (it is faster to dereference the struct fields than making a function call to perform one or more comparisons). With the addition of two pairs of bounds in cockroachdb#1521 for the point and range bounds, there is a risk that a caller forgets to set a pair of bounds, violating the invariant that the overall point and range bounds are internally consistent with the point and range key bounds. To mitigate this risk, introduce methods on `manifest.FileMetadata` that can be used to "extend" the point or range key bounds for a table. These methods will maintain the invariant that the overall bounds for the table are computed directly from the point and range bounds. The methods, `MaybeExtend{Point,Range}KeyBounds`, are intended for "optimistic" use - a table's point, range and / or overall bounds may not necessarily be extended by a call to the respective methods if the existing bounds are smaller or larger than the provided bounds. This confines the comparisons to these method calls, as opposed to having the caller perform the checks to determine when bounds should be updated. As these fields are set once and read multiple times, this is considered a reasonable compromise. Update all existing call sites that directly set the existing `FileMetadata` smallest / largest bound fields to instead use the new methods for extending bounds. The only remaining call sites that set fields directly are limited to the `manifest` package and one call site in `ingest.go` that needs to mutate sequence numbers. These calls sites that store values in the fields directly are explicitly commented to indicate why the fields are being set directly. One alternative considered was to unexport the smallest / largest fields and gate access behind methods. However, as mentioned above, for performance reasons it beneficial to keep these fields exported.
Currently, `FileMetadata` exposes three pairs of keys representing bounds for a table - one each for point keys, range keys and the overall bounds for a table. The pair for the overall bounds for the table are a function of the point and range pairs and could technically be derived, but instead these overall bounds are materialized on the `FileMetadata` struct directly for performance reasons (it is faster to dereference the struct fields than making a function call to perform one or more comparisons). With the addition of two pairs of bounds in cockroachdb#1521 for the point and range bounds, there is a risk that a caller forgets to set a pair of bounds, violating the invariant that the overall point and range bounds are internally consistent with the point and range key bounds. To mitigate this risk, introduce methods on `manifest.FileMetadata` that can be used to "extend" the point or range key bounds for a table. These methods will maintain the invariant that the overall bounds for the table are computed directly from the point and range bounds. The methods, `MaybeExtend{Point,Range}KeyBounds`, are intended for "optimistic" use - a table's point, range and / or overall bounds may not necessarily be extended by a call to the respective methods if the existing bounds are smaller or larger than the provided bounds. This confines the comparisons to these method calls, as opposed to having the caller perform the checks to determine when bounds should be updated. As these fields are set once and read multiple times, this is considered a reasonable compromise. Update all existing call sites that directly set the existing `FileMetadata` smallest / largest bound fields to instead use the new methods for extending bounds. The only remaining call sites that set fields directly are limited to the `manifest` package and one call site in `ingest.go` that needs to mutate sequence numbers. These calls sites that store values in the fields directly are explicitly commented to indicate why the fields are being set directly. One alternative considered was to unexport the smallest / largest fields and gate access behind methods. However, as mentioned above, for performance reasons it beneficial to keep these fields exported.
Currently, `FileMetadata` exposes three pairs of keys representing bounds for a table - one each for point keys, range keys and the overall bounds for a table. The pair for the overall bounds for the table are a function of the point and range pairs and could technically be derived, but instead these overall bounds are materialized on the `FileMetadata` struct directly for performance reasons (it is faster to dereference the struct fields than making a function call to perform one or more comparisons). With the addition of two pairs of bounds in cockroachdb#1521 for the point and range bounds, there is a risk that a caller forgets to set a pair of bounds, violating the invariant that the overall point and range bounds are internally consistent with the point and range key bounds. To mitigate this risk, introduce methods on `manifest.FileMetadata` that can be used to "extend" the point or range key bounds for a table. These methods will maintain the invariant that the overall bounds for the table are computed directly from the point and range bounds. The methods, `MaybeExtend{Point,Range}KeyBounds`, are intended for "optimistic" use - a table's point, range and / or overall bounds may not necessarily be extended by a call to the respective methods if the existing bounds are smaller or larger than the provided bounds. This confines the comparisons to these method calls, as opposed to having the caller perform the checks to determine when bounds should be updated. As these fields are set once and read multiple times, this is considered a reasonable compromise. Update all existing call sites that directly set the existing `FileMetadata` smallest / largest bound fields to instead use the new methods for extending bounds. The only remaining call sites that set fields directly are limited to the `manifest` package and one call site in `ingest.go` that needs to mutate sequence numbers. These calls sites that store values in the fields directly are explicitly commented to indicate why the fields are being set directly. One alternative considered was to unexport the smallest / largest fields and gate access behind methods. However, as mentioned above, for performance reasons it beneficial to keep these fields exported.
Currently, `FileMetadata` exposes three pairs of keys representing bounds for a table - one each for point keys, range keys and the overall bounds for a table. The pair for the overall bounds for the table are a function of the point and range pairs and could technically be derived, but instead these overall bounds are materialized on the `FileMetadata` struct directly for performance reasons (it is faster to dereference the struct fields than making a function call to perform one or more comparisons). With the addition of two pairs of bounds in #1521 for the point and range bounds, there is a risk that a caller forgets to set a pair of bounds, violating the invariant that the overall point and range bounds are internally consistent with the point and range key bounds. To mitigate this risk, introduce methods on `manifest.FileMetadata` that can be used to "extend" the point or range key bounds for a table. These methods will maintain the invariant that the overall bounds for the table are computed directly from the point and range bounds. The methods, `MaybeExtend{Point,Range}KeyBounds`, are intended for "optimistic" use - a table's point, range and / or overall bounds may not necessarily be extended by a call to the respective methods if the existing bounds are smaller or larger than the provided bounds. This confines the comparisons to these method calls, as opposed to having the caller perform the checks to determine when bounds should be updated. As these fields are set once and read multiple times, this is considered a reasonable compromise. Update all existing call sites that directly set the existing `FileMetadata` smallest / largest bound fields to instead use the new methods for extending bounds. The only remaining call sites that set fields directly are limited to the `manifest` package and one call site in `ingest.go` that needs to mutate sequence numbers. These calls sites that store values in the fields directly are explicitly commented to indicate why the fields are being set directly. One alternative considered was to unexport the smallest / largest fields and gate access behind methods. However, as mentioned above, for performance reasons it beneficial to keep these fields exported.
Currently, only point keys are tracked in the manifest. With the
addition of range keys, the bounds of an SSTable should be computed by
considering the bounds of both the point keys and the range keys, and
taking the smallest or largest across both types of key, respectively.
Add four additional fields,
{Smallest,Largest}{Point,Range}Key
, tomanifest.FileMetadata
to separately track the point and range keybounds. The existing
Smallest
andLargest
fields are used to trackthe bounds across both point and range keys.
Update the existing calls sites that set the smallest and largest keys
to set all three types of bounds: point keys, range keys and combined.