-
Notifications
You must be signed in to change notification settings - Fork 476
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sstable: add RangeKeyIter iterator #1513
Conversation
Add a range key iterator type intended to be used externally for reading the range-keys contained within an sstable. The interface is simple, providing forward iteration from beginning to end only. Each individual range key is surfaced separately, even if coalesced into single physical fragments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@erikgrinaker A simple forward iteration interface is enough, right?
Reviewable status: 0 of 6 files reviewed, all discussions resolved (waiting on @erikgrinaker, @nicktrav, and @sumeerbhola)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A simple forward iteration interface is enough, right?
Yep, that'll do, thanks! Would it be easy to allow lower/upper key bounds? Otherwise, it's trivial to just do the filtering in the MVCC range key iterator too.
Reviewable status: 0 of 6 files reviewed, all discussions resolved (waiting on @nicktrav and @sumeerbhola)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought, we'd ideally want the CRDB SSTIterator
to satisfy the SimpleMVCCIterator
interface, which uses interleaved point/range key iteration. This is based on sstable.Reader.NewIter()
. Would it be possible to reuse the interleaving logic that we get from pebble.Iterator
?
If not, nbd -- I can create separate SST iterator implementations for point and range keys, which covers my needs for now, and we can revisit this later.
Reviewable status: 0 of 6 files reviewed, all discussions resolved (waiting on @nicktrav and @sumeerbhola)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 6 of 6 files at r1, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @sumeerbhola)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought, we'd ideally want the CRDB SSTIterator to satisfy the SimpleMVCCIterator interface, which uses interleaved point/range key iteration.
Where is this sst originating from? We can't defragment across ssts in this context, so we would need to expose range keys that are not prefix keys. I'm worried about polluting the SimpleMVCCIterator
interface -- I thought we had managed to come up with a scheme such that our common iterator interfaces in the storage package would expose [start, end) pairs that were roachpb.Key
s.
Reviewed 6 of 6 files at r1, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @jbowens)
sstable/external.go, line 100 at r1 (raw file):
endKey, value, ok := rangekey.DecodeEndKey(i.iterKey.Kind(), v) if !ok { panic("pebble: unable to decode rangekey end")
I believe this can happen due to a corruption, in which case we should not panic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this sst originating from? We can't defragment across ssts in this context, so we would need to expose range keys that are not prefix keys. I'm worried about polluting the SimpleMVCCIterator interface -- I thought we had managed to come up with a scheme such that our common iterator interfaces in the storage package would expose [start, end) pairs that were roachpb.Keys.
These will e.g. be emitted by Export
or ingested by AddSSTable
, so they're generated by client and server logic. Consider e.g. streaming replication, which will reads events off of a rangefeed, build SSTs which include range tombstones, and ingest these into the target cluster. Requiring range keys to have roachpb.Key
bounds seems perfectly fine for these, and we can assert that in AddSSTable
during race test builds (or maybe always, if it's cheap enough).
The iteration is needed e.g. by backup restoration reading exported SSTs from an S3 bucket somewhere, combining them to figure out the latest visible point keys, then building an SST containing the final visible point keys to be ingested and restored. It's also needed for tests.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @jbowens)
Closing in favor of #1529. |
Add a range key iterator type intended to be used externally for reading the
range-keys contained within an sstable. The interface is simple, providing
forward iteration from beginning to end only. Each individual range key is
surfaced separately, even if coalesced into single physical fragments.