Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sstable: add RangeKeyIter iterator #1513

Closed
wants to merge 1 commit into from

Conversation

jbowens
Copy link
Collaborator

@jbowens jbowens commented Feb 15, 2022

Add a range key iterator type intended to be used externally for reading the
range-keys contained within an sstable. The interface is simple, providing
forward iteration from beginning to end only. Each individual range key is
surfaced separately, even if coalesced into single physical fragments.

Add a range key iterator type intended to be used externally for reading the
range-keys contained within an sstable. The interface is simple, providing
forward iteration from beginning to end only. Each individual range key is
surfaced separately, even if coalesced into single physical fragments.
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Collaborator Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@erikgrinaker A simple forward iteration interface is enough, right?

Reviewable status: 0 of 6 files reviewed, all discussions resolved (waiting on @erikgrinaker, @nicktrav, and @sumeerbhola)

Copy link
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A simple forward iteration interface is enough, right?

Yep, that'll do, thanks! Would it be easy to allow lower/upper key bounds? Otherwise, it's trivial to just do the filtering in the MVCC range key iterator too.

Reviewable status: 0 of 6 files reviewed, all discussions resolved (waiting on @nicktrav and @sumeerbhola)

Copy link
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, we'd ideally want the CRDB SSTIterator to satisfy the SimpleMVCCIterator interface, which uses interleaved point/range key iteration. This is based on sstable.Reader.NewIter(). Would it be possible to reuse the interleaving logic that we get from pebble.Iterator?

If not, nbd -- I can create separate SST iterator implementations for point and range keys, which covers my needs for now, and we can revisit this later.

Reviewable status: 0 of 6 files reviewed, all discussions resolved (waiting on @nicktrav and @sumeerbhola)

Copy link
Contributor

@nicktrav nicktrav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. :lgtm:

Reviewed 6 of 6 files at r1, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @sumeerbhola)

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, we'd ideally want the CRDB SSTIterator to satisfy the SimpleMVCCIterator interface, which uses interleaved point/range key iteration.

Where is this sst originating from? We can't defragment across ssts in this context, so we would need to expose range keys that are not prefix keys. I'm worried about polluting the SimpleMVCCIterator interface -- I thought we had managed to come up with a scheme such that our common iterator interfaces in the storage package would expose [start, end) pairs that were roachpb.Keys.

Reviewed 6 of 6 files at r1, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @jbowens)


sstable/external.go, line 100 at r1 (raw file):

	endKey, value, ok := rangekey.DecodeEndKey(i.iterKey.Kind(), v)
	if !ok {
		panic("pebble: unable to decode rangekey end")

I believe this can happen due to a corruption, in which case we should not panic.

Copy link
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this sst originating from? We can't defragment across ssts in this context, so we would need to expose range keys that are not prefix keys. I'm worried about polluting the SimpleMVCCIterator interface -- I thought we had managed to come up with a scheme such that our common iterator interfaces in the storage package would expose [start, end) pairs that were roachpb.Keys.

These will e.g. be emitted by Export or ingested by AddSSTable, so they're generated by client and server logic. Consider e.g. streaming replication, which will reads events off of a rangefeed, build SSTs which include range tombstones, and ingest these into the target cluster. Requiring range keys to have roachpb.Key bounds seems perfectly fine for these, and we can assert that in AddSSTable during race test builds (or maybe always, if it's cheap enough).

The iteration is needed e.g. by backup restoration reading exported SSTs from an S3 bucket somewhere, combining them to figure out the latest visible point keys, then building an SST containing the final visible point keys to be ingested and restored. It's also needed for tests.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @jbowens)

@jbowens
Copy link
Collaborator Author

jbowens commented Feb 23, 2022

Closing in favor of #1529.

@jbowens jbowens closed this Feb 23, 2022
@jbowens jbowens deleted the rk-reader-external branch September 13, 2022 20:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants