internal/manifest: efficient range-key fileMetadata seeks #1742
Comments
I'm leaning towards the second option. I worry the first won't end up being performant enough.
itsbilal added a commit to itsbilal/pebble that referenced this issue (Jun 16, 2022)
This change creates a new LevelMetadata btree just for sstables containing range keys. This btree is incrementally updated with sstable additions/deletions that contain range keys. This should have a relatively light overhead, as range keys and files containing range keys are expected to be rare. This also has the benefit of reducing the time it takes to determine whether files contain range keys. Fixes cockroachdb#1742.
itsbilal added a commit that referenced this issue (Jun 16, 2022)
This change creates a new LevelMetadata btree just for sstables containing range keys. This btree is incrementally updated with sstable additions/deletions that contain range keys. This should have a relatively light overhead, as range keys and files containing range keys are expected to be rare. This also has the benefit of reducing the time it takes to determine whether files contain range keys. Fixes #1742.
In designing range keys, we decided to store range keys in the same sstables as point keys to preserve the LSM invariant relationship between the two. However, this intermingling poses a performance obstacle.
The range key and point key iterator stacks are distinct. A seek on a combined iterator must seek both stacks, so each level's metadata is searched twice to find the correct file. Because the two stacks are separate, their iterators care about distinct but overlapping sets of sstables (e.g., only those that hold point keys and only those that hold range keys).
For range keys, which are expected to be rare, wading through many point-key-only sstables adds unnecessary key comparisons. Additionally, some levels may not contain any range keys at all, and it would be preferable to exclude those levels from the range key iterator stack altogether.
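To make the cost concrete, here is a minimal Go sketch of a range-key seek over a level whose files are mostly point-key-only. Everything in it (`fileMeta`, `seekGE`, `seekRangeKeyFile`, `rangeKeyLevels`) is a hypothetical simplification for illustration and not Pebble's actual API; a sorted slice stands in for the level's metadata.

```go
package main

import (
	"bytes"
	"sort"
)

// fileMeta is a hypothetical, simplified stand-in for an sstable's metadata.
type fileMeta struct {
	smallest, largest []byte
	hasRangeKeys      bool
}

// seekGE returns the index of the first file whose largest key is >= key,
// plus the number of key comparisons performed. files must be sorted by key
// and non-overlapping.
func seekGE(files []fileMeta, key []byte) (idx, comparisons int) {
	idx = sort.Search(len(files), func(i int) bool {
		comparisons++
		return bytes.Compare(files[i].largest, key) >= 0
	})
	return idx, comparisons
}

// seekRangeKeyFile seeks over ALL files in the level, then steps forward past
// files that hold only point keys. skipped counts the files the range key
// iterator had to wade through without finding any range keys.
func seekRangeKeyFile(files []fileMeta, key []byte) (idx, comparisons, skipped int) {
	idx, comparisons = seekGE(files, key)
	for idx < len(files) && !files[idx].hasRangeKeys {
		idx++
		skipped++
	}
	return idx, comparisons, skipped
}

// rangeKeyLevels returns the indices of levels that contain at least one file
// with range keys. Only these levels would need a place in the range key
// iterator stack.
func rangeKeyLevels(levels [][]fileMeta) []int {
	var out []int
	for l, files := range levels {
		for _, f := range files {
			if f.hasRangeKeys {
				out = append(out, l)
				break
			}
		}
	}
	return out
}
```

With eight files of which only the seventh holds range keys, a seek to a key in the second file lands on index 1 and then skips five point-key-only files before reaching the one the range key iterator cares about. Note also that `rangeKeyLevels` itself scans every file, which is the kind of work a precomputed structure would avoid.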
A couple of potential solutions:
1. Annotate B-tree subtrees with whether they contain range keys. Through `manifest.LevelIterator` and its `Filter` method, these annotations could also be used to skip subtrees that contain no range keys. The details of this approach might be tricky.
2. Maintain a parallel `[NumLevels]LevelMetadata` on `manifest.Version`. This parallel LSM would be a projection of the main LSM, holding only sstables that contain range keys. It would be incrementally updated alongside the combined LSM.
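A rough Go sketch of the parallel-projection idea follows. It is an illustration under stated assumptions, not Pebble's implementation: `version`, `levelMetadata`, `fileMeta`, `addFile`, and `deleteFile` are hypothetical names, a sorted slice stands in for the per-level B-tree, and the version is mutated in place for brevity rather than being derived by applying an edit to a previous version.

```go
package main

import "sort"

// numLevels is an assumed level count for this sketch.
const numLevels = 7

// fileMeta is a hypothetical, simplified stand-in for sstable metadata.
type fileMeta struct {
	fileNum      uint64
	smallest     string
	hasRangeKeys bool
}

// levelMetadata keeps a level's files sorted by smallest key. A sorted slice
// stands in for a B-tree here.
type levelMetadata struct {
	files []*fileMeta
}

// insert adds f at its sorted position.
func (l *levelMetadata) insert(f *fileMeta) {
	i := sort.Search(len(l.files), func(i int) bool {
		return l.files[i].smallest >= f.smallest
	})
	l.files = append(l.files, nil)
	copy(l.files[i+1:], l.files[i:])
	l.files[i] = f
}

// remove deletes the file with the given number, if present.
func (l *levelMetadata) remove(fileNum uint64) {
	for i, f := range l.files {
		if f.fileNum == fileNum {
			l.files = append(l.files[:i], l.files[i+1:]...)
			return
		}
	}
}

// version holds the combined LSM plus a projection containing only the files
// that have range keys.
type version struct {
	levels         [numLevels]levelMetadata
	rangeKeyLevels [numLevels]levelMetadata
}

// addFile updates the combined level and, only when the file contains range
// keys, the projection. Point-key-only files never touch the projection.
func (v *version) addFile(level int, f *fileMeta) {
	v.levels[level].insert(f)
	if f.hasRangeKeys {
		v.rangeKeyLevels[level].insert(f)
	}
}

// deleteFile mirrors addFile for removals.
func (v *version) deleteFile(level int, f *fileMeta) {
	v.levels[level].remove(f.fileNum)
	if f.hasRangeKeys {
		v.rangeKeyLevels[level].remove(f.fileNum)
	}
}

// levelHasRangeKeys reports whether a level needs to appear in the range key
// iterator stack at all.
func (v *version) levelHasRangeKeys(level int) bool {
	return len(v.rangeKeyLevels[level].files) > 0
}
```

In this sketch the range key iterator would seek only within `rangeKeyLevels`, so point-key-only files add no comparisons, and an empty projected level can be left out of the stack with a length check. The extra bookkeeping is paid only when a file with range keys is added or removed.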