-
Notifications
You must be signed in to change notification settings - Fork 473
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sstable: fix twoLevelIterator SeekGE optimization #2051
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I owe some unit tests for this fix. This fixes the specific metamorphic test failure. I'll let stressmeta run for a few hours.
Reviewable status: 0 of 4 files reviewed, all discussions resolved (waiting on @jbowens)
f1ccf3b
to
aa77dab
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some longer commentary about our iterator optimizations and cleaned up some invariant checking. I am feeling more confident about this now. Could use a quick look while I work on adding unit tests.
Reviewable status: 0 of 4 files reviewed, all discussions resolved (waiting on @jbowens)
a456811
to
49fb17a
Compare
Added a unit test |
Metamorphic testing with the temporary changes mentioned in #2059 was ok for 16+ hours, but failed without this fix (the failure mentioned in that issue). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 4 files at r1, 8 of 11 files at r2, 5 of 6 files at r3, 2 of 2 files at r4, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @sumeerbhola)
sstable/reader.go
line 1692 at r4 (raw file):
// This invariant checking is important enough that we do not gate it // behind invariants.Enabled. if true {
Is this if true
block temporary?
sstable/reader.go
line 1694 at r4 (raw file):
if true { if i.boundsCmp > 0 { flags = flags.DisableTrySeekUsingNext()
should we leave a TODO to fix TestIterRandomizedMaybeFilteredKeys
?
The current TrySeekUsingNext optimization had a bug where a first SeekGE could set exhaustedBounds=+1, and the next monotonic twoLevelIterator.SeekGE (without changing bounds) would set exhaustedBounds back to 0, call singleLevelIterator.SeekGE which would immediately return because of i.data.isDataInvalidated() (and TrySeekUsingNext), which would cause the twoLevelIterator to step the top-level index iterator. This woud break a subsequent optimization when the bounds were monotonically advanced, since the top-level index iterator had been advanced too far. See cockroachdb#2036 (comment) for an example of the bug. Given that both singleLevelIterator and twoLevelIterator use the exhaustedBounds value from the previous SeekGE call for the TrySeekUsingNext optimization, the twoLevelIterator should be selective on when it resets exhaustedBounds to 0. The logic behind when this can be done is not complex, so should be maintainable. This bug should only occur when using block property filters with the monotonic bound optimization: the second SeekGE in the above, which moves the top-level index too far forward would not happen if the singleLevelIterator was actually loading data blocks, since the singleLevelIterator would do some work and set exhaustedBound back to +1. The PR also constrains this TrySeekUsingNext fast path to the case where i.err == nil. This was an oversight, though very unlikely to happen in practice. There are todos added to make further changes after we backport this change: - The twoLevelIterator.SeekPrefixGE code does not currently utilize TrySeekUsingNext. There is no reason to hold back on this optimization. - The cases where the twoLevelIterator is already exhausted are not optimized. We will wastefully reseek the top level index. Fixes cockroachdb#2036
49fb17a
to
fc37e52
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
Reviewable status: 16 of 17 files reviewed, 2 unresolved discussions (waiting on @jbowens)
sstable/reader.go
line 1692 at r4 (raw file):
Previously, jbowens (Jackson Owens) wrote…
Is this
if true
block temporary?
Removed
sstable/reader.go
line 1694 at r4 (raw file):
Previously, jbowens (Jackson Owens) wrote…
should we leave a TODO to fix
TestIterRandomizedMaybeFilteredKeys
?
Added a TODO
Cannot reproduce the |
b26ee3c7 wip 78369717 debug 5c65b3d5 Fix lint errors from cockroachdb/pebble#2051 4dc277f9 crossversion: use op-count distribution with low lower bound e1da16a7 *: Fix race in compactMarkedFilesLocked with concurrent compactions dd12a228 vfs: default to Unix semantics in MemFS caa5fcde db: adaptive size-based rollover logic for manifest e3a60a87 sstable: fix twoLevelIterator SeekGE optimization 181258e4 db: integrate elastic cpu granter with pebble f34af25a .github: renable crossversion metamorphic CI action
b26ee3c7 wip 78369717 debug 5c65b3d5 Fix lint errors from cockroachdb/pebble#2051 4dc277f9 crossversion: use op-count distribution with low lower bound e1da16a7 *: Fix race in compactMarkedFilesLocked with concurrent compactions dd12a228 vfs: default to Unix semantics in MemFS caa5fcde db: adaptive size-based rollover logic for manifest e3a60a87 sstable: fix twoLevelIterator SeekGE optimization 181258e4 db: integrate elastic cpu granter with pebble f34af25a .github: renable crossversion metamorphic CI action Release note: None
37cf5274 crossversion: explicitly set artifacts directory fd7e9660 vfs: use `Frsize` for computing total size on Linux 5c65b3d5 Fix lint errors from cockroachdb/pebble#2051
37cf5274 crossversion: explicitly set artifacts directory fd7e9660 vfs: use `Frsize` for computing total size on Linux 5c65b3d5 Fix lint errors from cockroachdb/pebble#2051 Release note: None
86968: sql: SHOW QUERIES lazily interpolates placeholders r=jordanlewis a=jordanlewis SHOW QUERIES (and crdb_internal.node_queries, cluster_queries) interpolates placeholder values into the statement so that it is possible to see the placeholder values of a prepared statement - but it used to do this unconditionally during statement execution. This is an expensive process that spends a lot of CPU for little reason, since the interpolation was happening in the hot path of every query. Now, we include the placeholder values as a separate array in the internal representation of active queries, and interpolate the values only when they're being pulled out to examine, to avoid the unconditional runtime interpolation costs. As a side effect of this change, the original comments in a query are now included in SHOW QUERIES and the two active queries tables. Release note (sql change): the query field in the crdb_internal.node_queries, crdb_internal.cluster_queries, and SHOW QUERIES commands now includes the original comments in the queries. Informs CRDB-17299 89753: diagrams: Diagrams job no longer fails when no diagrams change r=nickvigilante a=nickvigilante If there are PRs that don't affect the diagrams or their related files, the Publish SQL Grammar Diagrams job will fail, because `git commit` produces a non-zero exit code if there are no files changed in the working directory. This fixes that problem. Release note: None Fixes DOC-5642 90784: ui: simplify insights sagas, fix contention query filter r=ericharmeling a=ericharmeling This PR simplifies the insights sagas by moving the insights store refresh interval to the components and removing the root-level reset saga. The commit also updates the SQL query for the transaction contention events to not JOIN with the internal insights table, and to filter out all unresolved txn fingerprint IDs. Fixes #90142. Release note: None Loom: https://www.loom.com/share/f12685c712124560b326f211b44a06d1 90800: server: use GRANT statement for test server auth user role grant r=knz,rafiss a=andyyang890 Previously, the admin role is granted to the auth user by manually inserting a row into system.role_members. This change replaces the row insertion with a GRANT statement to abstract away the interaction with system tables. Part of #87079 Release note: None 90812: bazel: set correct flags for `printf` analyzer r=rail,healthy-pod a=rickystewart Closes #80006. Release note: None Epic: CRDB-8349 90838: vendor: bump Pebble to 37cf5274896b r=nicktrav a=jbowens ``` 37cf5274 crossversion: explicitly set artifacts directory fd7e9660 vfs: use `Frsize` for computing total size on Linux 5c65b3d5 Fix lint errors from cockroachdb/pebble#2051 ``` Epic: none Release note: None Co-authored-by: Jordan Lewis <[email protected]> Co-authored-by: Nick Vigilante <[email protected]> Co-authored-by: Eric Harmeling <[email protected]> Co-authored-by: Andy Yang <[email protected]> Co-authored-by: Ricky Stewart <[email protected]> Co-authored-by: Jackson Owens <[email protected]>
65c7d5d5 Fix lint errors from cockroachdb/pebble#2051 29627fc3 vfs: use `Frsize` for computing total size on Linux
``` 65c7d5d5 Fix lint errors from cockroachdb/pebble#2051 29627fc3 vfs: use `Frsize` for computing total size on Linux ``` Release note (bug fix): When running Cockroach inside of a Docker container on macOS, when mounting a host filesystem into the container, the total available capacity calculation of the filesystem could be reported incorrectly.
…73801-b33d6e173cae to v0.0.0-20221103174942-7486bbcf2861 * 7486bbcf db: fix levelIter TrySeekUsingNext optimization * 13fbed20 sstable: optimize SeekPrefixGE for two-level indices to use trySeekUsingNext * 948d4a01 db: allow for tuning point tombstone weight * 8b31184d db: add Experimental.LevelMultiplier option * 418733ae db: add LazyValue for a value that may not be stored in-place with the key * b53472d0 db: apply TrySeekUsingNext to level metadata * b5deee5d tool: print input, output, deleted and moved bytes in compaction tool * 37cf5274 crossversion: explicitly set artifacts directory * fd7e9660 vfs: use `Frsize` for computing total size on Linux * 5c65b3d5 Fix lint errors from cockroachdb/pebble#2051 * 4dc277f9 crossversion: use op-count distribution with low lower bound * e1da16a7 *: Fix race in compactMarkedFilesLocked with concurrent compactions * dd12a228 vfs: default to Unix semantics in MemFS * caa5fcde db: adaptive size-based rollover logic for manifest * e3a60a87 sstable: fix twoLevelIterator SeekGE optimization * 181258e4 db: integrate elastic cpu granter with pebble * f34af25a .github: renable crossversion metamorphic CI action * b481a1e2 crossversion: copy artifacts to artifacts dir on failure * 3d3cb0bf crossversion: propagate verbosity flag * d436e7fa scripts: write test binaries to a temporary dir * b3a9ee4d scripts: allow running crossversion script from non-master branches * cc0dd372 manifest: unref files with range keys when version refcnt is zero * 653147d0 metamorphic: Flush before close if DisableWAL is true * ff933d45 scripts: default to a 30m timeout in non-stress run-crossversion-meta.sh * 24f67a4b db: fix FormatPrePebblev1Marked migration * 1424c12d metamorphic: Copy data/wal dirs from initial state * 5ed983e5 sstable: adjust sizeEstimate logic for forthcoming value blocks * 0090519b metrics: expose fsync latency as prometheus histogram * bdc28bc8 db: truncate range tombstones to containing file's atomic unit * 53ab7685 scripts: add script for running the crossversion metamorphic tests * 7e037ade tool: add --filter-{start,end} flags * b1828107 metamorphic/crossversion: avoid linking test binaries * f180ea4a metamorphic/crossversion: copy temp dir on failure * b6b30d5f metamorphic/crossversion: use LinkOrCopy * 5fdb3ead metric: expose pebble fsync latency with a callback * a4b203ef metrics: remove deprecated InternalIntervalMetrics method * f122ff49 tool: capture stdout / stderr * eec7375f db: return error from Close if open snapshots remain * 499bd887 internal/lint: remove support for Go 1.16 * 518ead0d build: remove vendor directory * dea76ba4 db: add IteratorStats.Merge * 781ebeec db: fix SeekPrefixGE TrySeekUsingNext optimization with batch * 18d98601 db: restore TrySeekUsingNext across SetOptions batch refresh * 8b2af8f1 metrics: expose InternalIntervalMetrics as cumulative metrics * 0e35983c Makefile: report diff in format-check * e321ebef db: always return RangeKeyChanged()=false when range keys disabled * c1e4a7c6 db: update datadriven iterator test format * 5707a6bc sstable: correct comment describing boundsCmp case * 829c25fa db: fix RangeKeyChanged and -WithLimit interaction * 0f464e9c sstable: fix interaction between bpf and monotonic bounds optimization * 46a6a539 sstable: remove commented code * 9a552270 db: fix TestRangeKeyMaskingRandomized * 8418ebc2 build: bump min go version to 1.17 * 5f8eb821 *: remove references to `ioutil` * 05a4b29c db: expand iter_histories test coverage * 2855ba7c db: refactor TestRangeKeys into TestIterHistories * 63300403 db: enable TrySeekUsingNext after Next in external iters * fa910870 db: add ExternalIter_NonOverlapping_SeekNextScan benchmark * 1d444f36 sstable: include more blocks' stats in BlockBytes * d8f4eb38 docs: fix date for struct zeroing annotation * c04d1287 metamorphic: always synchronize Clone on underlying Batch * 81a4342c docs: add benchmark annotation for #1822 * c65b4143 docs: move rfcs from docs/rfcs to docs/RFCs |\ | * a2506ee5 Move files from docs/rfcs to docs/RFCs |/ * 3bbd4281 db: fix interaction between no-op optimizations and RangeKeyChanged * f079ef51 docs: draft design sketch for ranged keys * d9320130 db: fix opts.Logger race in TestSetOptionsEquivalence * 10b967f6 db: ensure RangeKeyChanged is true after no-op SetBounds * c2017dee db: stress rangekey masking implementation in a randomized test * 20a40b51 db: enforce batch point visibility in the merging iterator * ade651d0 metamorphic: add test option to disable lazy-combined iteration * 3358c2be internal/keyspan: remove unnecessary allocations * eb5c079c metamorphic: synchronize iterator ops with indexed batch * 33c108eb metamorphic: add limited concurrency * acc59654 *: Add random test for iterator seek - BlockPropertyFilters interaction * 3f835a3d metamorphic: test RangeKeyChanged=false in the metamorphic tests * 4ab85a4d metamorphic: increase probability of valid SeekPrefixGE calls * c2dc3de2 *: don't close pointIter twice if bypassing merging iter * b38417b0 db: wait for table cache releasing goroutines to exit * c751cef4 *: optimize external iters for single-file, no-rangekey cases * e729e744 metamorphic: test Clone with Options * 4cc0974f *: propagate pointer to InternalIteratorStats * 9c2bc013 metamorphic: print out correct options name on test failure * b83dd2c7 *: fix error handling in simpleLevelIter * 70943038 *: use Equal for more efficient equality comparisons * 7030aaab metamorphic: fix iterSeekLTWithlimit * 89408b27 tool: support range keys * 406c1dce *: add simpleLevelIterator, reduce merging levels in external iter * cd7f076e metamorphic: reposition iterator after SetOptions * 5f6b4325 db: make RangeKeyMasking.Filter a constructor function * 358f750b db: fix Close race with WAL rotation * eff82bdf db: fix prefix iteration invariant violation * 09c6e030 internal/metamorphic: improve SetOptions coverage * 358c749e db: add option to force memtable flush of range keys * 715b5645 db: expand TestSetOptionsEquivalence code coverage * d390a1c5 metamorphic: set iters last bounds on iter creation * c04897ea db: invalidate iterator in SetOptions if range keys are present * 4e1bc304 internal/keyspan: fix SeekPrefixGE span reuse nondeterminism * 94bbf143 db: invalidate RangeKeyChanged state in SetOptions * 0b6ec868 sstable: pool range key fragment iterators * be796cfc db: handle Next on exhausted prefix iterator * 4c5e752d internal/keyspan: fix prefixed InterleavingIter.InitSeekGE * 84fa6a01 db: add Metrics.Keys.RangeKeySetsCount * f4b082db db: avoid defragmenting beyond prefix bounds during prefix iteration * c135b6df metamorphic: use pointer receiver * c06e0793 db: add format major versions and migrations for min table format * f0edf9c7 manifest: make `FileMetadata.Compacting` an enum * bce4897b db: skip TestCompactFlushQueuedMemTableAndFlushMetrics on windows * 67f552b7 internal/rangekey: enforce iterator bounds beneath defragmentation * cb25d247 db: add Iterator.RangeKeyChanged * 1e4c2b60 sstable: add BlockIntervalFilter.Init * 1afadc20 internal/rangekey: sort keys by suffix once during iteration * eb306b9c db: only switch to combined iteration RangeKeySet keys * 6cdfee97 db: fix keyspan SeekGE optimization with indexed batch * a678bde7 compaction: avoid unnecessary inuse keyspan comparisons * 26964e60 compaction: avoid compactionIter key comparison when possible * 69c6e876 internal/keyspan: fix interleaving SeekGE optimization * 04f9100f internal/keyspan: avoid unnecessarily re-seeking keyspan iter * 485b1376 ci: test against go1.19 * 9de3a89f .*: make MaxConcurrentCompactions dynamically changeable * 9f3e4216 internal/keyspan: avoid buffering empty span during DefragmentingIter seeks * a4f88b79 sstable: disallow identical point keys * a1e7a77e db: avoid unnecessary key comparisons during iteration * 1c415c00 *: update range-key masking docs * 5a2941fb db: use block properties for efficient range-key masking * 245afda9 db: add FormatMinTableFormatPebblev1 format major version * d3b3fb83 db: introduce a minimum table format for each format major version * 1b01c735 sstable: allow rewriting suffixes of tables that contain only range keys * ce87a66e internal/keyspan: avoid allocation in SortKeys * 1729bc93 db: use pointer-receiver for range key sorts * 7b78c71e db: zero mergingIter, levelIter structs before use * 3fc374e4 db: fix memtable.containsRangeKeys race * a33b5a78 internal/keyspan: tighten masking guarantees * 73eee313 internal/keyspan: add SpanChanged hook calls to datadriven tests * 89b7bc72 db: lazily construct combined iterator * 4adbba57 internal/base: add SeekLTFlags * 97fbe8e2 internal/base: replace trySeekUsingNext with SeekGEFlags * 63d55279 lint: use `go run path@version` syntax for lint binaries * 6b9be031 internal/metamorphic: add block-property filter coverage * 96254cc0 db: don't filter range deletions based on point-key block filters * 13cc5cde db: refactor levelIter/mergingIter boundary interface * 44e97f43 sstable: fix additional bounds check bugs * e9467bc8 db: export ErrCorruption * 9f3691a7 db: remove compaction/flush pacing * b821342a db: remove compaction.atomicBytesIterated * d3484a60 Revert "github: add action runner for automating issue labeling" * d68c1522 github: add action runner for automating issue labeling * 4a886b7f db: remove Experimental designation on range-key write interface * 1a0819b1 Makefile: add crossversion-meta * 1f4dfe6f *: treat manifest-related errors during logAndApply as fatal * 15565f7b db: support configuring iterator during Clone * d01a410c compaction: test rebuilding L0Sublevels on logAndApply failure * f7cda06e rangekey: use package-local return type * 3a4ad0d3 *: clean up range key test cases * 85bb1c75 rangekey: create `rangekey` package * 0d272ec1 db: fix bug in deletion estimate computations * 71d17c2a sstable: Rewrite suffixes of range keys in RewriteKeySuffixes * 0dc6267d *: Rebuild L0Sublevels if version application fails * 408de3d7 Merge pull request #1752 from joshimhoff/add_pretty_keys |\ | * 04efab2e tool: add pretty-printed keys to the LSM visualization tool * | 69ce958e sstable: Export internal/rangekey decoding functions * | fe63c089 db: add Len method to Batch (#1740) * | 20e506ca *: Allocate keyspan.LevelIters inside UserIteratorConfig * | 059c072f db: consider range keys in delete-only compactions * | 8f6675bc db: add tableRangedDeletionIter for iteration of ranged deletion spans * | 2798eb73 internal/manifest: Create new LevelMetadata for range keys * | c2c02730 db: fix SetOptions batch range{del,key} refresh logic * | 2aebabad sstable: remove obsolete linter ignore directive * | dc391ef8 docs: use local scale by default on nightly benchmarks page * | ef1ca573 compaction: elision-only compactions for tables with only range keys * | 4a952c0d db: document Batch as unsafe for concurrent access * | 9a8e4742 db: support SetOptions on iterators created through NewExternalIter * | f13de498 db: fix nil-pointer in external iterator range key iteration * | 65702e71 db: avoid allocation in newIters * | f185d7fa db: remove RangeKeysArena * | ae99f4f1 *: start reading range keys from sstables * | d72083df *: implement compaction/flushing of range keys * | a015e5a0 db: exclude empty batch range deletion, range key iterators |/ * 76a1fffe Merge pull request #1603 from msbutler/butler-nlevel-compaction |\ | * 8b7a68b8 compaction: introduce multi level compaction mechanics * | 67f2653a *: pass keyspan.Spans by pointer not value * | 83b45b0a internal/keyspan: refactor InterleavingIter bounds checking * | c3053e0b internal/keyspan: manually inline {start,end}Bound in MergingIter * | 728722c5 internal/keyspan: refactor DefragmentingIter's invalid span checks * | a5d4e00e db: add BenchmarkIteratorSeekNoRangeKeys * | 96dc71db internal/keyspan: preallocate merging iter levels, items * | 86dd6fd5 internal/keyspan: replace calls to Span.Visible |/ * ad44a62e db: ensure HasPointAndRange returns false after SetOptions * 4f3c702c db: clean up and optimize Iterator.{set,save}RangeKey * 4406dabc sstable: fix bug with two-level indexes, block properties and iter bounds * e7847d68 docs: add benchmark annotation for data quality issue * 1fdc5c05 vfs: add option to turn off sync-on-close in syncingFile * d5bc42c2 db: add Pebble interface for CPU slot granter * 478b9750 sstable: fix test suffix collector's handling of range keys * cb41b17a Replace `github.com/codahale/hdrhistogram` with `github.com/HdrHistogram/hdrhistogram-go` * bb2c1501 tool/logs: print total flush / ingestion counts and durations * d1cf3412 tool/logs: sort output by (start, node, store) * 0a77780a tool/logs: fix total time calculation * 0011c057 db: fix data race in SeekPrefixGE * 99f35c80 internal/keyspan: gate invariant checks behind build tag * 00302bfc internal/keyspan: use pointer receiver for most Span methods * 3355a02e internal/rangekey: reduce range-key iteration allocations * 4e33626f db: fix bug in interaction between Clone and indexed batches * b14ad704 metamorphic: enable writer parallelism for randomzied metamorphic runs * 9bd43143 db: reduce allocations in range key iteration * e32e94d8 db: standardize behavior of indexed batch mutations during iteration * a07efd96 internal/keyspan: lift visibility filtering out of Seek{GE,LE} * e567fec8 db: ensure Open closes opened directories on error * b8c9a560 internal/metamorphic: overwrite unused bounds buffers * 7c5f0cbb db: copy user-provided bounds and optimize unchanged bounds * f8897076 *: Add IterOption to optionally read L6 filter blocks. * d79f9617 vfs: extend disk-health checking to filesystem metadata operations * 5ae21746 db: remove newRangeKeyIter closure * 6d975489 db: add BenchmarkIteratorScan * 782d102e sstable: fix invariant check for sstable size estimation * 738a7f0b db: fix NewIter regression resulting in extra memtable levels * 37558663 *: Use keyspan.LevelIter for rangedels in compactions * e6c60c71 db: use sublevel level iters for all compactions out of L0 * d8f63bef (upstream/mufeez/flushable-ingest) db: extend documentation on ingested file overlap; improve test cases * b9e970a8 internal/keyspan: correct and document MergingIter key stability * 498177bb internal/keyspan: collapse fragmentBoundIterator into MergingIter
d3600695 db: prevent ingestions from breaking seq num invariants f35ca8f2 .github: renable crossversion metamorphic CI action 0b746936 db: fix ingest's handling of exclusive range-key boundaries 65c7d5d5 Fix lint errors from cockroachdb/pebble#2051 29627fc3 vfs: use `Frsize` for computing total size on Linux
The current TrySeekUsingNext optimization had a bug where a first SeekGE could set exhaustedBounds=+1, and the next
monotonic twoLevelIterator.SeekGE (without changing bounds) would set exhaustedBounds back to 0, call singleLevelIterator.SeekGE which would immediately return because of i.data.isDataInvalidated() (and TrySeekUsingNext), which would cause the twoLevelIterator to step the top-level index iterator.
This woud break a subsequent optimization when the bounds were monotonically advanced, since the top-level index iterator had been advanced too far. See #2036 (comment) for an example of the bug.
Given that both singleLevelIterator and twoLevelIterator use the exhaustedBounds value from the previous SeekGE call for the TrySeekUsingNext optimization, the twoLevelIterator should be selective on when it resets exhaustedBounds to 0. The logic behind when this can be done is not complex, so should be maintainable.
This bug should only occur when using block property filters with the monotonic bound optimization: the second SeekGE in the above, which moves the top-level index too far forward would not happen if the singleLevelIterator was actually loading data blocks, since the singleLevelIterator would do some work and set exhaustedBound back to +1.
The PR also constrains this TrySeekUsingNext fast path to the case where i.err == nil. This was an oversight, though very unlikely to happen in practice.
There are todos added to make further changes after we backport this change:
Fixes #2036