-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
row: bugfixes to cfetcher #38787
row: bugfixes to cfetcher #38787
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, thanks for the fix. I added a couple thoughts but looks good in general.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @solongordon)
pkg/sql/row/cfetcher.go, line 676 at r1 (raw file):
} key := kv.Key[len(rf.machine.lastRowPrefix):]
It still strikes me as a little dicey that lastRowPrefix
might be nil here. But I guess that is never the case if we're dealing with interleaves, so probably it's ok? Feels somewhat brittle. Should we consider always setting it in stateDecodeFirstKVOfRow
? We can do so without decoding the key by using keys.GetRowPrefixLength
.
pkg/sql/row/cfetcher.go, line 1077 at r1 (raw file):
return 0, nil } if keyLen-1 < int(columnFamilyIDLength) {
keys.go
writes this differently to avoid a potential overflow:
Lines 777 to 779 in 813b014
// Note how this next comparison (and by extension the code after it) is overflow-safe. There | |
// are more intuitive ways of writing this that aren't as safe. See #18628. | |
if colIDLen > uint64(n-1) { |
(Also maybe there is some potential to share code between the two.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @solongordon)
pkg/sql/row/cfetcher.go, line 676 at r1 (raw file):
Previously, solongordon (Solon) wrote…
It still strikes me as a little dicey that
lastRowPrefix
might be nil here. But I guess that is never the case if we're dealing with interleaves, so probably it's ok? Feels somewhat brittle. Should we consider always setting it instateDecodeFirstKVOfRow
? We can do so without decoding the key by usingkeys.GetRowPrefixLength
.
Actually if we did this, I think your new getCurrentColumnFamilyID
function would be unnecessary. You could just decode the next int to get the column family like the code was doing before.
ef598f5
to
1f05450
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @solongordon)
pkg/sql/row/cfetcher.go, line 676 at r1 (raw file):
Previously, solongordon (Solon) wrote…
Actually if we did this, I think your new
getCurrentColumnFamilyID
function would be unnecessary. You could just decode the next int to get the column family like the code was doing before.
Good idea, though it does make me sad that we're sacrificing even the tiniest bit of performance in the COUNT(*) case. The thing that I added I think manages to avoid doing work if the max column family is 0. But I think you're right anyway - I'll keep getCurrentColumnFamilyID
and have it assume that the row prefix is set, return 0 if it can, and otherwise decode.
pkg/sql/row/cfetcher.go, line 1077 at r1 (raw file):
Previously, solongordon (Solon) wrote…
keys.go
writes this differently to avoid a potential overflow:
Lines 777 to 779 in 813b014
// Note how this next comparison (and by extension the code after it) is overflow-safe. There // are more intuitive ways of writing this that aren't as safe. See #18628. if colIDLen > uint64(n-1) {
(Also maybe there is some potential to share code between the two.)
Ah, I was looking for this code but couldn't find it. Now we are using the shared method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I accidentally amended the fix to the wrong commit, then fixed it and force pushed, hopefully things make sense.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @solongordon)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like DecodeKeyValsToCols
is always invoked with a nil unseen
argument. Is something weird?
Reviewable status:
complete! 0 of 0 LGTMs obtained
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained
pkg/sql/row/cfetcher.go, line 842 at r3 (raw file):
table.extraTypes, nil, &rf.machine.remainingValueColsByIdx,
This is the sneaky spot where it's getting called with non-nil - reviewable is bad at showing it
1f05450
to
09d2557
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis)
pkg/sql/row/cfetcher.go, line 842 at r3 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
This is the sneaky spot where it's getting called with non-nil - reviewable is bad at showing it
Aha, thanks. Looks good to go then.
System tables are special in that they might be missing their "sentinel kv": the kv that's marked as column family 0. Previously, cfetcher assumed that a sentinel kv would always exist. This led to the inability of decodinig system tables correctly. Now, if the table being read has more than one column family, we always actively decode the column family id from the first key in a row to make sure we know what values will be available in the kv. Release note: None
Previously, the cfetcher misbehaved when decoding unique secondary indexes that had null values in one or more of their index columns. This was because the encoding of the "extra columns" in a unique index (the columns that are in the primary index but missing from the secondary) is performed using the key encoding, not the value encoding, and the key decoding routines in the cfetcher hadn't been taught to keep track of which needed columns hadn't been seen in the current row yet. This commit corrects the problem by teaching the key decoding routine to optionally remove the decoded keys from the set of unseen column values so far. Release note: None
09d2557
to
4859dec
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
(I uploaded a fix to an unfinished comment in the first commit, hence the final push.)
bors r+
Reviewable status:
complete! 0 of 0 LGTMs obtained
bors r+ |
38787: row: bugfixes to cfetcher r=jordanlewis a=jordanlewis Fixes #38755. Fixes #38611. Fixes #38609. Two bugfixes to the cfetcher. - row: teach cfetcher about missing sentinel kvs System tables are special in that they might be missing their "sentinel kv": the kv that's marked as column family 0. Previously, cfetcher assumed that a sentinel kv would always exist. This led to the inability of decoding system tables correctly. Now, if the table being read has more than one column family, we always actively decode the column family id from the first key in a row to make sure we know what values will be available in the kv. - row: cfetcher bugfix to unique idxs w/ nulls Previously, the cfetcher misbehaved when decoding unique secondary indexes that had null values in one or more of their index columns. This was because the encoding of the "extra columns" in a unique index (the columns that are in the primary index but missing from the secondary) is performed using the key encoding, not the value encoding, and the key decoding routines in the cfetcher hadn't been taught to keep track of which needed columns hadn't been seen in the current row yet. This commit corrects the problem by teaching the key decoding routine to optionally remove the decoded keys from the set of unseen column values so far. Release note: None Co-authored-by: Jordan Lewis <[email protected]>
Build succeeded |
Fixes #38755.
Fixes #38611.
Fixes #38609.
Two bugfixes to the cfetcher.
row: teach cfetcher about missing sentinel kvs
System tables are special in that they might be missing their "sentinel
kv": the kv that's marked as column family 0. Previously, cfetcher
assumed that a sentinel kv would always exist. This led to the inability
of decoding system tables correctly.
Now, if the table being read has more than one column family, we always
actively decode the column family id from the first key in a row to make
sure we know what values will be available in the kv.
row: cfetcher bugfix to unique idxs w/ nulls
Previously, the cfetcher misbehaved when decoding unique secondary
indexes that had null values in one or more of their index columns. This
was because the encoding of the "extra columns" in a unique index (the
columns that are in the primary index but missing from the secondary) is
performed using the key encoding, not the value encoding, and the key
decoding routines in the cfetcher hadn't been taught to keep track of
which needed columns hadn't been seen in the current row yet.
This commit corrects the problem by teaching the key decoding routine to
optionally remove the decoded keys from the set of unseen column values
so far.
Release note: None