coldata: fix updating offsets of bytes in Batch.SetLength
The `SetLength` method maintains the invariant that the offsets of
`Bytes` vectors form non-decreasing sequences. Previously, this was done
incorrectly when a selection vector was present on the batch: offsets
were updated only for the first `n` elements, although the selection
vector can refer to larger indices. This could lead to out-of-bounds
errors (caught by our panic-catcher) some time later. The fix updates
the offsets up to the largest index referenced by the selection vector.
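
To illustrate the invariant, here is a minimal sketch using a simplified,
hypothetical model of a flat bytes vector. The `flatBytes` type and its
methods are assumptions made for this example and are not the actual
`coldata.Bytes` implementation. It shows how updating offsets only for the
first `n` slots can leave an element selected at a larger index with a
start offset greater than its end offset.

```go
package main

import "fmt"

// flatBytes is a hypothetical stand-in for a Bytes vector: element i
// occupies data[offsets[i]:offsets[i+1]].
type flatBytes struct {
	data      []byte
	offsets   []int32
	maxSetIdx int // largest index written so far, -1 if none
}

func newFlatBytes(capacity int) *flatBytes {
	return &flatBytes{offsets: make([]int32, capacity+1), maxSetIdx: -1}
}

// set stores v as element i; indices must be set in increasing order.
func (b *flatBytes) set(i int, v []byte) {
	b.updateOffsets(i)
	b.data = append(b.data, v...)
	b.offsets[i+1] = int32(len(b.data))
	b.maxSetIdx = i
}

// updateOffsets makes offsets[0..n] non-decreasing by copying the last
// valid offset into the slots that were never written.
func (b *flatBytes) updateOffsets(n int) {
	last := b.offsets[b.maxSetIdx+1]
	for j := b.maxSetIdx + 2; j <= n; j++ {
		b.offsets[j] = last
	}
}

// readable reports whether element i can be sliced without a
// "slice bounds out of range" panic.
func (b *flatBytes) readable(i int) bool {
	return b.offsets[i] <= b.offsets[i+1]
}

func main() {
	b := newFlatBytes(4)
	b.set(0, []byte("ab"))

	sel := []int{0, 2} // selection vector: tuples 0 and 2 are selected
	n := len(sel)

	// Old behavior: only the first n slots are considered.
	b.updateOffsets(n)
	fmt.Println("old:", b.readable(sel[n-1]))

	// Fixed behavior: cover the largest selected index.
	b.updateOffsets(sel[n-1] + 1)
	fmt.Println("new:", b.readable(sel[n-1]))
}
```

In this model, the first call leaves element 2 with a start offset of 2 and
an end offset of 0, so slicing it would panic; the second call repairs the
end offset.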

I can neither easily come up with an example query that would trigger
this condition nor prove that it cannot occur, but I think we have seen
a single Sentry report that could be explained by this bug, so I think
it's worth backporting.

Release note (bug fix): Previously, CockroachDB could encounter an
internal error when executing queries with BYTES or STRING types via the
vectorized engine in rare circumstances, and now this is fixed.
yuzefovich committed Jan 21, 2021
1 parent 3d762ac commit e7a589b
Showing 1 changed file with 20 additions and 3 deletions.
23 changes: 20 additions & 3 deletions pkg/col/coldata/batch.go
@@ -208,9 +208,26 @@ func (m *MemBatch) SetSelection(b bool) {
 // SetLength implements the Batch interface.
 func (m *MemBatch) SetLength(n int) {
 	m.n = n
-	for _, v := range m.b {
-		if v != nil && v.Type() == coltypes.Bytes {
-			v.Bytes().UpdateOffsetsToBeNonDecreasing(n)
+	if n > 0 {
+		// In order to maintain the invariant of Bytes vectors we need to update
+		// offsets up to the element with the largest index that can be accessed
+		// by the batch.
+		maxIdx := n - 1
+		if m.useSel && m.sel[n-1] > maxIdx {
+			// Note that here we rely on the fact that selection vectors are
+			// increasing sequences.
+			//
+			// This assumption is only enforced by the invariantsChecker
+			// starting from 21.1 branches, so we have a "safe" conditional to
+			// not have a correctness regression, yet we deliberately do not
+			// want to iterate over the selection vector to find the largest
+			// index since that could be a performance regression.
+			maxIdx = m.sel[n-1]
+		}
+		for _, v := range m.b {
+			if v != nil && v.Type() == coltypes.Bytes {
+				v.Bytes().UpdateOffsetsToBeNonDecreasing(maxIdx + 1)
+			}
 		}
 	}
 }
