Skip to content

Commit

Permalink
opt: prevent duplicate scalar IDs when assigning placeholders
Browse files Browse the repository at this point in the history
Previously, two different scalar expressions could share the same scalar
ID when placeholders were assigned to a memo copied from a cached memo.
This was possible because the cached memo's `curID` was not copied to
the new memo. Instead it remained the default value of `0`. When new
scalar expressions were constructed while assigning placeholders,
`NextID()` could hand out an ID that already existed in the cached memo.

This behavior breaks the invariant that all scalar expressions have a
unique ID. Most catastrophically, this could cause
`DeduplicateSelectFilters` to discard filters that were not actually
duplicated, causing incorrect query results. Interestingly,
`ConsolidateFilters`, which was added long before
`DeduplicateSelectFilters`, also eliminates filter expressions with
matching scalar IDs, but we haven't yet seen an example of this causing
problems with cached memos.

This commit fixes the issue by simply copying the `curID` from the
cached memo to the new memo. This ensures that `NextID()` will not
return an ID that was already returned while building the cached memo.

I made several attempts to add test build checkers to uphold the
invariant that all scalar IDs are unique. Unfortunately, this is
difficult because there is no single code path that will catch all
violations.

Adding a check that visits all expressions in the memo after
optimization will not catch cases where a scalar expression with a
duplicate ID was removed or folded. A similar check made after the
optbuilder has built the canonical plan wouldn't work for the same
reason: normalization rules could have removed the expression from the
tree.

Adding a check while expressions are being constructed is not possible
because they don't have references to all other expressions.

Finally, having the memo keep track of all issued scalar IDs to detect
violations when `NextID()` is called is not ideal because it will be
ineffective if we forget to copy the tracking data structure from cached
memos to new memos.

Fixes #71002

Release note (bug fix): A bug has been fixed that caused the optimizer
to erroneously discard `WHERE` filters when executed prepared
statements, causing incorrect results to be returned. This bug was
present since version 21.1.9.
  • Loading branch information
mgartner authored and JuanLeon1 committed Oct 6, 2021
1 parent 5442262 commit a6daab1
Show file tree
Hide file tree
Showing 3 changed files with 32 additions and 0 deletions.
5 changes: 5 additions & 0 deletions pkg/sql/opt/memo/memo.go
Original file line number Diff line number Diff line change
Expand Up @@ -363,6 +363,11 @@ func (m *Memo) NextID() opt.ScalarID {
return m.curID
}

// CopyNextIDFrom copies the next ScalarID from the other memo.
func (m *Memo) CopyNextIDFrom(other *Memo) {
m.curID = other.curID
}

// RequestColStat calculates and returns the column statistic calculated on the
// relational expression.
func (m *Memo) RequestColStat(
Expand Down
5 changes: 5 additions & 0 deletions pkg/sql/opt/norm/factory.go
Original file line number Diff line number Diff line change
Expand Up @@ -215,6 +215,11 @@ func (f *Factory) CopyAndReplace(
panic(errors.AssertionFailedf("destination memo must be empty"))
}

// Copy the next scalar ID to the target memo so that new scalar expressions
// built with the new memo will not share scalar IDs with existing
// expressions.
f.mem.CopyNextIDFrom(from.Memo())

// Copy all metadata to the target memo so that referenced tables and
// columns can keep the same ids they had in the "from" memo. Scalar
// expressions in the metadata cannot have placeholders, so we simply copy
Expand Down
22 changes: 22 additions & 0 deletions pkg/sql/opt/norm/testdata/rules/select
Original file line number Diff line number Diff line change
Expand Up @@ -549,6 +549,28 @@ select
├── (u:1 = 10) OR (v:2 = 20) [outer=(1,2)]
└── (v:2 = 20) OR (u:1 = 10) [outer=(1,2)]

# Regression test for #71002. DeduplicateSelectFilters should not remove filters
# that are not duplicated.
exec-ddl
create table t71002 (
a BOOL,
b BOOL,
c BOOL
)
----

assign-placeholders-norm query-args=(false)
SELECT * FROM t71002 WHERE a AND (b OR ($1 OR c))
----
select
├── columns: a:1!null b:2 c:3
├── fd: ()-->(1)
├── scan t71002
│ └── columns: a:1 b:2 c:3
└── filters
├── a:1 [outer=(1), constraints=(/1: [/true - /true]; tight), fd=()-->(1)]
└── b:2 OR c:3 [outer=(2,3)]

# --------------------------------------------------
# MergeSelects
# --------------------------------------------------
Expand Down

0 comments on commit a6daab1

Please sign in to comment.