Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
opt: prevent duplicate scalar IDs when assigning placeholders
Previously, two different scalar expressions could share the same scalar ID when placeholders were assigned to a memo copied from a cached memo. This was possible because the cached memo's `curID` was not copied to the new memo. Instead it remained the default value of `0`. When new scalar expressions were constructed while assigning placeholders, `NextID()` could hand out an ID that already existed in the cached memo. This behavior breaks the invariant that all scalar expressions have a unique ID. Most catastrophically, this could cause `DeduplicateSelectFilters` to discard filters that were not actually duplicated, causing incorrect query results. Interestingly, `ConsolidateFilters`, which was added long before `DeduplicateSelectFilters`, also eliminates filter expressions with matching scalar IDs, but we haven't yet seen an example of this causing problems with cached memos. This commit fixes the issue by simply copying the `curID` from the cached memo to the new memo. This ensures that `NextID()` will not return an ID that was already returned while building the cached memo. I made several attempts to add test build checkers to uphold the invariant that all scalar IDs are unique. Unfortunately, this is difficult because there is no single code path that will catch all violations. Adding a check that visits all expressions in the memo after optimization will not catch cases where a scalar expression with a duplicate ID was removed or folded. A similar check made after the optbuilder has built the canonical plan wouldn't work for the same reason: normalization rules could have removed the expression from the tree. Adding a check while expressions are being constructed is not possible because they don't have references to all other expressions. Finally, having the memo keep track of all issued scalar IDs to detect violations when `NextID()` is called is not ideal because it will be ineffective if we forget to copy the tracking data structure from cached memos to new memos. Fixes #71002 Release note (bug fix): A bug has been fixed that caused the optimizer to erroneously discard `WHERE` filters when executed prepared statements, causing incorrect results to be returned. This bug was present since version 21.1.9.
- Loading branch information