sql: Planner drops WHERE condition if conditions are in a specific order #71002

lucacri · 2021-10-01T17:54:01Z

Describe the problem

I noticed that the order of simple WHERE conditions (field = X) on the same table changes the result of the query: depending on the order, one of the conditions is ignored.

The query in question (simplified) is:

-- "bad" query
SELECT
  id
FROM
  clients
WHERE
  team_id = $1:::INT8
  AND (
      EXISTS(
        SELECT
          *
        FROM
          visits
        WHERE
          clients.id = visits.client_id
          AND clients.team_id = visits.team_id
          AND (
              EXISTS(
                SELECT
                  *
                FROM
                  class_models
                WHERE
                  visits.class_model_id = class_models.id
                  AND visits.team_id = class_models.team_id
                  AND is_virtual = $9:::BOOL
                  AND event_type = $11:::INT4
              )
            )
      )
    );

-- 27 results

Notice the order of the WHERE conditions inside the class_models subquery (is_virtual and event_type). This query runs as if the is_virtual condition wasn't present at all, giving wrong data in return.

I then tried to swap the fields, like so:

-- good query
SELECT
  id
FROM
  clients
WHERE
  team_id = $1:::INT8
  AND (
      EXISTS(
        SELECT
          *
        FROM
          visits
        WHERE
          clients.id = visits.client_id
          AND clients.team_id = visits.team_id
          AND (
              EXISTS(
                SELECT
                  *
                FROM
                  class_models
                WHERE
                  visits.class_model_id = class_models.id
                  AND visits.team_id = class_models.team_id
                  AND event_type = $9:::INT4
                  AND is_virtual = $12:::BOOL
              )
            )
      )
    )

-- 9 results

Even checking the planner on the dashboard, I can see that the condition is not applied in the "bad" query.

To Reproduce

I have the diagnostic zip files for both queries.

Expected behavior
The order of the WHERE conditions shouldn't affect the final result, and the planner definitely shouldn't drop a condition.

Additional data / screenshots

Sent to Cockroach already

Environment:

CockroachDB version: 21.1.9
Server OS: GKE instances (containerOS)
Client app: any

lucacri · 2021-10-01T18:27:01Z

~~I tried rolling back to 21.1.7 and the issue is still present~~

An hour or so after the rollback to 21.1.7, I can happily report that the issue is NOT present on 21.1.7.

RaduBerinde · 2021-10-01T19:20:18Z

Thanks for the report! We have been able to reproduce and we are investigating.

Previously, two different scalar expressions could share the same scalar ID when placeholders were assigned to a memo copied from a cached memo. This was possible because the cached memo's `curID` was not copied to the new memo. Instead it remained the default value of `0`. When new scalar expressions were constructed while assigning placeholders, `NextID()` could hand out an ID that already existed in the cached memo. This behavior breaks the invariant that all scalar expressions have a unique ID.

Most catastrophically, this could cause `DeduplicateSelectFilters` to discard filters that were not actually duplicated, causing incorrect query results. Interestingly, `ConsolidateFilters`, which was added long before `DeduplicateSelectFilters`, also eliminates filter expressions with matching scalar IDs, but we haven't yet seen an example of this causing problems with cached memos.

This commit fixes the issue by simply copying the `curID` from the cached memo to the new memo. This ensures that `NextID()` will not return an ID that was already returned while building the cached memo.

I made several attempts to add test build checkers to uphold the invariant that all scalar IDs are unique. Unfortunately, this is difficult because there is no single code path that will catch all violations. Adding a check that visits all expressions in the memo after optimization will not catch cases where a scalar expression with a duplicate ID was removed or folded. A similar check made after optbuilder wouldn't work for the same reason: normalization rules could have removed the expression from the tree. Adding a check while expressions are being constructed is not possible because they don't have references to all other expressions. Finally, having the memo keep track of all issued scalar IDs to detect violations when `NextID()` is called is not ideal because it will be ineffective if we forget to copy the tracking data structure from cached memos to new memos.

Fixes cockroachdb#71002

Release note (bug fix): A bug has been fixed that caused the optimizer to erroneously discard `WHERE` filters when executing prepared statements, causing incorrect results to be returned. This bug was present since version 21.1.9.
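The failure mode described in this commit message can be illustrated with a toy model. The sketch below is not CockroachDB's code: the `memo` and `scalarExpr` types, `buildCachedMemo`, `assignPlaceholders`, and `dedupByID` are hypothetical stand-ins, and only the names `curID` and `NextID()` come from the commit message. It assumes a cached memo holding one filter without a placeholder and one with a placeholder, and shows how a fresh counter can hand out an ID that is already in use.

```go
package main

import "fmt"

// scalarExpr is a toy stand-in for a scalar expression (hypothetical type).
type scalarExpr struct {
	id             int
	text           string
	hasPlaceholder bool
}

// memo is a heavily simplified memo that only tracks the ID counter.
type memo struct {
	curID int
	exprs []*scalarExpr
}

// NextID hands out the next scalar ID from the memo's counter.
func (m *memo) NextID() int {
	m.curID++
	return m.curID
}

// buildCachedMemo builds the memo that would be cached for a prepared statement.
func buildCachedMemo() *memo {
	m := &memo{}
	team := &scalarExpr{id: m.NextID(), text: "visits.team_id = class_models.team_id"}
	virt := &scalarExpr{id: m.NextID(), text: "is_virtual = $1", hasPlaceholder: true}
	m.exprs = []*scalarExpr{team, virt}
	return m
}

// assignPlaceholders copies the cached memo and rebuilds expressions that
// contain placeholders. copyCurID toggles the fix described above.
func assignPlaceholders(cached *memo, copyCurID bool) []*scalarExpr {
	m := &memo{}
	if copyCurID {
		m.curID = cached.curID
	}
	for _, e := range cached.exprs {
		if e.hasPlaceholder {
			// A new expression is constructed, so it needs a new ID.
			e = &scalarExpr{id: m.NextID(), text: e.text + " [bound]"}
		}
		m.exprs = append(m.exprs, e)
	}
	return m.exprs
}

// dedupByID keeps only the first filter seen for each ID, treating equal IDs
// as proof that two filters are the same expression.
func dedupByID(filters []*scalarExpr) []*scalarExpr {
	seen := map[int]bool{}
	var out []*scalarExpr
	for _, f := range filters {
		if seen[f.id] {
			continue
		}
		seen[f.id] = true
		out = append(out, f)
	}
	return out
}

func main() {
	for _, copyCurID := range []bool{false, true} {
		filters := dedupByID(assignPlaceholders(buildCachedMemo(), copyCurID))
		fmt.Printf("copyCurID=%v: %d filter(s) survive\n", copyCurID, len(filters))
		for _, f := range filters {
			fmt.Printf("  id=%d %s\n", f.id, f.text)
		}
	}
}
```

In this toy, the run without the counter copy gives the rebuilt `is_virtual` filter the same ID as the untouched `team_id` filter, so the ID-based deduplication drops it. With the counter copied, both filters survive.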

Previously, every scalar expression (except lists and list items) had an ID that was said to be unique within the context of a memo. These IDs were originally added as a way to canonically order filters. Being named "IDs", their use later expanded to check for equality of two scalar expressions.

Maintaining this uniqueness invariant is difficult in practice and has dangerous implications when it is violated, as seen in cockroachdb#71002. While two different scalar expressions with the same ID could certainly cause problems for sorting filters, using these IDs to check for scalar expression equality can be catastrophic. For example, a filter expression that shares an ID with another expression could be completely removed from the filter.

Unfortunately, there's no obvious way to add test build assertions that scalar IDs are in fact unique, as explained in cockroachdb#71035.

In order to lessen the blast radius of breaking this invariant, this commit renames "scalar ID" to "scalar rank". The comment for this attribute does not explicitly guarantee its uniqueness. This renaming should urge contributors to only use this value for ordering scalar expressions canonically, not for scalar expression equality. Instead, pointer equality should be used to check if two scalar expressions are the same.

Release note: None
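The distinction this commit draws, rank for ordering and pointer equality for sameness, can be sketched as follows. This is an illustration only: `scalarExpr`, `dedupByPointer`, and `sortByRank` are hypothetical names, not the optimizer's API, and the sketch assumes that identical expressions are represented by the same pointer.

```go
package main

import (
	"fmt"
	"sort"
)

// scalarExpr is a toy expression with a rank that is used only for ordering.
type scalarExpr struct {
	rank int
	text string
}

// dedupByPointer removes an entry only when it is the very same expression
// object as an earlier entry. A colliding rank never causes a removal.
func dedupByPointer(filters []*scalarExpr) []*scalarExpr {
	seen := make(map[*scalarExpr]struct{}, len(filters))
	out := make([]*scalarExpr, 0, len(filters))
	for _, f := range filters {
		if _, ok := seen[f]; ok {
			continue
		}
		seen[f] = struct{}{}
		out = append(out, f)
	}
	return out
}

// sortByRank orders filters canonically. Equal ranks are tolerated: the
// stable sort leaves tied entries in their existing relative order.
func sortByRank(filters []*scalarExpr) {
	sort.SliceStable(filters, func(i, j int) bool {
		return filters[i].rank < filters[j].rank
	})
}

func main() {
	a := &scalarExpr{rank: 1, text: "is_virtual = true"}
	b := &scalarExpr{rank: 1, text: "event_type = 3"} // rank collides with a
	c := &scalarExpr{rank: 0, text: "visits.team_id = class_models.team_id"}

	// a appears twice as the same object; b merely shares a's rank.
	filters := dedupByPointer([]*scalarExpr{b, a, a, c})
	sortByRank(filters)
	for _, f := range filters {
		fmt.Printf("rank=%d %s\n", f.rank, f.text)
	}
}
```

Under these assumptions a rank collision can at worst affect ordering among tied filters; it cannot remove a filter, which is the smaller blast radius the commit message aims for.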

70890: sql: Enable telemetry query_sampling by default r=knz a=logston

This commit changes the default for the "sql.telemetry.query_sampling.enabled" setting to true. The CC team would like this setting enabled by default as turning it on per tenant cluster during cluster creation and before any SQL is processed is inefficient in a number of ways.

Release note: None

Closes #70775

71035: opt: prevent duplicate scalar IDs when assigning placeholders r=mgartner a=mgartner

Fixes #71002

71057: util/tracing: remove DeprecatedInternalStructured r=andreimatei a=andreimatei

This recording field was necessary for compatibility with 21.1. Now that 21.2 is out, we no longer need it, and it costs allocations.

Release note: None

71096: importccl: fix bazel failure r=adityamaru a=otan

Release note: None

71112: dev: add a few more build target alises r=rail a=rickystewart

I imagine people might find these useful. :)

Release note: None

Co-authored-by: Paul Logston
Co-authored-by: Marcus Gartner
Co-authored-by: Andrei Matei
Co-authored-by: Oliver Tan
Co-authored-by: Ricky Stewart

mgartner · 2021-10-05T13:37:15Z

@lucacri thanks again for the detailed report. We have a fix for the issue that should be included in the next 21.1 release, 21.1.10.

lucacri · 2021-10-05T16:09:23Z

That was fast :) thank you so much, @mgartner, @ajwerner and the whole team!

71037: opt: rename ScalarID to ScalarRank r=mgartner a=mgartner

Release note: None

71056: util/tracing: make some span options in singletons r=andreimatei a=andreimatei

A couple of span creation options are empty structs implementing the SpanOption interface. Being an empty struct, putting it in an interface doesn't allocate as the compiler optimizes small types in interfaces. Still, the output of `gcflags=-m` lists the value as escaping to the heap, very confusingly. This patch introduces singletons for the structs to make it clear that there's no allocation.

Release note: None

71108: server,*: untangle the Tracer from the Settings r=andreimatei a=andreimatei

See individual commits.

Co-authored-by: Marcus Gartner
Co-authored-by: Andrei Matei

Previously, we did not have statements with placeholders in TLP queries. This could have caught a correctness bug (cockroachdb#71002). In this PR, support for prepared queries is added. Fixes cockroachdb#71216. Release note: None
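For readers unfamiliar with TLP (ternary logic partitioning): the idea is that for any predicate p, the rows where p is true, where p is false, and where p is NULL together make up the whole table. The sketch below is a toy in-memory model of that check, not sqlsmith's implementation. The `row` type, `countWhere`, `tlpHolds`, and the `dropFilter` switch that simulates a planner ignoring a filter are all hypothetical.

```go
package main

import "fmt"

// row is a toy class_models row; isVirtual is nil to model SQL NULL.
type row struct {
	eventType int
	isVirtual *bool
}

// tri models SQL three-valued logic.
type tri int

const (
	triFalse tri = iota
	triTrue
	triNull
)

// isVirtualIsTrue evaluates the predicate `is_virtual = true` for one row.
func isVirtualIsTrue(r row) tri {
	switch {
	case r.isVirtual == nil:
		return triNull
	case *r.isVirtual:
		return triTrue
	default:
		return triFalse
	}
}

// countWhere counts rows for which pred evaluates to want. When dropFilter
// is set, the predicate is ignored and every row is counted, which
// simulates a planner that discards the filter.
func countWhere(rows []row, pred func(row) tri, want tri, dropFilter bool) int {
	n := 0
	for _, r := range rows {
		if dropFilter || pred(r) == want {
			n++
		}
	}
	return n
}

// tlpHolds checks count(*) == count(p) + count(NOT p) + count(p IS NULL).
// Only the `p` partition is run through the possibly buggy path.
func tlpHolds(rows []row, pred func(row) tri, dropFilter bool) bool {
	parts := countWhere(rows, pred, triTrue, dropFilter) +
		countWhere(rows, pred, triFalse, false) +
		countWhere(rows, pred, triNull, false)
	return parts == len(rows)
}

func boolPtr(v bool) *bool { return &v }

func main() {
	rows := []row{
		{eventType: 3, isVirtual: boolPtr(true)},
		{eventType: 3, isVirtual: boolPtr(false)},
		{eventType: 3, isVirtual: nil},
		{eventType: 1, isVirtual: boolPtr(true)},
	}
	fmt.Println("filter applied:", tlpHolds(rows, isVirtualIsTrue, false))
	fmt.Println("filter dropped:", tlpHolds(rows, isVirtualIsTrue, true))
}
```

In this model the partition counts stop adding up as soon as one partition's filter is ignored. For the check to exercise the code path from this issue, the partition queries would also have to run as prepared statements with placeholders.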

71323: sqlsmith, tests, tree: add prepared queries to tlp r=nehageorge a=nehageorge

Previously, we did not have statements with placeholders in TLP queries. This could have caught a correctness bug (#71002). In this PR, support for prepared queries is added. Fixes #71216.

Release note: None

Co-authored-by: Neha George

lucacri added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Oct 1, 2021

blathers-crl bot added O-community Originated from the community X-blathers-triaged blathers was able to find an owner labels Oct 1, 2021

ajwerner changed the title ~~Planner drops WHERE condition if conditions are in a specific order~~ sql: Planner drops WHERE condition if conditions are in a specific order Oct 1, 2021

blathers-crl bot added the T-sql-queries SQL Queries Team label Oct 1, 2021

cockroachdb deleted a comment from blathers-crl bot Oct 1, 2021

mgartner mentioned this issue Oct 2, 2021

opt: prevent duplicate scalar IDs when assigning placeholders #71035

Merged

mgartner mentioned this issue Oct 2, 2021

opt: rename ScalarID to ScalarRank #71037

Merged

craig bot closed this as completed in bb4100e Oct 5, 2021

blathers-crl bot mentioned this issue Oct 5, 2021

release-21.1: opt: prevent duplicate scalar IDs when assigning placeholders #71116

Merged

blathers-crl bot mentioned this issue Oct 5, 2021

release-21.2: opt: prevent duplicate scalar IDs when assigning placeholders #71118

Merged

mgartner mentioned this issue Oct 6, 2021

tlp: add prepared statements #71216

Closed

nehageorge mentioned this issue Oct 8, 2021

sqlsmith, tests, tree: add prepared queries to tlp #71323

Merged

This was referenced Oct 13, 2021

logictest: run queries as prepared statements #71526

Open

tlp: investigate whether TLP with prepared statements can catch #71002 #71530

Closed

mgartner added this to SQL Queries Jul 24, 2023

mgartner moved this to Done in SQL Queries Jul 24, 2023

rytaft added C-technical-advisory Caused a technical advisory branch-release-21.1 Used to mark GA and release blockers, technical advisories, and bugs for 21.1 labels Dec 6, 2023
