-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: Planner drops WHERE condition if conditions are in a specific order #71002
Labels
branch-release-21.1
Used to mark GA and release blockers, technical advisories, and bugs for 21.1
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
C-technical-advisory
Caused a technical advisory
O-community
Originated from the community
T-sql-queries
SQL Queries Team
X-blathers-triaged
blathers was able to find an owner
Comments
lucacri
added
the
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
label
Oct 1, 2021
blathers-crl
bot
added
O-community
Originated from the community
X-blathers-triaged
blathers was able to find an owner
labels
Oct 1, 2021
ajwerner
changed the title
Planner drops WHERE condition if conditions are in a specific order
sql: Planner drops WHERE condition if conditions are in a specific order
Oct 1, 2021
An hour or so after the rollback to 21.1.7, I can happily report that the issue is NOT present on 21.1.7. |
Thanks for the report! We have been able to reproduce and we are investigating. |
mgartner
added a commit
to mgartner/cockroach
that referenced
this issue
Oct 2, 2021
Previously, two different scalar expressions could share the same scalar ID when placeholders were assigned to a memo copied from a cached memo. This was possible because the cached memo's `curID` was not copied to the new memo. Instead it remained the default value of `0`. When new scalar expressions were constructed while assigning placeholders, `NextID()` could hand out an ID that already existed in the cached memo. This behavior breaks the invariant that all scalar expressions have a unique ID. Most catastrophically, this could cause `DeduplicateSelectFilters` to discard filters that were not actually duplicated, causing incorrect query results. Interestingly, `ConsolidateFilters`, which was added long before `DeduplicateSelectFilters`, also eliminates filter expressions with matching scalar IDs, but we haven't yet seen an example of this causing problems with cached memos. This commit fixes the issue by simply copying the `curID` from the cached memo to the new memo. This ensures that `NextID()` will not return an ID that was already returned while building the cached memo. I made several attempts to add test build checkers to uphold the invariant that all scalar IDs are unique. Unfortunately, this is difficult because there is no single code path that will catch all violations. Adding a check that visits all expressions in the memo after optimization will not catch cases where a scalar expression with a duplicate ID was removed or folded. A similar check made after optbuilder wouldn't work for the same reason: normalization rules could have removed the expression from the tree. Adding a check while expressions are being constructed is not possible because they don't have references to all other expressions. Finally, having the memo keep track of all issued scalar IDs to detect violations when `NextID()` is called is not ideal because it will be ineffective if we forget to copy the tracking data structure from cached memos to new memos. Fixes cockroachdb#71002 Release note (bug fix): A bug has been fixed that caused the optimizer to erroneously discard `WHERE` filters when executed prepared statements, causing incorrect results to be returned. This bug was present since version 21.1.9.
mgartner
added a commit
to mgartner/cockroach
that referenced
this issue
Oct 2, 2021
Previously, every scalar expression (except lists and list items) had an ID that was said to be unique within the context of a memo. These IDs were originally added as a way to canonically order filters. Being named "IDs", their use later expanded to check for equality of two scalar expressions. Maintaining this uniqueness invariant is difficult in practice and has dangerous implications when it is violated, as seen in cockroachdb#71002. While two different scalar expressions with the same ID could certainly cause problems for sorting filters, using these IDs to check for scalar expression equality can be catastrophic. For example, a filter expression that shares an ID with another expression could be completely removed from the filter. Unfortunately, there's no obvious way to add test build assertions that scalar IDs are in fact unique, as explained in cockroachdb#71035. In order to lessen the blast radius of breaking this invariant, this commit renames "scalar ID" to "scalar rank". The comment for this attribute does not explicitly guarantee its uniqueness. This renaming should urge contributors to only use this value for ordering scalar expressions canonically, not for scalar expression equality. Instead, pointer equality should be used to check if two scalar expressions are the same. Release note: None
craig bot
pushed a commit
that referenced
this issue
Oct 4, 2021
70890: sql: Enable telemetry query_sampling by default r=knz a=logston This commit changes the default for the "sql.telemetry.query_sampling.enabled" setting to true. The CC team would like this setting enabled by default as turning it on per tenant cluster during cluster creation and before any SQL is processed is inefficient in a number of ways. Release note: None Closes #70775 71035: opt: prevent duplicate scalar IDs when assigning placeholders r=mgartner a=mgartner Previously, two different scalar expressions could share the same scalar ID when placeholders were assigned to a memo copied from a cached memo. This was possible because the cached memo's `curID` was not copied to the new memo. Instead it remained the default value of `0`. When new scalar expressions were constructed while assigning placeholders, `NextID()` could hand out an ID that already existed in the cached memo. This behavior breaks the invariant that all scalar expressions have a unique ID. Most catastrophically, this could cause `DeduplicateSelectFilters` to discard filters that were not actually duplicated, causing incorrect query results. Interestingly, `ConsolidateFilters`, which was added long before `DeduplicateSelectFilters`, also eliminates filter expressions with matching scalar IDs, but we haven't yet seen an example of this causing problems with cached memos. This commit fixes the issue by simply copying the `curID` from the cached memo to the new memo. This ensures that `NextID()` will not return an ID that was already returned while building the cached memo. I made several attempts to add test build checkers to uphold the invariant that all scalar IDs are unique. Unfortunately, this is difficult because there is no single code path that will catch all violations. Adding a check that visits all expressions in the memo after optimization will not catch cases where a scalar expression with a duplicate ID was removed or folded. A similar check made after the optbuilder has built the canonical plan wouldn't work for the same reason: normalization rules could have removed the expression from the tree. Adding a check while expressions are being constructed is not possible because they don't have references to all other expressions. Finally, having the memo keep track of all issued scalar IDs to detect violations when `NextID()` is called is not ideal because it will be ineffective if we forget to copy the tracking data structure from cached memos to new memos. Fixes #71002 Release note (bug fix): A bug has been fixed that caused the optimizer to erroneously discard `WHERE` filters when executed prepared statements, causing incorrect results to be returned. This bug was present since version 21.1.9. 71057: util/tracing: remove DeprecatedInternalStructured r=andreimatei a=andreimatei This recording field was necessary for compatibility with 21.1. Now that 21.2 is out, we no longer need it, and it costs allocations. Release note: None 71096: importccl: fix bazel failure r=adityamaru a=otan Release note: None 71112: dev: add a few more build target alises r=rail a=rickystewart I imagine people might find these useful. :) Release note: None Co-authored-by: Paul Logston <[email protected]> Co-authored-by: Marcus Gartner <[email protected]> Co-authored-by: Andrei Matei <[email protected]> Co-authored-by: Oliver Tan <[email protected]> Co-authored-by: Ricky Stewart <[email protected]>
blathers-crl bot
pushed a commit
that referenced
this issue
Oct 5, 2021
Previously, two different scalar expressions could share the same scalar ID when placeholders were assigned to a memo copied from a cached memo. This was possible because the cached memo's `curID` was not copied to the new memo. Instead it remained the default value of `0`. When new scalar expressions were constructed while assigning placeholders, `NextID()` could hand out an ID that already existed in the cached memo. This behavior breaks the invariant that all scalar expressions have a unique ID. Most catastrophically, this could cause `DeduplicateSelectFilters` to discard filters that were not actually duplicated, causing incorrect query results. Interestingly, `ConsolidateFilters`, which was added long before `DeduplicateSelectFilters`, also eliminates filter expressions with matching scalar IDs, but we haven't yet seen an example of this causing problems with cached memos. This commit fixes the issue by simply copying the `curID` from the cached memo to the new memo. This ensures that `NextID()` will not return an ID that was already returned while building the cached memo. I made several attempts to add test build checkers to uphold the invariant that all scalar IDs are unique. Unfortunately, this is difficult because there is no single code path that will catch all violations. Adding a check that visits all expressions in the memo after optimization will not catch cases where a scalar expression with a duplicate ID was removed or folded. A similar check made after the optbuilder has built the canonical plan wouldn't work for the same reason: normalization rules could have removed the expression from the tree. Adding a check while expressions are being constructed is not possible because they don't have references to all other expressions. Finally, having the memo keep track of all issued scalar IDs to detect violations when `NextID()` is called is not ideal because it will be ineffective if we forget to copy the tracking data structure from cached memos to new memos. Fixes #71002 Release note (bug fix): A bug has been fixed that caused the optimizer to erroneously discard `WHERE` filters when executed prepared statements, causing incorrect results to be returned. This bug was present since version 21.1.9.
blathers-crl bot
pushed a commit
that referenced
this issue
Oct 5, 2021
Previously, two different scalar expressions could share the same scalar ID when placeholders were assigned to a memo copied from a cached memo. This was possible because the cached memo's `curID` was not copied to the new memo. Instead it remained the default value of `0`. When new scalar expressions were constructed while assigning placeholders, `NextID()` could hand out an ID that already existed in the cached memo. This behavior breaks the invariant that all scalar expressions have a unique ID. Most catastrophically, this could cause `DeduplicateSelectFilters` to discard filters that were not actually duplicated, causing incorrect query results. Interestingly, `ConsolidateFilters`, which was added long before `DeduplicateSelectFilters`, also eliminates filter expressions with matching scalar IDs, but we haven't yet seen an example of this causing problems with cached memos. This commit fixes the issue by simply copying the `curID` from the cached memo to the new memo. This ensures that `NextID()` will not return an ID that was already returned while building the cached memo. I made several attempts to add test build checkers to uphold the invariant that all scalar IDs are unique. Unfortunately, this is difficult because there is no single code path that will catch all violations. Adding a check that visits all expressions in the memo after optimization will not catch cases where a scalar expression with a duplicate ID was removed or folded. A similar check made after the optbuilder has built the canonical plan wouldn't work for the same reason: normalization rules could have removed the expression from the tree. Adding a check while expressions are being constructed is not possible because they don't have references to all other expressions. Finally, having the memo keep track of all issued scalar IDs to detect violations when `NextID()` is called is not ideal because it will be ineffective if we forget to copy the tracking data structure from cached memos to new memos. Fixes #71002 Release note (bug fix): A bug has been fixed that caused the optimizer to erroneously discard `WHERE` filters when executed prepared statements, causing incorrect results to be returned. This bug was present since version 21.1.9.
@lucacri thanks again for the detailed report. We have a fix for the issue that should be included in the next 21.1 release, 21.1.10. |
JuanLeon1
pushed a commit
that referenced
this issue
Oct 6, 2021
Previously, two different scalar expressions could share the same scalar ID when placeholders were assigned to a memo copied from a cached memo. This was possible because the cached memo's `curID` was not copied to the new memo. Instead it remained the default value of `0`. When new scalar expressions were constructed while assigning placeholders, `NextID()` could hand out an ID that already existed in the cached memo. This behavior breaks the invariant that all scalar expressions have a unique ID. Most catastrophically, this could cause `DeduplicateSelectFilters` to discard filters that were not actually duplicated, causing incorrect query results. Interestingly, `ConsolidateFilters`, which was added long before `DeduplicateSelectFilters`, also eliminates filter expressions with matching scalar IDs, but we haven't yet seen an example of this causing problems with cached memos. This commit fixes the issue by simply copying the `curID` from the cached memo to the new memo. This ensures that `NextID()` will not return an ID that was already returned while building the cached memo. I made several attempts to add test build checkers to uphold the invariant that all scalar IDs are unique. Unfortunately, this is difficult because there is no single code path that will catch all violations. Adding a check that visits all expressions in the memo after optimization will not catch cases where a scalar expression with a duplicate ID was removed or folded. A similar check made after the optbuilder has built the canonical plan wouldn't work for the same reason: normalization rules could have removed the expression from the tree. Adding a check while expressions are being constructed is not possible because they don't have references to all other expressions. Finally, having the memo keep track of all issued scalar IDs to detect violations when `NextID()` is called is not ideal because it will be ineffective if we forget to copy the tracking data structure from cached memos to new memos. Fixes #71002 Release note (bug fix): A bug has been fixed that caused the optimizer to erroneously discard `WHERE` filters when executed prepared statements, causing incorrect results to be returned. This bug was present since version 21.1.9.
mgartner
added a commit
to mgartner/cockroach
that referenced
this issue
Oct 6, 2021
Previously, every scalar expression (except lists and list items) had an ID that was said to be unique within the context of a memo. These IDs were originally added as a way to canonically order filters. Being named "IDs", their use later expanded to check for equality of two scalar expressions. Maintaining this uniqueness invariant is difficult in practice and has dangerous implications when it is violated, as seen in cockroachdb#71002. While two different scalar expressions with the same ID could certainly cause problems for sorting filters, using these IDs to check for scalar expression equality can be catastrophic. For example, a filter expression that shares an ID with another expression could be completely removed from the filter. Unfortunately, there's no obvious way to add test build assertions that scalar IDs are in fact unique, as explained in cockroachdb#71035. In order to lessen the blast radius of breaking this invariant, this commit renames "scalar ID" to "scalar rank". The comment for this attribute does not explicitly guarantee its uniqueness. This renaming should urge contributors to only use this value for ordering scalar expressions canonically, not for scalar expression equality. Instead, pointer equality should be used to check if two scalar expressions are the same. Release note: None
andy-kimball
pushed a commit
that referenced
this issue
Oct 6, 2021
Previously, two different scalar expressions could share the same scalar ID when placeholders were assigned to a memo copied from a cached memo. This was possible because the cached memo's `curID` was not copied to the new memo. Instead it remained the default value of `0`. When new scalar expressions were constructed while assigning placeholders, `NextID()` could hand out an ID that already existed in the cached memo. This behavior breaks the invariant that all scalar expressions have a unique ID. Most catastrophically, this could cause `DeduplicateSelectFilters` to discard filters that were not actually duplicated, causing incorrect query results. Interestingly, `ConsolidateFilters`, which was added long before `DeduplicateSelectFilters`, also eliminates filter expressions with matching scalar IDs, but we haven't yet seen an example of this causing problems with cached memos. This commit fixes the issue by simply copying the `curID` from the cached memo to the new memo. This ensures that `NextID()` will not return an ID that was already returned while building the cached memo. I made several attempts to add test build checkers to uphold the invariant that all scalar IDs are unique. Unfortunately, this is difficult because there is no single code path that will catch all violations. Adding a check that visits all expressions in the memo after optimization will not catch cases where a scalar expression with a duplicate ID was removed or folded. A similar check made after the optbuilder has built the canonical plan wouldn't work for the same reason: normalization rules could have removed the expression from the tree. Adding a check while expressions are being constructed is not possible because they don't have references to all other expressions. Finally, having the memo keep track of all issued scalar IDs to detect violations when `NextID()` is called is not ideal because it will be ineffective if we forget to copy the tracking data structure from cached memos to new memos. Fixes #71002 Release note (bug fix): A bug has been fixed that caused the optimizer to erroneously discard `WHERE` filters when executed prepared statements, causing incorrect results to be returned. This bug was present since version 21.1.9.
craig bot
pushed a commit
that referenced
this issue
Oct 7, 2021
71037: opt: rename ScalarID to ScalarRank r=mgartner a=mgartner Previously, every scalar expression (except lists and list items) had an ID that was said to be unique within the context of a memo. These IDs were originally added as a way to canonically order filters. Being named "IDs", their use later expanded to check for equality of two scalar expressions. Maintaining this uniqueness invariant is difficult in practice and has dangerous implications when it is violated, as seen in #71002. While two different scalar expressions with the same ID could certainly cause problems for sorting filters, using these IDs to check for scalar expression equality can be catastrophic. For example, a filter expression that shares an ID with another expression could be completely removed from the filter. Unfortunately, there's no obvious way to add test build assertions that scalar IDs are in fact unique, as explained in #71035. In order to lessen the blast radius of breaking this invariant, this commit renames "scalar ID" to "scalar rank". The comment for this attribute does not explicitly guarantee its uniqueness. This renaming should urge contributors to only use this value for ordering scalar expressions canonically, not for scalar expression equality. Instead, pointer equality should be used to check if two scalar expressions are the same. Release note: None 71056: util/tracing: make some span options in singletons r=andreimatei a=andreimatei A couple of span creation options are empty structs implementing the SpanOption interface. Being an empty struct, putting it in an interface doesn't allocate as the compiler optimizes small types in interfaces. Still, the output of `gcflags=-m` lists the value as escaping to the heap, very confusingly. This patch introduces singletons for the structs to make it clear that there's no allocation. Release note: None 71108: server,*: untangle the Tracer from the Settings r=andreimatei a=andreimatei See individual commits. Co-authored-by: Marcus Gartner <[email protected]> Co-authored-by: Andrei Matei <[email protected]>
nehageorge
pushed a commit
to nehageorge/cockroach
that referenced
this issue
Oct 8, 2021
Previously, we did not have statements with placeholders in TLP queries. This could have caught a correctness bug (cockroachdb#71002). In this PR, support for prepared queries is added. Fixes cockroachdb#71216. Release note: None
nehageorge
pushed a commit
to nehageorge/cockroach
that referenced
this issue
Oct 8, 2021
Previously, we did not have statements with placeholders in TLP queries. This could have caught a correctness bug (cockroachdb#71002). In this PR, support for prepared queries is added. Fixes cockroachdb#71216. Release note: None
nehageorge
pushed a commit
to nehageorge/cockroach
that referenced
this issue
Oct 12, 2021
Previously, we did not have statements with placeholders in TLP queries. This could have caught a correctness bug (cockroachdb#71002). In this PR, support for prepared queries is added. Fixes cockroachdb#71216. Release note: None
andy-kimball
pushed a commit
that referenced
this issue
Oct 12, 2021
Previously, two different scalar expressions could share the same scalar ID when placeholders were assigned to a memo copied from a cached memo. This was possible because the cached memo's `curID` was not copied to the new memo. Instead it remained the default value of `0`. When new scalar expressions were constructed while assigning placeholders, `NextID()` could hand out an ID that already existed in the cached memo. This behavior breaks the invariant that all scalar expressions have a unique ID. Most catastrophically, this could cause `DeduplicateSelectFilters` to discard filters that were not actually duplicated, causing incorrect query results. Interestingly, `ConsolidateFilters`, which was added long before `DeduplicateSelectFilters`, also eliminates filter expressions with matching scalar IDs, but we haven't yet seen an example of this causing problems with cached memos. This commit fixes the issue by simply copying the `curID` from the cached memo to the new memo. This ensures that `NextID()` will not return an ID that was already returned while building the cached memo. I made several attempts to add test build checkers to uphold the invariant that all scalar IDs are unique. Unfortunately, this is difficult because there is no single code path that will catch all violations. Adding a check that visits all expressions in the memo after optimization will not catch cases where a scalar expression with a duplicate ID was removed or folded. A similar check made after the optbuilder has built the canonical plan wouldn't work for the same reason: normalization rules could have removed the expression from the tree. Adding a check while expressions are being constructed is not possible because they don't have references to all other expressions. Finally, having the memo keep track of all issued scalar IDs to detect violations when `NextID()` is called is not ideal because it will be ineffective if we forget to copy the tracking data structure from cached memos to new memos. Fixes #71002 Release note (bug fix): A bug has been fixed that caused the optimizer to erroneously discard `WHERE` filters when executed prepared statements, causing incorrect results to be returned. This bug was present since version 21.1.9.
This was referenced Oct 13, 2021
craig bot
pushed a commit
that referenced
this issue
Oct 15, 2021
71323: sqlsmith, tests, tree: add prepared queries to tlp r=nehageorge a=nehageorge Previously, we did not have statements with placeholders in TLP queries. This could have caught a correctness bug (#71002). In this PR, support for prepared queries is added. Fixes #71216. Release note: None Co-authored-by: Neha George <[email protected]>
ericharmeling
pushed a commit
to ericharmeling/cockroach
that referenced
this issue
Oct 20, 2021
Previously, every scalar expression (except lists and list items) had an ID that was said to be unique within the context of a memo. These IDs were originally added as a way to canonically order filters. Being named "IDs", their use later expanded to check for equality of two scalar expressions. Maintaining this uniqueness invariant is difficult in practice and has dangerous implications when it is violated, as seen in cockroachdb#71002. While two different scalar expressions with the same ID could certainly cause problems for sorting filters, using these IDs to check for scalar expression equality can be catastrophic. For example, a filter expression that shares an ID with another expression could be completely removed from the filter. Unfortunately, there's no obvious way to add test build assertions that scalar IDs are in fact unique, as explained in cockroachdb#71035. In order to lessen the blast radius of breaking this invariant, this commit renames "scalar ID" to "scalar rank". The comment for this attribute does not explicitly guarantee its uniqueness. This renaming should urge contributors to only use this value for ordering scalar expressions canonically, not for scalar expression equality. Instead, pointer equality should be used to check if two scalar expressions are the same. Release note: None
rytaft
added
C-technical-advisory
Caused a technical advisory
branch-release-21.1
Used to mark GA and release blockers, technical advisories, and bugs for 21.1
labels
Dec 6, 2023
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
branch-release-21.1
Used to mark GA and release blockers, technical advisories, and bugs for 21.1
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
C-technical-advisory
Caused a technical advisory
O-community
Originated from the community
T-sql-queries
SQL Queries Team
X-blathers-triaged
blathers was able to find an owner
Describe the problem
I noticed that the order of simple WHERE conditions (
field = X
) on the same table changes the result of the query, resulting in the field condition being ignored.The query in question (simplified) is:
Notice the order of the WHERE conditions inside the
class_models
subquery (is_virtual
andevent_type
). This query runs as if theis_virtual
condition wasn't present at all, giving wrong data in return.I then tried to swap the fields, like so:
Even checking the planner on the dashboard, I can see that the condition is not applied in the "bad" query.
To Reproduce
I have the diagnostic zip files for both queries.
Expected behavior
The order of the where clauses shouldn't matter in the final result, and definitely the planner shouldn't drop one condition
Additional data / screenshots
Sent to Cockroach already
Environment:
The text was updated successfully, but these errors were encountered: