Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SimplifyExpressions should not consider volatile expressions equal for rewrites #13060

Closed
eejbyfeldt opened this issue Oct 22, 2024 · 1 comment · Fixed by #13128
Closed

SimplifyExpressions should not consider volatile expressions equal for rewrites #13060

eejbyfeldt opened this issue Oct 22, 2024 · 1 comment · Fixed by #13128
Assignees
Labels
bug Something isn't working

Comments

@eejbyfeldt
Copy link
Contributor

eejbyfeldt commented Oct 22, 2024

Describe the bug

Currently we do not consider the volatility of expressions in SimplifyExpressions. This leads us to doing rewrites that might change the results and lead to unexpected behavior.

To Reproduce

Consider the following query:

> explain select * from VALUES (1), (2) where random() = 0 OR (column1 = 2 AND random() = 0);
+---------------+---------------------------------------------+
| plan_type     | plan                                        |
+---------------+---------------------------------------------+
| logical_plan  | Filter: random() = Float64(0)               |
|               |   Values: (Int64(1)), (Int64(2))            |
| physical_plan | CoalesceBatchesExec: target_batch_size=8192 |
|               |   FilterExec: random() = 0                  |
|               |     ValuesExec                              |
|               |                                             |
+---------------+---------------------------------------------+
2 row(s) fetched. 
Elapsed 0.013 seconds.

The predicate get simplified into random() = 0

Expected behavior

The predicate should not be simplified so we deduplicat the volatile expressions.

> explain select * from VALUES (1), (2) where random() = 0 OR (column1 = 2 AND random() = 0);
+---------------+----------------------------------------------------------------------------------+
| plan_type     | plan                                                                             |
+---------------+----------------------------------------------------------------------------------+
| logical_plan  | Filter: random() = Float64(0)  OR column1 = Int64(2) AND  random() = Float64(0)  |
|               |   Values: (Int64(1)), (Int64(2))                                                 |
| physical_plan | CoalesceBatchesExec: target_batch_size=8192                                      |
|               |   FilterExec: random() = 0                                                       |
|               |     ValuesExec                                                                   |
|               |                                                                                  |
+---------------+----------------------------------------------------------------------------------+
2 row(s) fetched. 
Elapsed 0.013 seconds.

Additional context

We can not exclude volatile expressions outright from simplification as we would still like the simplify for example following predicate

> explain select * from VALUES (1), (2) where column1 = 2 OR (column1 = 2 AND random() = 0);
+---------------+---------------------------------------------+
| plan_type     | plan                                        |
+---------------+---------------------------------------------+
| logical_plan  | Filter: column1 = Int64(2)                  |
|               |   Values: (Int64(1)), (Int64(2))            |
| physical_plan | CoalesceBatchesExec: target_batch_size=8192 |
|               |   FilterExec: column1@0 = 2                 |
|               |     ValuesExec                              |
|               |                                             |
+---------------+---------------------------------------------+
2 row(s) fetched. 
Elapsed 0.015 seconds.

As it does not change the result.

@eejbyfeldt eejbyfeldt added the bug Something isn't working label Oct 22, 2024
@eejbyfeldt eejbyfeldt changed the title SimplifyExpressions should consider not consider volatile expressions equal for rewrites SimplifyExpressions should not consider volatile expressions equal for rewrites Oct 23, 2024
@Lordworms
Copy link
Contributor

take

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants