fix: set null_equals_null to false when convert_cross_join_to_inner_join #11738

Merged
merged 1 commit into from
Jul 31, 2024


jonahgao
Member

Which issue does this PR close?

Closes #11704.

Rationale for this change

Given an example query:

CREATE TABLE t1(a INT) AS VALUES(NULL);
CREATE TABLE t2(a INT) AS VALUES(NULL);
SELECT * FROM
  (SELECT * FROM t1 CROSS JOIN t2)
WHERE t1.a == t2.a
  AND t1.a + t2.a IS NULL;

The push_down_filter rule will convert the cross join into an inner join.

| initial_logical_plan                | Projection: t1.a, t2.a                                       |
|                                     |   Filter: t1.a = t2.a AND t1.a + t2.a IS NULL                |
|                                     |     Projection: t1.a, t2.a                                   |
|                                     |       CrossJoin:                                             |
|                                     |         TableScan: t1                                        |
|                                     |         TableScan: t2                                        |
| logical_plan after push_down_filter | Projection: t1.a, t2.a                                       |
|                                     |   Projection: t1.a, t2.a                                     |
|                                     |     Inner Join:  Filter: t2.a = t1.a AND t1.a + t2.a IS NULL |
|                                     |       TableScan: t1                                          |
|                                     |       TableScan: t2                                          |
| extract_equijoin_predicate          | Inner Join: t1.a = t2.a Filter: t1.a + t2.a IS NULL          |
|                                     |   TableScan: t1 projection=[a]                               |
|                                     |   TableScan: t2 projection=[a]                               |

null_equals_null is only meaningful for equijoins. When the corresponding equality predicate is evaluated as a filter over the cross join, it follows standard SQL semantics: NULL = NULL does not evaluate to true, so rows with NULL on both sides are dropped. To convert the cross join into an equivalent equijoin, we should therefore set null_equals_null to false.
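
To make the difference concrete, here is a minimal sketch in plain Rust that models the example query on in-memory columns. It does not use any DataFusion API. The helper names (`sql_eq`, `keys_match`, `cross_join_then_filter`, `inner_join`) are hypothetical and exist only for illustration, and NULL is modelled as `None`.

```rust
// Illustrative model only: these helpers are hypothetical and are not
// part of DataFusion. A nullable INT column is a slice of Option<i32>.
type Row = (Option<i32>, Option<i32>);

/// SQL three-valued equality: if either side is NULL the result is NULL (None).
fn sql_eq(l: Option<i32>, r: Option<i32>) -> Option<bool> {
    match (l, r) {
        (Some(x), Some(y)) => Some(x == y),
        _ => None,
    }
}

/// Join-key comparison as an equijoin might perform it. The flag decides
/// whether two NULL keys count as a match.
fn keys_match(l: Option<i32>, r: Option<i32>, null_equals_null: bool) -> bool {
    match (l, r) {
        (Some(x), Some(y)) => x == y,
        (None, None) => null_equals_null,
        _ => false,
    }
}

/// Residual predicate `l + r IS NULL`: the sum is NULL exactly when
/// at least one operand is NULL.
fn sum_is_null(l: Option<i32>, r: Option<i32>) -> bool {
    l.is_none() || r.is_none()
}

/// Original plan shape: cross join, then
/// `WHERE t1.a = t2.a AND t1.a + t2.a IS NULL`.
/// A WHERE clause keeps a row only if the predicate is TRUE, not NULL.
fn cross_join_then_filter(t1: &[Option<i32>], t2: &[Option<i32>]) -> Vec<Row> {
    let mut out = Vec::new();
    for &l in t1 {
        for &r in t2 {
            if sql_eq(l, r) == Some(true) && sum_is_null(l, r) {
                out.push((l, r));
            }
        }
    }
    out
}

/// Rewritten plan shape: inner join on `t1.a = t2.a` with the residual
/// filter `t1.a + t2.a IS NULL`.
fn inner_join(t1: &[Option<i32>], t2: &[Option<i32>], null_equals_null: bool) -> Vec<Row> {
    let mut out = Vec::new();
    for &l in t1 {
        for &r in t2 {
            if keys_match(l, r, null_equals_null) && sum_is_null(l, r) {
                out.push((l, r));
            }
        }
    }
    out
}
```

With `t1 = [None]` and `t2 = [None]`, as in the example tables, `cross_join_then_filter` returns no rows. `inner_join` with `null_equals_null = true` returns the single `(None, None)` row, while `null_equals_null = false` returns no rows and so matches the original plan.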

What changes are included in this PR?

Are these changes tested?

Yes

Are there any user-facing changes?

No

(SELECT * FROM t1 CROSS JOIN t2)
WHERE t1.a == t2.a
AND t1.a + t2.a IS NULL;
Member Author

The result of this query on the main branch is incorrect.

@github-actions github-actions bot added optimizer Optimizer rules sqllogictest SQL Logic Tests (.slt) labels Jul 31, 2024
Contributor

@alamb alamb left a comment

Thank you @jonahgao (and again @2010YOUY01 for the wonderful SQLancer testing)

@alamb alamb merged commit 2887491 into apache:main Jul 31, 2024
24 checks passed
@jonahgao jonahgao deleted the null_equals_null branch August 1, 2024 00:08
Development

Successfully merging this pull request may close these issues.

Incorrect result returned by a NLJ query with filter (SQLancer-NoREC)
2 participants