-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve push down filter of join #13184
Improve push down filter of join #13184
Conversation
Thanks everyone |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this PR has correctness issues see: #13211
99 NULL NULL NULL | ||
|
||
query ITIT | ||
SELECT * FROM t1 LEFT JOIN t2 ON t1.t1_id = t2.t2_id AND t2.t2_name = 'z' ORDER BY t1_id, t1_name, t2_id, t2_name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the predicates here are "to simple" and a lot of the outer joins here will be converted to Inner
joins by eliminate_outer_join rule and therefore these tests do not provide the desired coverage.
let columns = predicate.column_refs(); | ||
macro_rules! restrict_null { | ||
() => {{ | ||
let predicate_cloned = predicate.clone(); | ||
let cols = columns.iter().cloned(); | ||
is_restrict_null_predicate(predicate_cloned, cols).unwrap_or(false) | ||
}}; | ||
} | ||
|
||
if checker.left_only(&columns) && (left_preserved || restrict_null!()) { | ||
left_push.push(predicate); | ||
} else if right_preserved && checker.is_right_only(&predicate) { | ||
} else if checker.right_only(&columns) && (right_preserved || restrict_null!()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems incorrect. Why would a predicate being restrict null allow us to push it? It filtering it out null seems to be a argument against why it can be pushed down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is wrong. I neglected to modify JoinType here. Thank you @eejbyfeldt for your help in finding this problem.
This reverts commit 7ae1ccb.
For anyone following along, this PR appears to have had some correctness issues so @eejbyfeldt reverted it #13229 @JasonLi-cn are you willing to create a new PR that we can work through the correctness issues |
I think for this pushdown to be correct, the join type can be changed. I think this can be done in two phases:
|
Just adding to it, I think supporting this would be relatively simple. Currently eliminate_outer_join only supports columns that are non nullable, we can support expressions that filter out any nulls like most binary operators.
|
Hm actually, the transformation is already supported by |
I tracked this here #13232 |
I'm sorry that this PR has introduced bugs. I will continue to follow up this issue, and then please continue to help review. 🙏 |
Which issue does this PR close?
Closes #.
Rationale for this change
In the current version, the push down filter of Join is not very complete. For example:
Expect
BTW: If
eliminate_outer_join
can convert Left Join to Inner Join in this query, the filterabs(t2.c1) > Int32(5)
can also be pushed down. So the next plan is to improveeliminate_outer_join
rule. It's a little difficult.What changes are included in this PR?
Modify the
push_down_all_join
functionin
push_down_filter` rule.Are these changes tested?
yes
Are there any user-facing changes?