-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: join swap for projected semi/anti joins #13022
Conversation
cc @berkaysynnada @my-vegetable-has-exploded Perhaps you could review this PR, due to still being in context after #12967? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense to me & looks good. It would be good to get another review
case::left(JoinType::Left, vec![1], true), | ||
case::right(JoinType::Right, vec![1], true), | ||
case::full(JoinType::Full, vec![1], true), | ||
case::left_anti(JoinType::LeftAnti, vec![0], false), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it will be better to add a swap test for LeftAnti & LeftSemi?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean a separate one? Initially I was going to do so, but realized that that it'll result in 3 almost equivalent tests ("normal" joins / left semi-anti / right semi-anti) except for minor differences, so decided that it could be better to parameterize test code to highlight these differences and reduce code duplication.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since small_on_right is false for LeftAnti & LeftSemi, the swap rule don't work for those tests. But it doesn't matters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you @korowa, nice catch
I see you've already did what #13022 (comment) mentions, but perhaps I couldn't understand the point, so not merging this yet
}) | ||
.collect() | ||
}) | ||
match join_type { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm thanks @korowa
Which issue does this PR close?
Closes #10978.
Rationale for this change
On join swap embedded projection for anti/semi joins is getting modified just like for any other join types. This leads to resulting embedded projection, containing column indices out of bounds of join output schema.
What changes are included in this PR?
The fix is not to modify output projection for semi/anti joins -- they already have one-side schema as an output, and trying to add/subtract the number of fields from the other input side to column indices will break the projection.
Are these changes tested?
Existing test for swapping join with embedded projection is extended to test all join types.
Are there any user-facing changes?
Swapping inputs for semi/anti joins with embedded projection should start working.