[BUG] Left Semi Join much slower than Inner join #10464
Comments
I have created a PR with a fix: #10511
I'm more interested to know why |
@jrhemstad I did some further investigation and determined that the reason
I have updated my PR to reflect these changes: #10511
This issue has been labeled |
Closes #10464

Updates the `left_semi_join` to materialize the gather mask instead of generating it via a transform iterator. Including the `map.contains` in the `gather` call reduced occupancy by increasing register usage. As a result, explicitly materializing the gather mask is faster.

Authors:
- Xavier Simmons (https://github.com/cheinger)
- Yunsong Wang (https://github.com/PointKernel)

Approvers:
- Jake Hemstad (https://github.com/jrhemstad)
- Yunsong Wang (https://github.com/PointKernel)

URL: #10511
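For readers unfamiliar with the two strategies named in the PR description, here is a minimal host-side C++ sketch of the difference. It is not the cuDF implementation: `std::unordered_set` stands in for the GPU hash map, and the function names `semi_join_fused` and `semi_join_materialized` are hypothetical. On a CPU the two variants perform similarly; the register-pressure and occupancy effect described above is specific to the GPU kernel.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Variant A (hypothetical name): the membership test runs inside the
// selection loop, loosely analogous to passing a transform iterator
// that calls `map.contains` straight into the gather.
std::vector<std::size_t> semi_join_fused(const std::vector<int>& left,
                                         const std::unordered_set<int>& right_keys)
{
  std::vector<std::size_t> out;
  for (std::size_t i = 0; i < left.size(); ++i) {
    if (right_keys.count(left[i]) != 0) { out.push_back(i); }
  }
  return out;
}

// Variant B (hypothetical name): first materialize a 0/1 mask with one
// entry per left row, then compact the row indices whose flag is set.
std::vector<std::size_t> semi_join_materialized(const std::vector<int>& left,
                                                const std::unordered_set<int>& right_keys)
{
  // Pass 1: hash lookups only, results stored in the mask.
  std::vector<std::uint8_t> mask(left.size());
  for (std::size_t i = 0; i < left.size(); ++i) {
    mask[i] = right_keys.count(left[i]) != 0 ? 1 : 0;
  }

  // Pass 2: selection only, reading the precomputed flags.
  std::vector<std::size_t> out;
  for (std::size_t i = 0; i < left.size(); ++i) {
    if (mask[i] != 0) { out.push_back(i); }
  }
  return out;
}
```

Both variants return the same row indices. The second trades an extra buffer of one byte per left row for a simpler selection step, which is the trade-off the PR describes as being faster on the GPU.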
Describe the bug
I’ve noticed that Left Semi Join can be an order of magnitude slower than Inner Join. Below you can see the 10x slowdown when we switch out Inner Join for Left Semi Join (same data and table sizes).
Inner Join
Left Semi Join
I did some further investigation and found that the cause of this slowdown is the `thrust::copy_if` in `left_semi_anti_join` that copies the selected row indices that are found in the hash table. I switched this out for `cub::DeviceSelect::Flagged` and observed an 18x performance improvement, as seen below:
Before
After
Expected behavior
I would expect Left Semi Join and Inner Join to be somewhat comparable in performance, and if anything, Inner Join to be slower because we have to handle duplicate keys.
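To make the expectation concrete, here is a small host-side C++ sketch of the two join semantics. It is only an illustration under stated assumptions, not how cuDF implements either join: the function names `inner_join` and `left_semi_join` and the standard-library containers are stand-ins chosen for this example. An inner join emits one index pair per matching right row, so duplicate keys multiply the output, while a left semi join emits each left row at most once.

```cpp
#include <cstddef>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

using index_pair = std::pair<std::size_t, std::size_t>;

// Inner join: every (left row, right row) pair with equal keys is emitted,
// so the hash table must keep all duplicates of each right key.
std::vector<index_pair> inner_join(const std::vector<int>& left,
                                   const std::vector<int>& right)
{
  std::unordered_multimap<int, std::size_t> table;
  for (std::size_t j = 0; j < right.size(); ++j) { table.emplace(right[j], j); }

  std::vector<index_pair> out;
  for (std::size_t i = 0; i < left.size(); ++i) {
    auto range = table.equal_range(left[i]);
    for (auto it = range.first; it != range.second; ++it) {
      out.emplace_back(i, it->second);
    }
  }
  return out;
}

// Left semi join: only asks whether a match exists, so a set of distinct
// right keys is enough and each left row appears at most once.
std::vector<std::size_t> left_semi_join(const std::vector<int>& left,
                                        const std::vector<int>& right)
{
  std::unordered_set<int> keys(right.begin(), right.end());

  std::vector<std::size_t> out;
  for (std::size_t i = 0; i < left.size(); ++i) {
    if (keys.count(left[i]) != 0) { out.push_back(i); }
  }
  return out;
}
```

With left keys `{1, 2, 3}` and right keys `{2, 2, 3, 4}`, the inner join produces three pairs (the left `2` matches twice) while the semi join produces two left row indices, which is why the semi join would be expected to do no more work than the inner join.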
Environment details
env_details.log