[gpuCI] Forward-merge branch-22.04 to branch-22.06 [skip gpuci] #10546

Merged
merged 1 commit into branch-22.06 from branch-22.04
Mar 30, 2022

Conversation

GPUtester
Collaborator

Forward-merge triggered by push to branch-22.04 that creates a PR to keep branch-22.06 up-to-date. If this PR is unable to be immediately merged due to conflicts, it will remain open for the team to manually merge.

…seen in NDS q72 in Spark (#10534)

The following change addresses a performance degradation we noticed in the `mixed_join` and `compute_mixed_join_output_size` kernels. The degradation looks to be tied to the theoretical occupancy of these kernels, which is limited by the number of registers they use.

The regression is triggered by #9727, which improves handling of unreachable code paths. For reasons we have not pinned down, that patch also alters the number of registers these kernels need. Both `mixed_join` and `compute_mixed_join_output_size` are very sensitive to register count, per Nsight Compute. With the patch, the registers required went from 92 to 102 and from 118 to 141, respectively.
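For anyone who wants to check figures like these without opening a profiler, the CUDA runtime can report a compiled kernel's per-thread register count and its theoretical occupancy. The following is a minimal sketch, not code from this change: `demo_kernel`, `kernel_report`, and `inspect_demo_kernel` are hypothetical names, the kernel is a trivial stand-in for the join kernels, and the block size passed in is an assumption of the caller.

```cuda
#include <cuda_runtime.h>

// Hypothetical stand-in kernel; the real join kernels are not reproduced here.
__global__ void demo_kernel(float const* in, float* out, int n)
{
  int const i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) { out[i] = in[i] * 2.0f; }
}

// Hypothetical result type for this sketch.
struct kernel_report {
  int registers_per_thread;      // as reported by the CUDA runtime
  double theoretical_occupancy;  // resident threads / max threads per SM
};

// Queries register usage and theoretical occupancy of demo_kernel for a
// given block size. Returns {-1, -1.0} if any runtime call fails.
kernel_report inspect_demo_kernel(int block_size)
{
  kernel_report const failure{-1, -1.0};

  cudaFuncAttributes attrs{};
  if (cudaFuncGetAttributes(&attrs, demo_kernel) != cudaSuccess) { return failure; }

  // How many blocks of this size can be resident on one SM at once,
  // with no dynamic shared memory.
  int blocks_per_sm = 0;
  if (cudaOccupancyMaxActiveBlocksPerMultiprocessor(
        &blocks_per_sm, demo_kernel, block_size, 0) != cudaSuccess) {
    return failure;
  }

  int device = 0;
  cudaDeviceProp props{};
  if (cudaGetDevice(&device) != cudaSuccess ||
      cudaGetDeviceProperties(&props, device) != cudaSuccess) {
    return failure;
  }

  double const occupancy = static_cast<double>(blocks_per_sm) * block_size /
                           props.maxThreadsPerMultiProcessor;
  return {attrs.numRegs, occupancy};
}
```

Calling `inspect_demo_kernel(128)` before and after a compiler-affecting change gives a quick way to see whether the register count moved and whether that cost resident blocks. The values depend on the GPU and toolkit version, so they should be compared on the same machine.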

The fix here tells the compiler what our block size is (128 threads). In our testing, this allows the compiler to reduce the number of registers required to 128 for `compute_mixed_join_output_size` and 96 for `mixed_join`. This led to better occupancy (I think @nvdbaranec measured it going from 30% to 50%), and I saw the wall clock time of q72 (which started all this) go from 133s to 121s, which is within the ballpark I'd expect.
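The description does not spell out how the block size is communicated to the compiler. The standard CUDA mechanism for this is the `__launch_bounds__` qualifier, so the sketch below assumes that is what is meant. It is an illustration only: `scale_kernel`, `scale_on_device`, and `kBlockSize` are hypothetical names, the kernel is a trivial stand-in for the join kernels, and error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Assumed block size for this sketch, matching the 128 threads quoted above.
constexpr int kBlockSize = 128;

// __launch_bounds__(kBlockSize) promises the compiler that this kernel is
// never launched with more than kBlockSize threads per block, which it can
// take into account when deciding how many registers to use.
__global__ void __launch_bounds__(kBlockSize)
scale_kernel(float const* in, float* out, int n, float factor)
{
  int const i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) { out[i] = in[i] * factor; }
}

// Hypothetical host wrapper: copies data to the device, runs the kernel
// with exactly kBlockSize threads per block, and copies the result back.
std::vector<float> scale_on_device(std::vector<float> const& host_in, float factor)
{
  int const n = static_cast<int>(host_in.size());
  std::vector<float> host_out(host_in.size());
  if (n == 0) { return host_out; }

  float* d_in  = nullptr;
  float* d_out = nullptr;
  cudaMalloc(&d_in, n * sizeof(float));
  cudaMalloc(&d_out, n * sizeof(float));
  cudaMemcpy(d_in, host_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);

  int const grid = (n + kBlockSize - 1) / kBlockSize;
  scale_kernel<<<grid, kBlockSize>>>(d_in, d_out, n, factor);

  // This blocking copy also waits for the kernel to finish.
  cudaMemcpy(host_out.data(), d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
  cudaFree(d_in);
  cudaFree(d_out);
  return host_out;
}
```

The launch configuration has to respect the promise: launching a kernel with more threads per block than its launch bound fails. Sharing one constant between the qualifier and the launch, as done here, keeps the two from drifting apart.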

Authors:
   - Alessandro Bellina (https://github.com/abellina)

Approvers:
   - Mike Wilson (https://github.com/hyperbolic2346)
@GPUtester GPUtester requested a review from a team as a code owner March 30, 2022 20:40
@GPUtester GPUtester merged commit 2d8d913 into branch-22.06 Mar 30, 2022
@GPUtester
Collaborator Author

SUCCESS - forward-merge complete.

@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Mar 30, 2022
Labels
libcudf Affects libcudf (C++/CUDA) code.
2 participants