Fix has_atomic_support check in can_use_hash_groupby() #10588

Merged
rapids-bot merged 1 commit into rapidsai:branch-22.06 on Apr 5, 2022

Conversation

jbrennan333
Contributor

Closes #10583.

Change the has_atomic_support check in can_use_hash_groupby() to check the target type for the aggregation instead of the source type.
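The idea can be illustrated with a small self-contained sketch. This is not the libcudf code: the enums, the `request` struct, the two `can_use_hash_by_*` helpers, and the toy `has_atomic_support` and `target_type` rules below are all hypothetical stand-ins chosen for illustration. In particular, the sketch assumes that only numeric types support atomic updates and that a count aggregation produces a 32-bit integer regardless of its input type.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-ins for illustration only; not the libcudf definitions.
enum class type_id { INT32, INT64, FLOAT64, STRING };
enum class agg_kind { SUM, MIN, COUNT_VALID };

// Assumption: in this toy model every type except STRING can be updated atomically.
constexpr bool has_atomic_support(type_id t) { return t != type_id::STRING; }

// Result type of an aggregation in this toy model.
// Assumptions: COUNT_VALID always yields INT32, SUM widens INT32 to INT64,
// and MIN keeps the source type.
constexpr type_id target_type(type_id source, agg_kind kind)
{
  switch (kind) {
    case agg_kind::COUNT_VALID: return type_id::INT32;
    case agg_kind::SUM: return source == type_id::INT32 ? type_id::INT64 : source;
    case agg_kind::MIN: return source;
  }
  return source;
}

// A values column type plus the aggregations requested on it.
struct request {
  type_id values_type;
  std::vector<agg_kind> aggregations;
};

// Check against the source type: rejects any request on a STRING column,
// even when the result of the aggregation is numeric.
bool can_use_hash_by_source(request const& r) { return has_atomic_support(r.values_type); }

// Check against the target type of each aggregation: a count on a STRING
// column passes because its result is an integer.
bool can_use_hash_by_target(request const& r)
{
  return std::all_of(r.aggregations.begin(), r.aggregations.end(), [&r](agg_kind k) {
    return has_atomic_support(target_type(r.values_type, k));
  });
}

// Compile-time checks of the toy rules used above.
static_assert(!has_atomic_support(type_id::STRING), "source type of a string column");
static_assert(has_atomic_support(target_type(type_id::STRING, agg_kind::COUNT_VALID)),
              "target type of a count on a string column");
```

Under these assumptions, a request for `COUNT_VALID` on a `STRING` column fails the source-type check but passes the target-type check, which matches the case reported in #10583. A `MIN` on a `STRING` column still fails both, since its result is itself a string.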

See discussion in #10583.

I have verified that this fixes the performance regression in our customer queries, and all unit tests still pass.

@jbrennan333 requested a review from a team as a code owner April 5, 2022 01:50
@github-actions bot added the libcudf (Affects libcudf (C++/CUDA) code.) label Apr 5, 2022
@jbrennan333 self-assigned this Apr 5, 2022
@jbrennan333 added the bug (Something isn't working), 4 - Needs Review (Waiting for reviewer to review or respond), Performance (Performance related issue), Spark (Functionality that helps Spark RAPIDS), and non-breaking (Non-breaking change) labels Apr 5, 2022
@codecov

codecov bot commented Apr 5, 2022

Codecov Report

Merging #10588 (423c216) into branch-22.06 (291fbcf) will decrease coverage by 0.00%.
The diff coverage is 96.07%.

@@               Coverage Diff                @@
##           branch-22.06   #10588      +/-   ##
================================================
- Coverage         86.31%   86.31%   -0.01%     
================================================
  Files               140      140              
  Lines             22300    22255      -45     
================================================
- Hits              19249    19210      -39     
+ Misses             3051     3045       -6     
Impacted Files Coverage Δ
python/cudf/cudf/core/column/numerical.py 96.17% <ø> (ø)
python/cudf/cudf/testing/testing.py 81.69% <ø> (-2.82%) ⬇️
python/cudf/cudf/core/index.py 92.31% <90.90%> (+0.05%) ⬆️
python/cudf/cudf/core/single_column_frame.py 96.45% <93.75%> (-0.40%) ⬇️
python/cudf/cudf/core/column/categorical.py 89.77% <100.00%> (ø)
python/cudf/cudf/core/column/column.py 89.34% <100.00%> (+0.03%) ⬆️
python/cudf/cudf/core/column/datetime.py 89.71% <100.00%> (+0.39%) ⬆️
python/cudf/cudf/core/column/numerical_base.py 98.90% <100.00%> (+0.01%) ⬆️
python/cudf/cudf/core/column/string.py 89.10% <100.00%> (+0.12%) ⬆️
python/cudf/cudf/core/column/struct.py 96.42% <100.00%> (-0.09%) ⬇️
... and 10 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Contributor

@jrhemstad jrhemstad left a comment


Nice, I hadn't even thought of this as a solution. Very clean.

@PointKernel
Member

@gpucibot merge

@rapids-bot rapids-bot bot merged commit faff5de into rapidsai:branch-22.06 Apr 5, 2022
@jbrennan333
Contributor Author

Thanks @jrhemstad, @PointKernel, and @ttnghia for the reviews, and @jlowe for help debugging this problem.

Development

Successfully merging this pull request may close these issues.

[BUG] groupby aggregation falls back to sort aggregation for count on a string