Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Aggregation Type Promotion: Ensure Unsigned Input Types Result in Unsigned Output for Sum and Multiply #14679

Merged
merged 5 commits into from
Jan 23, 2024

Conversation

SurajAralihalli
Copy link
Contributor

@SurajAralihalli SurajAralihalli commented Dec 28, 2023

Description

During aggregation, output types are modified to prevent overflow. Presently, summing INT32 yields INT64, but summing UINT32 still results in INT64 instead of UINT64. This pull request resolves #10149 to ensure the correct output type is used when summing or multiplying integers.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

Signed-off-by: Suraj Aralihalli <[email protected]>
Signed-off-by: Suraj Aralihalli <[email protected]>
Copy link

copy-pr-bot bot commented Dec 28, 2023

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Dec 28, 2023
@karthikeyann karthikeyann added bug Something isn't working non-breaking Non-breaking change labels Jan 2, 2024
@karthikeyann
Copy link
Contributor

@SurajAralihalli The groupby tests already tests this. Since the tests use cudf::detail::target_type_t, this was not caught.
Suggestion: Add static assert with std::is_unsigned_v<Source> in

TYPED_TEST(groupby_sum_test, basic)

Perhaps, adding a separate test for type checking to cover all cases, could be considered too.

@karthikeyann
Copy link
Contributor

/ok to test

Signed-off-by: Suraj Aralihalli <[email protected]>
@SurajAralihalli
Copy link
Contributor Author

@SurajAralihalli The groupby tests already tests this. Since the tests use cudf::detail::target_type_t, this was not caught. Suggestion: Add static assert with std::is_unsigned_v<Source> in

TYPED_TEST(groupby_sum_test, basic)

Perhaps, adding a separate test for type checking to cover all cases, could be considered too.

Thanks @karthikeyann, the groupby_sum_test doesn't include any unsigned types. I have added uint16_t and uint64_t in the supported_types to test.

@karthikeyann
Copy link
Contributor

/ok to test

@SurajAralihalli SurajAralihalli marked this pull request as ready for review January 4, 2024 16:22
@SurajAralihalli SurajAralihalli requested a review from a team as a code owner January 4, 2024 16:22
@karthikeyann
Copy link
Contributor

/ok to test

@karthikeyann
Copy link
Contributor

karthikeyann commented Jan 10, 2024

@SurajAralihalli
Copy link
Contributor Author

https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword
Recommend using keyword like closes or fixes or resolves in description

Thanks @karthikeyann, I'll keep that in mind!

@ttnghia
Copy link
Contributor

ttnghia commented Jan 17, 2024

/ok to test

@nvdbaranec
Copy link
Contributor

I don't see any specific handling here for int32/uint32. Is that handled in another way?

@SurajAralihalli
Copy link
Contributor Author

SurajAralihalli commented Jan 19, 2024

I don't see any specific handling here for int32/uint32. Is that handled in another way?

For computing target_type_t? The target type is always upscaled to int64/uint64 for integers of any type.

@karthikeyann
Copy link
Contributor

/merge

@rapids-bot rapids-bot bot merged commit a39897c into rapidsai:branch-24.02 Jan 23, 2024
73 checks passed
abellina added a commit to abellina/cudf that referenced this pull request Jan 26, 2024
…esult in Unsigned Output for Sum and Multiply (rapidsai#14679)"

This reverts commit a39897c.
raydouglass pushed a commit that referenced this pull request Jan 29, 2024
This pull request reverses the modifications made to the sum/product aggregation target type, ensuring it always produces int64. The changes implemented by  PR [14679](#14679) which led to degraded performance when the aggregation column had an unsigned type, are reverted. Additional details can be found in the issue [14886](#14886).

Authors:
   - Suraj Aralihalli (https://github.com/SurajAralihalli)

Approvers:
   - David Wendt (https://github.com/davidwendt)
   - Nghia Truong (https://github.com/ttnghia)
   - Karthikeyan (https://github.com/karthikeyann)
abellina added a commit to abellina/cudf that referenced this pull request Jan 31, 2024
…esult in Unsigned Output for Sum and Multiply (rapidsai#14679)"

This reverts commit a39897c.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants