[BUG] Sum and multiply aggregations promote unsigned input types to a signed output #10149
Comments
From #10102 (comment): I think the machinery in question was added before unsigned support and then was just never updated. It should be updated to use uint64 for unsigned integer types. The code at cudf/cpp/include/cudf/detail/aggregation/aggregation.hpp, lines 1140 to 1147 in 1246116,
could be updated to:
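The suggested code itself is not reproduced here. As a minimal sketch of the idea only, a type trait that widens unsigned integers to `uint64_t` and signed integers to `int64_t` might look like the following. The names `sum_accumulator` and `sum_accumulator_t` are hypothetical and are not taken from cudf; the actual trait in `aggregation.hpp` may be structured differently.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical helper (not the cudf trait): selects the output type used
// when summing or multiplying values of input type T.
template <typename T, typename = void>
struct sum_accumulator {
  using type = T;  // non-integral types such as float or double are unchanged
};

template <typename T>
struct sum_accumulator<T, std::enable_if_t<std::is_integral_v<T>>> {
  // Unsigned inputs widen to uint64_t; signed inputs widen to int64_t.
  // Note: bool counts as an unsigned integral type here; real code may
  // want to treat it separately.
  using type =
    std::conditional_t<std::is_unsigned_v<T>, std::uint64_t, std::int64_t>;
};

template <typename T>
using sum_accumulator_t = typename sum_accumulator<T>::type;

// Compile-time checks of the promotion rule described in the issue.
static_assert(std::is_same_v<sum_accumulator_t<std::int32_t>, std::int64_t>);
static_assert(std::is_same_v<sum_accumulator_t<std::uint32_t>, std::uint64_t>);
```

The only change relative to a "always promote integers to `int64_t`" rule is the `std::is_unsigned_v` branch, which is the behavior the comment asks for.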
… Unsigned Output for Sum and Multiply (#14679)
During aggregation, output types are modified to prevent overflow. Presently, summing INT32 yields INT64, but summing UINT32 still results in INT64 instead of UINT64. This pull request resolves Issue #10149 to ensure the correct output type is used when summing or multiplying integers.
Authors:
- Suraj Aralihalli (https://github.com/SurajAralihalli)
- Karthikeyan (https://github.com/karthikeyann)
- Nghia Truong (https://github.com/ttnghia)
Approvers:
- Nghia Truong (https://github.com/ttnghia)
- Shruti Shivakumar (https://github.com/shrshi)
- Karthikeyan (https://github.com/karthikeyann)
URL: #14679
The work in #14679 to address this issue had to be reverted in #14907 due to a performance regression reported in #14886. In addition to adding back the changes in #14679, we also need to:
I started an experiment in this direction before I re-read this issue and realized @SurajAralihalli was assigned here. With apologies to @SurajAralihalli, I think I have a good start on the atomics refactoring in #14962. I would like to get that PR merged first, because it should be a standalone improvement, and then we can revisit the changes that were originally reverted.
The revert can be undone after merging #14962. I tested a similar fix while debugging this issue with @SurajAralihalli.
Thanks @bdice for letting me know!
…operators to detail namespace. (#14962)
This PR does a thorough refactoring of `device_atomics.cuh`.
- I moved all atomic-related functions to `cudf::detail::` (making this an API-breaking change, but most likely a low-impact break).
- I added all missing operators for natively supported types to `atomicAdd`, `atomicMin`, `atomicMax`, etc. as discussed in #10149 and #14907.
  - This should prevent fallback to the `atomicCAS` path for types that are natively supported for those atomic operators, which we suspect as the root cause of the performance regression in #14886.
- I kept `atomicAdd` rather than `cudf::detail::atomic_add` in locations where a native CUDA overload exists, and the same for min/max/CAS operations. Aggregations are the only place where we use the special overloads. We were previously calling the native CUDA function rather than our special overloads in many cases, so I retained the previous behavior. This avoids including the additional headers that implement an unnecessary level of wrapping for natively supported overloads.
- I enabled native 2-byte CAS operations (on `unsigned short int`) that eliminate the do-while loop and extra alignment-checking logic.
  - The CUDA docs don't state this, but some forum posts claim this is only supported by compute capability 7.0+. We now have 7.0 as a lower bound for RAPIDS so I'm not concerned by this as long as builds/tests pass.
- I improved/cleaned the documentation and moved around some code so that the operators were in a logical order.
- I assessed the existing tests and it looks like all the types are being covered. I'm not sure if there is a good way to enforce that certain types (like `uint64_t`) are passing through native `atomicAdd` calls.
Authors:
- Bradley Dice (https://github.com/bdice)
Approvers:
- David Wendt (https://github.com/davidwendt)
- Suraj Aralihalli (https://github.com/SurajAralihalli)
URL: #14962
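To illustrate the difference between a native atomic add and a compare-and-swap fallback, here is a host-side analogy using `std::atomic`. It is not CUDA device code and is not taken from `device_atomics.cuh`; the function names `add_native` and `add_via_cas` are hypothetical. It only shows why a CAS-based emulation tends to do more work than a single native read-modify-write.

```cpp
#include <atomic>
#include <cstdint>

// Native path: one atomic read-modify-write instruction.
// Returns the value held before the add.
inline std::uint64_t add_native(std::atomic<std::uint64_t>& target,
                                std::uint64_t value) {
  return target.fetch_add(value, std::memory_order_relaxed);
}

// Fallback path: emulate the add with a compare-and-swap retry loop.
// Under contention the loop may run several times per update.
// Returns the value held before the add.
inline std::uint64_t add_via_cas(std::atomic<std::uint64_t>& target,
                                 std::uint64_t value) {
  std::uint64_t expected = target.load(std::memory_order_relaxed);
  while (!target.compare_exchange_weak(expected, expected + value,
                                       std::memory_order_relaxed)) {
    // On failure, `expected` is refreshed with the current value; retry.
  }
  return expected;
}
```

Both functions produce the same result; the difference is that the second may retry when other threads update the same location, which is the kind of overhead the PR description suspects behind the regression.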
Describe the bug
When performing aggregations, the output types are often upscaled to help combat overflow situations. For example, performing a sum aggregation on an INT32 column will produce an INT64 result. However, performing a sum aggregation on a UINT32 column produces an INT64 result rather than a UINT64 result.
Steps/Code to reproduce bug
Perform a sum aggregation with an input column of UINT32 and note that the result is INT64. Here's a snippet of a session doing this with the cudf Java API in the Spark REPL shell:
Expected behavior
Unsigned input types should be promoted to unsigned output types for any aggregations where the sign of the result cannot change for unsigned inputs (e.g., sum and multiply).
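A small host-side illustration of why the signedness of the output type matters: the same 64-bit sum reads as a large positive number when typed as `uint64_t` and as a negative number when typed as `int64_t`. This is a sketch in plain C++ and does not use cudf; `sum_unsigned` and `as_signed` are hypothetical helper names.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Sums unsigned values into an unsigned 64-bit accumulator.
// Unsigned arithmetic wraps modulo 2^64, so this is always well defined.
inline std::uint64_t sum_unsigned(const std::vector<std::uint64_t>& values) {
  std::uint64_t total = 0;
  for (std::uint64_t v : values) { total += v; }
  return total;
}

// Reinterprets the same 64 bits as a signed integer, mimicking what happens
// when an unsigned sum is labeled with a signed output type.
inline std::int64_t as_signed(std::uint64_t bits) {
  std::int64_t out;
  std::memcpy(&out, &bits, sizeof(out));
  return out;
}
```

For example, summing `{1ull << 63, 5}` with `sum_unsigned` gives 9223372036854775813, while `as_signed` applied to that result is negative, even though no unsigned input could make a true sum negative.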
Additional context
See @jrhemstad's comment at #10102 (comment)