Add missing atomic operators, refactor atomic operators, move atomic operators to detail namespace. #14962
I support retaining the previous behaviour. Otherwise, this introduces another templated header into the include dependencies of other files. The templated atomicAdd was created to support types that CUDA does not support natively, so these other files do not need it.
@SurajAralihalli FYI
@SurajAralihalli Would you like to share a review on this change?
/merge
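The trade-off discussed in this thread (call the native atomic where one exists, and fall back to a compare-and-swap loop only for types without one) can be sketched on the host with `std::atomic`. This is an analogy under stated assumptions, not the cuDF implementation: `has_native_add_v`, `cas_loop_add`, and `atomic_add_sketch` are hypothetical names, and the "native" check here simply reflects which types offer `fetch_add` in C++17 rather than which overloads CUDA provides.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical trait: in C++17, std::atomic<T>::fetch_add exists for
// integral types other than bool. This stands in for "a native overload exists".
template <typename T>
constexpr bool has_native_add_v =
  std::is_integral_v<T> && !std::is_same_v<T, bool>;

// Generic fallback: emulate an atomic add with a compare-and-swap loop.
// Returns the value stored before the addition.
template <typename T>
T cas_loop_add(std::atomic<T>& target, T value)
{
  T expected = target.load();
  // On failure, compare_exchange_weak reloads `expected`, so we simply retry.
  while (!target.compare_exchange_weak(expected, expected + value)) {}
  return expected;
}

// Compile-time dispatch: take the native path when available,
// otherwise use the CAS loop.
template <typename T>
T atomic_add_sketch(std::atomic<T>& target, T value)
{
  if constexpr (has_native_add_v<T>) {
    return target.fetch_add(value);  // single native read-modify-write
  } else {
    return cas_loop_add(target, value);  // retry loop
  }
}

// Small usage example covering both paths.
inline void example_usage()
{
  std::atomic<std::uint64_t> counter{40};
  atomic_add_sketch<std::uint64_t>(counter, 2);  // native path
  assert(counter.load() == 42);

  std::atomic<float> total{1.5f};
  atomic_add_sketch<float>(total, 2.0f);  // CAS-loop path in C++17
  assert(total.load() == 3.5f);
}
```

The point of the dispatch is that callers working with natively supported types never pay for the retry loop, which is the same concern the description raises about the `atomicCAS` path.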
Description
This PR does a thorough refactoring of `device_atomics.cuh`.

- Moves the atomic operators into the `cudf::detail::` namespace (making this an API-breaking change, but most likely a low-impact break).
- Adds the missing atomic operators corresponding to `atomicAdd`, `atomicMin`, `atomicMax`, etc., as discussed in "[BUG] Sum and multiply aggregations promote unsigned input types to a signed output" (#10149) and "Revert sum/product aggregation to always produce `int64_t` type" (#14907).
- Avoids the `atomicCAS` path for types that are natively supported for those atomic operators, which we suspect is the root cause of the performance regression in "[BUG] Performance regression in cuDF after #14679" (#14886).
- Keeps `atomicAdd` rather than `cudf::detail::atomic_add` in locations where a native CUDA overload exists, and the same for min/max/CAS operations. Aggregations are the only place where we use the special overloads. We were previously calling the native CUDA function rather than our special overloads in many cases, so I retained the previous behavior. This avoids including the additional headers that implement an unnecessary level of wrapping for natively supported overloads.
- […] (`unsigned short int`) that eliminate the do-while loop and extra alignment-checking logic.
- […] (`uint64_t`) are passing through native `atomicAdd` calls.

Checklist