[FEA] Hash function refactoring #13706
Labels
0 - Backlog
In queue waiting for assignment
feature request
New feature or request
libcudf
Affects libcudf (C++/CUDA) code.
Comments
bdice added the feature request (New feature or request) and Needs Triage (Need team to review and classify) labels on Jul 17, 2023.
This was referenced Jul 17, 2023
GregoryKimball added the 0 - Backlog (In queue waiting for assignment) and libcudf (Affects libcudf (C++/CUDA) code.) labels and removed the Needs Triage (Need team to review and classify) label on Jul 22, 2023.
@davidwendt proposed removing unsanitized nulls from hashing tests. I agree with this idea. I refactored MD5's tests in 0188115 and will do the same for SHA in PR #14391, but additional work is needed for other hashing algorithms to remove unsanitized nulls from the tests.

We also need to standardize null behavior for hashing algorithms. See #10451.
rapids-bot bot pushed a commit that referenced this issue on Feb 20, 2024:
The `cudf::hashing::spark_murmurhash3_x86_32()` function was moved to the Spark plugin since it had common code with the Spark implementation of `xxhash_64` (also implemented in the plugin). This change deprecates the API and the generic `cudf::hashing::hash()` function to be removed in a follow-on release.

Reference hash cleanup issue: #13706

Authors:
- David Wendt (https://github.com/davidwendt)

Approvers:
- Bradley Dice (https://github.com/bdice)
- Karthikeyan (https://github.com/karthikeyann)

URL: #15074
Following up from #13681 and #13612, there are some tasks I think can be done to clean up hashing code. I am opening this issue to be a tracker for the work we've deferred from other PRs.
- `getblock32` can be reused by multiple hash functions. https://github.com/rapidsai/cudf/pull/13612/files#r1265587501 Also consider optimizing this to use `memcpy` or `if constexpr` on key type perhaps.
- `rotate_bits_left` should not use CUDA intrinsics (it doesn't affect the PTX/SASS compared to a naive shift-based implementation), thereby making it `constexpr`-friendly and possible to put into a shared `hpp` header. See: Separate MurmurHash32 from hash_functions.cuh #13681 (comment)
- `@brief` of hashing APIs should mention the algorithm name. https://github.com/rapidsai/cudf/pull/13612/files#r1267270076
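
To make the first two tasks concrete, here is a minimal host-side sketch of what a shift-based `rotate_bits_left` and a `memcpy`-based `getblock32` could look like. This is not the libcudf code: the `hashing_sketch` namespace, the signatures, and the parameter types are assumptions chosen for illustration, and a real shared header would presumably also need host/device annotations for CUDA builds.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical namespace for illustration; not the actual libcudf layout.
namespace hashing_sketch {

// Shift-based 32-bit left rotation. It uses no CUDA intrinsics, so it can be
// evaluated at compile time and placed in a plain .hpp header.
constexpr std::uint32_t rotate_bits_left(std::uint32_t x, std::uint32_t r) noexcept
{
  r &= 31u;  // reduce the rotation to [0, 31] to avoid an undefined 32-bit shift
  return r == 0u ? x : (x << r) | (x >> (32u - r));
}

// Reads a 32-bit block starting at byte `offset` (assumed signature).
// memcpy avoids unaligned or type-punned loads; compilers typically lower it
// to a single load when the target allows it.
inline std::uint32_t getblock32(unsigned char const* data, std::size_t offset) noexcept
{
  std::uint32_t block{};
  std::memcpy(&block, data + offset, sizeof(block));
  return block;
}

// Compile-time checks are possible because the rotation is constexpr.
static_assert(rotate_bits_left(0x80000001u, 1u) == 0x00000003u);
static_assert(rotate_bits_left(0x12345678u, 8u) == 0x34567812u);

}  // namespace hashing_sketch
```

The `static_assert` lines show the practical benefit of dropping the intrinsic: the rotation can be verified at compile time and shared between host and device code. The value returned by `getblock32` depends on the byte order of the platform, which matches reading the block directly from memory.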