-
Notifications
You must be signed in to change notification settings - Fork 912
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add XXHash_64 hash function to cudf #13612
Conversation
@davidwendt Note that a bug was found in the cuCollections implementation here: NVIDIA/cuCollections#326 |
Already fixed in a5a0d4b |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pick one of the following and then I can approve:
xxhash64
(cudf API) andXXHash64
(functor)xxhash_64
(cudf API) andXXHash_64
(functor)
auto block = reinterpret_cast<uint8_t const*>(data + offset); | ||
return block[0] | (block[1] << 8) | (block[2] << 16) | (block[3] << 24); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will always emit 4x pipelined LDG.E.U8
. I wonder if we should add an extra path that performs a single LDG.E.32
in case the pointer is aligned correctly.
Dumb question: When can the start of a string be not aligned to 4 bytes?
Instead of loading and shifting the result, a common pattern is to use a memcpy
for this:
uint32_t ret;
memcpy(&ret, block, sizeof(uint32_t));
return ret;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A string is almost never aligned to 4 bytes. A string is rarely allocated individually but usually part of a larger contiguous block of memory.
The plan is to move these block functions into a separate utilities header where I think we could optimize based on type.
Reference #13706
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A string is rarely allocated individually but usually part of a larger contiguous block of memory.
Good point! Let's leave this as-is for now then.
/merge |
When hashing large keys, e.g., strings, we traverse the input key iteratively in chunks of 4/8 bytes. The current implementation of the `load_chunk` function falsely assumes that the start of the key is always aligned to the chunk size, which is not always the case (see [discussion](rapidsai/cudf#13612 (comment))). Additionally, this PR fixes some uncaught `[-Wmaybe-uninitialized]` warnings when compiling the unit tests.
Description
Add XXHash_64 hash function to libcudf
Checklist