-
Notifications
You must be signed in to change notification settings - Fork 915
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] SHA-1 and SHA-2 hashes #6020
Conversation
Please update the changelog in order to start CI tests. View the gpuCI docs here. |
uint8_t buffer[64]; | ||
}; | ||
|
||
__device__ __constant__ sha256_word_type sha256_hash_constants[64] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm concerned about using up all of the available constant memory with all of these lookup tables. There's only 64KB available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, what will the access pattern be across threads in the same warp? If it is random, rather than uniform, I think just __device__ const
would be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to change the approach, was asked to use __device__ __constant__
in the md5 hashing. Should I swap back to the thread_safe_per_context_cache
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, Mark's reply didn't show up in my initial comment.
Access pattern for each thread is sequential, but there's nothing synchronizing each thread across the warp -- so I think the access within a warp would effectively be random.
} | ||
|
||
std::unique_ptr<column> sha256_base(table_view const& input, | ||
bool truncate_output, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use a bool parameter. Use an enum class with a descriptive name so it's more obvious at the callsite what this controls.
Please update the changelog in order to start CI tests. View the gpuCI docs here. |
Moving to 0.17 |
@rwlee is this still being developed? |
Moving to 0.19. |
This PR has been labeled |
@harrism I think it can be closed for now. We can reopen once we get back to this. |
This PR refactors the MD5 hash implementation in libcudf. I used the MD5 code as a reference while working on SHA (extending #6020, PR #9215 to follow). List of high-level changes: - I moved the implementation of `MD5Hash` and related logic from `include/cudf/detail/utilities/hash_functions.cuh` to `src/hash/md5_hash.cu` because it is only used in that file and nowhere else. We don't need to include and build MD5 in `hash_functions.cuh` for all the collections/sorting/groupby tools that only use Murmur3 variants and `IdentityHash`. (This will be a bigger deal once we add the SHA hash functions, soon to follow this PR, because the size of `hash_functions.cuh` would be substantially larger without this separation.) - I removed an `MD5Hash` constructor that accepted and stored a seed whose value was unused. - Improved use of namespaces. - Use named constants instead of magic numbers. - Introduced a `hash_circular_buffer` and refactored dispatch logic. No changes were made to the feature scope or public APIs of the MD5 feature, so existing unit tests and bindings should remain the same. Authors: - Bradley Dice (https://github.com/bdice) Approvers: - David Wendt (https://github.com/davidwendt) - Mark Harris (https://github.com/harrism) - Jake Hemstad (https://github.com/jrhemstad) - Vyas Ramasubramani (https://github.com/vyasr) URL: #9212
Resolves #4989
Linked issue was resolved when MD5 functionality went in. This PR adds SHA hashes to existing hashing functionality.
In this WIP PR, the SHA2 family of hashes is split into 2 separate kernels -- 1 for SHA-224 + SHA-256 and the other for SHA-384 and SHA-512. The underlying algorithm is very similar, but uses a different word size, buffer size, iteration counts, and shift constants. Given these differences, is it reasonable to split the algorithm despite how similar the resulting code will be?