Refactor hash functions and hash_combine
#10379
Conversation
Codecov Report
@@            Coverage Diff             @@
##           branch-22.04   #10379   +/-   ##
=============================================
  Coverage         10.50%   10.50%
=============================================
  Files               126      127     +1
  Lines             21218    21200    -18
=============================================
  Hits               2228     2228
+ Misses            18990    18972    -18
=============================================
Continue to review full report at Codecov.
Thanks for the detailed description, made it very easy to review the code :)
Looks good 👍
@@ -83,16 +83,15 @@ void __device__ inline uint32ToLowercaseHexString(uint32_t num, char* destinatio
// algorithms are optimized for their respective platforms. You can still
// compile and run any of them on any platform, but your performance with the
// non-native version will be less than optimal.
template <typename Key>
The precedent for C++ hash function objects is that the type itself is a template, not the call operator.
https://en.cppreference.com/w/cpp/utility/hash
This is important if a hash function for different types requires different state or different behavior in the ctor to initialize state.
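For illustration, here is a minimal sketch of the `std::hash`-style design being described, where the hasher type itself is the template and any per-type state is set up in the constructor. The names are hypothetical and this is not libcudf's actual code:

```cpp
#include <cstdint>
#include <string_view>

// Hypothetical sketch: the hasher *type* is templated, as with std::hash<Key>.
// Per-Key state (here, a seed) is initialized in the constructor, and a
// specialization for one Key can carry different state or different setup.
template <typename Key>
struct example_hasher {
  explicit example_hasher(uint32_t seed = 0) : seed_{seed} {}

  uint32_t operator()(Key const& key) const
  {
    // Placeholder mixing, assuming an integral Key for brevity.
    return seed_ ^ static_cast<uint32_t>(key);
  }

 private:
  uint32_t seed_;
};

// A specialization can hold different state or initialize it differently.
template <>
struct example_hasher<std::string_view> {
  explicit example_hasher(uint32_t seed = 0) : seed_{seed} {}

  uint32_t operator()(std::string_view key) const
  {
    uint32_t h = seed_;
    for (char c : key) { h = h * 31u + static_cast<unsigned char>(c); }
    return h;
  }

 private:
  uint32_t seed_;
};
```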
@jrhemstad I wasn't familiar with that, thanks for the reference. Do you think we need to retain that class template behavior, or is this comment just informational to explain the previous design? In practice, we don't have any constructor behavior that varies by key type. It seems like a hash functor with type specializations of `operator()` is more in line with the rest of libcudf's design, unless you think that matching `std::hash` is crucial here.
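For contrast, a rough sketch of the alternative design being suggested here, again with hypothetical names rather than libcudf's actual classes: the functor type is not templated, and the key type is deduced per call by a templated `operator()`.

```cpp
#include <cstdint>

// Hypothetical sketch of the alternative design: one non-template functor
// type, with the key type deduced at each call site via a templated
// operator(). Any state (here, a seed) is shared across key types.
struct example_hasher {
  explicit example_hasher(uint32_t seed = 0) : seed_{seed} {}

  template <typename Key>
  uint32_t operator()(Key const& key) const
  {
    // Placeholder mixing, assuming an integral Key for brevity.
    return seed_ ^ static_cast<uint32_t>(key);
  }

 private:
  uint32_t seed_;
};

// Usage: the same object hashes different key types.
// example_hasher h{42};
// uint32_t a = h(int32_t{7});
// uint32_t b = h(int64_t{7});
```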
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to keep the class template.
> It seems like a hash functor with type specializations of operator() is more in line with the rest of libcudf's design

Class templates are actually more in line with the "dispatch to invoke" pattern I've advocated for in the past. See https://github.com/rapidsai/cudf/pull/8217/files and the Better Code talk I gave about this pattern.
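For context, a rough sketch of what "dispatch to invoke" looks like with a class-template hasher. This is a simplified toy, not the code from #8217 or libcudf's real `type_dispatcher`; the names and the two-type enum are hypothetical:

```cpp
#include <cstdint>

// Hypothetical "dispatch to invoke" sketch: the dispatcher maps a runtime
// type id to a compile-time Key type, instantiates the class template for
// that type, and invokes it directly.
template <typename Key>
struct typed_hasher {
  uint32_t operator()(Key const& key) const
  {
    return static_cast<uint32_t>(key) * 0x9e3779b9u;  // placeholder mixing
  }
};

enum class type_id { INT32, INT64 };

// Toy stand-in for a type dispatcher (a real one covers all column types);
// the Hasher class template is instantiated and invoked inside the dispatch.
template <template <typename> class Hasher>
uint32_t dispatch_and_invoke(type_id id, void const* data)
{
  switch (id) {
    case type_id::INT32: return Hasher<int32_t>{}(*static_cast<int32_t const*>(data));
    case type_id::INT64: return Hasher<int64_t>{}(*static_cast<int64_t const*>(data));
  }
  return 0;
}

// Usage:
// int32_t v = 42;
// uint32_t h = dispatch_and_invoke<typed_hasher>(type_id::INT32, &v);
```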
LGreatTM 👍
rerun tests
I wanted to check the benchmarks again before merging. Commit 65edea2 shows the same performance as before.
@gpucibot merge
This PR refactors a few pieces of libcudf's hash functions:

- Define `hash_combine` only once (with 32/64 bit overloads), rather than several times in the codebase. A minimal sketch of this kind of combiner follows this description.
- Remove the class template parameter from `MurmurHash3_32` and related classes. This template parameter was redundant: we already use a template for the argument of the `compute` method, which is called by `operator()`, so I put the template parameter on `operator()` instead of the whole class. I think this removal of the template parameter could be considered API-breaking, so I added the `breaking` label. (I retracted this change after conversation with @jrhemstad. I'll look into a different way to do this soon, using a dispatch-to-invoke approach as in "Add dispatch_to_invoke for better type dispatching" #8217.)

This addresses part of issue #10081. I have a few more things I'd like to try, but this felt like a nicely scoped PR, so I stopped here for the moment.
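As referenced above, here is a minimal sketch of a Boost-style `hash_combine` with 32- and 64-bit overloads. The constants and mixing formula follow the widely used Boost recipe and are illustrative; they may differ in detail from libcudf's actual implementation.

```cpp
#include <cstdint>

// Boost-style hash_combine, sketched with 32- and 64-bit overloads.
// In device code these would additionally be marked __host__ __device__.
constexpr uint32_t hash_combine(uint32_t lhs, uint32_t rhs)
{
  return lhs ^ (rhs + 0x9e3779b9u + (lhs << 6) + (lhs >> 2));
}

constexpr uint64_t hash_combine(uint64_t lhs, uint64_t rhs)
{
  return lhs ^ (rhs + 0x9e3779b97f4a7c15ull + (lhs << 6) + (lhs >> 2));
}

// Usage: fold per-column hash values into a single row hash.
// uint32_t row_hash = hash_combine(hash_combine(0u, col0_hash), col1_hash);
```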
Benchmarking info (outdated)
I benchmarked the code before and after making these changes and saw a small but consistent decrease in runtime.

The benchmarks in `HashBenchmark/{HASH_MURMUR3,HASH_SERIAL_MURMUR3,HASH_SPARK_MURMUR3}_{nulls,no_nulls}/*` all decreased or saw no change in runtime, with a geometric mean of 2.87% less time. The benchmarks in `Hashing/hash_partition/*` all decreased or saw no change in runtime, with a geometric mean of 2.37% less time.

For both sets of benchmarks, the largest data sizes saw more significant decreases in runtime, with a best improvement of 7.38% less time in `HashBenchmark/HASH_MURMUR3_nulls/16777216` (similar for other large data sizes) and a best improvement of 10.54% less time in `Hashing/hash_partition/1048576/256/64` (similar for other large data sizes).

See the comment below for updated benchmarks.