-
Notifications
You must be signed in to change notification settings - Fork 915
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Refactor hash function templates and
hash_combine
(#10379)
This PR refactors a few pieces of libcudf's hash functions: - Define the utility function `hash_combine` only once (with 32/64 bit overloads), rather than several times in the codebase - ~Remove class template parameter from `MurmurHash3_32` and related classes. This template parameter was redundant. We already use a template for the argument of the `compute` method, which is called by `operator()`, so I put the template parameter on `operator()` instead of the whole class. I think this removal of the template parameter could be considered API-breaking so I added the `breaking` label.~ I retracted this change after conversation with @jrhemstad. I'll look into a different way to do this soon, using a dispatch-to-invoke approach as in #8217. This addresses part of issue #10081. I have a few more things I'd like to try, but this felt like a nicely-scoped PR so I stopped here for the moment. I benchmarked the code before and after making these changes and saw a small but consistent decrease in runtime. The benchmarks in `HashBenchmark/{HASH_MURMUR3,HASH_SERIAL_MURMUR3,HASH_SPARK_MURMUR3}_{nulls,no_nulls}/*` all decreased or saw no change in runtime, with a geometric mean of 2.87% less time. The benchmarks in `Hashing/hash_partition/*` all decreased or saw no change in runtime, with a geometric mean of 2.37% less time. For both sets of benchmarks, the largest data sizes saw more significant decreases in runtime, with a best-improvement of 7.38% less time in `HashBenchmark/HASH_MURMUR3_nulls/16777216` (similar for other large data sizes) and a best-improvement of 10.54% less time in `Hashing/hash_partition/1048576/256/64` (similar for other large data sizes). Authors: - Bradley Dice (https://github.com/bdice) Approvers: - Vukasin Milovanovic (https://github.com/vuule) - Conor Hoekstra (https://github.com/codereport) URL: #10379
- Loading branch information
Showing
3 changed files
with
72 additions
and
110 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters