Murmur3 hash kernel cleanup #10143
Conversation
Codecov Report
@@              Coverage Diff               @@
##           branch-22.04   #10143      +/-   ##
================================================
- Coverage         10.42%   10.16%   -0.26%
================================================
  Files               119      122       +3
  Lines             20603    24693    +4090
================================================
+ Hits               2148     2511     +363
- Misses            18455    22182    +3727
Continue to review full report at Codecov.
Here are some starting ideas for us to refactor. We can work through these in a pairing session.
constexpr uint32_t c3     = 0xe6546b64;
constexpr uint32_t rot_c1 = 15;
constexpr uint32_t rot_c2 = 13;
auto getblock32 = [] __device__(uint32_t const* p, int i) -> uint32_t {
Let's pull this lambda out into a separate device function (not define it as an inline lambda), like rotl32 and fmix32. See additional comments below about why it isn't safe for this function to take uint32_t const* and why it must instead take std::byte const*. With that change there's also no need for the offset parameter int i, because we can do the offsetting with pointer arithmetic at the call site. A minimal sketch of the refactor is below.
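To make the shape concrete, here's a minimal sketch (the function signature and the little-endian byte assembly are my assumptions here, not the final cuDF code):

```cpp
#include <cstddef>
#include <cstdint>

// Assembles a little-endian uint32_t one byte at a time. Unlike dereferencing a
// uint32_t const*, this is well-defined even when `data` is not 4-byte aligned.
__device__ inline uint32_t getblock32(std::byte const* data)
{
  return static_cast<uint32_t>(data[0]) | (static_cast<uint32_t>(data[1]) << 8) |
         (static_cast<uint32_t>(data[2]) << 16) | (static_cast<uint32_t>(data[3]) << 24);
}

// Call site: pointer arithmetic replaces the old offset parameter `int i`:
//   uint32_t const k1 = getblock32(data + i * sizeof(uint32_t));
```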
  k1 *= c1;
- k1 = rotl32(k1, 15);
+ k1 = rotl32(k1, rot_c1);
Future PR: We might define the common functions and magic values shared between MurmurHash3_32 and SparkMurmurHash3_32, like rotl32, fmix32, and getblock32, in a common base class, and only override the Spark-specific bits in a derived class. CRTP might be an even better choice, like I did for the SHA-family functions (draft #9215) - just needs a bit of analysis to decide which way to go. A rough sketch of the CRTP option follows.
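As an illustration of the CRTP direction (class and member names here are hypothetical, loosely modeled on the SHA refactor, not a proposal for the exact interface):

```cpp
#include <cstddef>
#include <cstdint>

// Shared helpers and magic constants live in the CRTP base; Hasher is the derived type.
template <typename Hasher>
struct murmur_base {
  __device__ uint32_t rotl32(uint32_t x, uint32_t r) const { return (x << r) | (x >> (32 - r)); }

  __device__ uint32_t fmix32(uint32_t h) const
  {
    h ^= h >> 16;
    h *= 0x85ebca6b;
    h ^= h >> 13;
    h *= 0xc2b2ae35;
    h ^= h >> 16;
    return h;
  }

  // Shared code calls the derived-class hook without virtual dispatch:
  __device__ uint32_t finish_tail(std::byte const* tail, uint32_t len, uint32_t h) const
  {
    return static_cast<Hasher const*>(this)->process_tail(tail, len, h);
  }
};

struct MurmurHash3_32_sketch : murmur_base<MurmurHash3_32_sketch> {
  __device__ uint32_t process_tail(std::byte const*, uint32_t, uint32_t h) const
  {
    return h;  // standard Murmur3 tail handling would go here
  }
};

struct SparkMurmurHash3_32_sketch : murmur_base<SparkMurmurHash3_32_sketch> {
  __device__ uint32_t process_tail(std::byte const*, uint32_t, uint32_t h) const
  {
    return h;  // Spark-specific tail handling (signed char bytes) would go here
  }
};
```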
original_benchmark.txt

There's a very tiny performance hit after the most recent set of changes; I don't think it's large enough to be a concern.
I think this PR is beneficial and its scope is reasonable -- I'd hesitate to make it larger since we're at a stopping point and have good benchmarks. I have some ideas for further refactors that I'd like to take on later.

I'm going to apply my own suggestions for renaming variables and will merge branch-22.04 so we can get an updated build time metrics report.
@@ -105,6 +105,14 @@ struct MurmurHash3_32 {
  return h;
}
Note to self for a future PR: Do we need MurmurHash3_32 to be a templated class? Currently the class takes a template parameter Key and has an operator()(Key key) with no template parameters, which calls a templated compute(T key). However, the way it's called in row_operators.cuh seems to indicate that we could instead have a plain (non-template) class with a templated operator(). That's the way we typically do type dispatching, and it's reversed here for no clear reason. The calling code uses a type dispatch on element_hasher_with_seed.

(This would probably affect performance and/or compile time, but I don't know if it would be better or worse.)

A sketch of the non-template shape is below.
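For comparison, the non-template shape would look roughly like this (hypothetical and heavily simplified; the seed member and the empty body are stand-ins for the real hashing logic):

```cpp
#include <cstdint>

// A plain (non-template) functor: the element type T is chosen when the type
// dispatcher instantiates operator()<T>, matching our usual dispatch pattern.
struct element_murmur_hasher {
  uint32_t seed{0};

  template <typename T>
  __device__ uint32_t operator()(T const& key) const
  {
    // per-type Murmur3 computation would go here
    return seed;
  }
};
```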
@@ -131,60 +139,69 @@ struct MurmurHash3_32 {
  return combined;
}

result_type __device__ inline operator()(Key const& key) const { return compute(key); }
// TODO Do we need this operator() and/or compute? Probably not both.
Note to self for a future PR: I would try removing the compute method and moving its definition into operator(). I think we might be able to safely remove the template <typename T> on compute(T). The template parameter T has to match the class template parameter Key, from what I can see, and may be redundant. Any exceptions to this would probably be solved by removing the class template parameter Key and switching to just an operator() template parameter. A sketch of the merged form is below.
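A sketch of the merged form under that assumption (hypothetical; the real class has more members, and hash_value_type stands in for the actual result_type):

```cpp
#include <cstdint>

using hash_value_type = uint32_t;  // assumption: stand-in for the real result_type

// compute(T) is gone; its body lives directly in operator(), whose argument type
// is already fixed by the class template parameter Key.
template <typename Key>
struct MurmurHash3_32_merged {
  hash_value_type __device__ inline operator()(Key const& key) const
  {
    hash_value_type h = 0;
    // hashing logic formerly in compute(T key) goes here
    return h;
  }
};
```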
rerun tests

@cwharris This is ready for review, I just fixed the labels so it no longer says “in progress.” 😊

@gpucibot merge
Follow-up to #9919 -- kernel merging and code cleanup for the Murmur3 hash. Partial fix for #10081.

Benchmarked the compute_bytes kernel with aligned reads vs. unaligned reads and saw no difference; looking into it further to confirm that the uint32_t construction was doing the same thing implicitly. Due to byte alignment, string hashing will require the getblock32 function regardless. In any case, the benchmarks run with 100-, 103-, and 104-byte strings showed negligible performance differences, which indicates that forced misalignment does not negatively impact hash speed. A small illustration of the aligned-vs-unaligned read question is below.
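To illustrate the aligned-vs-unaligned point (a host-side sketch, not the cuDF kernel code):

```cpp
#include <cstdint>
#include <cstring>

// Undefined behavior if p is not 4-byte aligned.
inline uint32_t load_u32_cast(unsigned char const* p)
{
  return *reinterpret_cast<uint32_t const*>(p);
}

// Well-defined for any alignment, and typically compiles to the same single load
// instruction on architectures that support unaligned access.
inline uint32_t load_u32_memcpy(unsigned char const* p)
{
  uint32_t v;
  std::memcpy(&v, p, sizeof(v));
  return v;
}
```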