Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add dispatch_to_invoke for better type dispatching #8217

Closed
wants to merge 7 commits into from

Conversation

jrhemstad
Copy link
Contributor

  • Added dispatch_to_invoke which wraps a type_dispatcher call and invoking a type template's static invoke member function
  • Updated gather to use dispatch_to_invoke as an example
  • Cleaned up places that were reusing gather's dispatched function object instead of calling gather directly
  • Added an overload of detail::gather that accepts and returns a column
  • Update developer guide for dispatch_to_invoke

@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label May 11, 2021
@jrhemstad jrhemstad added non-breaking Non-breaking change improvement Improvement / enhancement to an existing function labels May 11, 2021
@codecov
Copy link

codecov bot commented May 12, 2021

Codecov Report

All modified and coverable lines are covered by tests ✅

❗ No coverage uploaded for pull request base (21.12@8ae73d5). Click here to learn what that means.

Additional details and impacted files
@@           Coverage Diff            @@
##             21.12    #8217   +/-   ##
========================================
  Coverage         ?   82.88%           
========================================
  Files            ?      104           
  Lines            ?    17907           
  Branches         ?        0           
========================================
  Hits             ?    14843           
  Misses           ?     3064           
  Partials         ?        0           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jrhemstad jrhemstad changed the base branch from branch-21.06 to 21.12 March 7, 2022 21:09
rapids-bot bot pushed a commit that referenced this pull request Mar 8, 2022
This PR refactors a few pieces of libcudf's hash functions:
- Define the utility function `hash_combine` only once (with 32/64 bit overloads), rather than several times in the codebase
- ~Remove class template parameter from `MurmurHash3_32` and related classes. This template parameter was redundant. We already use a template for the argument of the `compute` method, which is called by `operator()`, so I put the template parameter on `operator()` instead of the whole class. I think this removal of the template parameter could be considered API-breaking so I added the `breaking` label.~ I retracted this change after conversation with @jrhemstad. I'll look into a different way to do this soon, using a dispatch-to-invoke approach as in #8217.

This addresses part of issue #10081. I have a few more things I'd like to try, but this felt like a nicely-scoped PR so I stopped here for the moment.

I benchmarked the code before and after making these changes and saw a small but consistent decrease in runtime.

The benchmarks in `HashBenchmark/{HASH_MURMUR3,HASH_SERIAL_MURMUR3,HASH_SPARK_MURMUR3}_{nulls,no_nulls}/*` all decreased or saw no change in runtime, with a geometric mean of 2.87% less time.

The benchmarks in `Hashing/hash_partition/*` all decreased or saw no change in runtime, with a geometric mean of 2.37% less time.

For both sets of benchmarks, the largest data sizes saw more significant decreases in runtime, with a best-improvement of 7.38% less time in `HashBenchmark/HASH_MURMUR3_nulls/16777216` (similar for other large data sizes) and a best-improvement of 10.54% less time in `Hashing/hash_partition/1048576/256/64` (similar for other large data sizes).

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Vukasin Milovanovic (https://github.com/vuule)
  - Conor Hoekstra (https://github.com/codereport)

URL: #10379
@bdice bdice mentioned this pull request Jun 8, 2022
@jrhemstad jrhemstad mentioned this pull request Aug 8, 2022
3 tasks
@abellina abellina deleted the branch rapidsai:21.12 January 26, 2024 22:29
@abellina abellina closed this Jan 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants