Improve the performance of low cardinality groupby #16619

PointKernel · 2024-08-20T22:21:05Z

Description

This PR improves groupby performance for low-cardinality inputs. When applicable, it first aggregates within each thread block in shared memory and then merges the partial results into global memory, which reduces contention on global-memory atomics.
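A minimal host-side sketch of the two-stage idea described above, not the CUDA code in this PR. It is a sequential CPU model: each chunk of `block_size` rows stands in for one thread block, the per-chunk table stands in for shared memory, and the final table stands in for global memory. The names `group_sum_result` and `two_stage_group_sum` are hypothetical and exist only for this illustration; the sketch also assumes dense integer keys rather than the hash-based lookup a real groupby would need.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type for this sketch (not a libcudf type).
struct group_sum_result {
  std::vector<std::int64_t> sums;  // one sum per group: models the global-memory table
  std::size_t global_updates;      // number of writes made to the global table
};

// Hypothetical helper: sums `values` grouped by `keys`, with keys in [0, num_groups).
// Each chunk of `block_size` rows plays the role of one thread block.
group_sum_result two_stage_group_sum(std::vector<int> const& keys,
                                     std::vector<std::int64_t> const& values,
                                     int num_groups,
                                     std::size_t block_size)
{
  assert(keys.size() == values.size());
  assert(num_groups > 0 && block_size > 0);

  group_sum_result result{std::vector<std::int64_t>(num_groups, 0), 0};

  for (std::size_t begin = 0; begin < keys.size(); begin += block_size) {
    std::size_t const end = std::min(keys.size(), begin + block_size);

    // Stage 1: aggregate this block's rows into a small private table
    // (models shared-memory aggregation; few groups means it stays small).
    std::vector<std::int64_t> local_sums(num_groups, 0);
    std::vector<bool> seen(num_groups, false);
    for (std::size_t i = begin; i < end; ++i) {
      assert(keys[i] >= 0 && keys[i] < num_groups);
      local_sums[keys[i]] += values[i];
      seen[keys[i]] = true;
    }

    // Stage 2: merge the block's partial results into the global table.
    // On a GPU each of these writes would be an atomic update.
    for (int g = 0; g < num_groups; ++g) {
      if (seen[g]) {
        result.sums[g] += local_sums[g];
        ++result.global_updates;
      }
    }
  }
  return result;
}
```

For 8 rows alternating between 2 groups and a block size of 4, the global table is written 4 times instead of once per row. In general the number of global updates is bounded by the number of blocks times the number of groups, which is why the approach pays off when cardinality is low.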

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.

This PR introduces a new `num_multiprocessors` utility and moves the existing `elements_per_thread` host utility to the new `cuda.hpp` header. Needed by #16619.

Authors:
- Yunsong Wang (https://github.com/PointKernel)

Approvers:
- David Wendt (https://github.com/davidwendt)
- Mark Harris (https://github.com/harrism)

URL: #16628

PointKernel · 2024-10-31T01:51:29Z

This PR is ready for review.

davidwendt

Looks good. Just a couple suggestions.

cpp/src/groupby/hash/compute_shared_memory_aggs.cu

cpp/src/groupby/hash/single_pass_functors.cuh

Co-authored-by: David Wendt <[email protected]>

hyperbolic2346

Unsure about the copyright date, but otherwise looks good.

cpp/src/groupby/hash/single_pass_functors.cuh

vyasr

CMake approval.

PointKernel · 2024-11-08T21:15:20Z

/merge

PointKernel added 4 commits August 19, 2024 12:04

Update docs

1fa441e

Minor improvement

65e1b5a

Migrate the GQE shared memory groupby to cudf

c58ddef

Merge remote-tracking branch 'upstream/branch-24.10' into shm-groupby

bb24053

PointKernel added the labels 2 - In Progress, libcudf, CMake, Performance, improvement, and non-breaking Aug 20, 2024

github-actions bot removed the CMake label Aug 20, 2024

PointKernel added 6 commits August 20, 2024 17:52

Many cleanups

d604d0a

Minor cleanups: use CCCL traits in device APIs

9ab1c02

Move more constexpr to the helper

db1b26a

More cleanups with constexprs

9993283

Add doc

c96d02c

Renaming

7cd14d6

PointKernel mentioned this pull request Aug 21, 2024

Add num_multiprocessors utility #16628

Merged

3 tasks

Fix cardinality bench

1e04c10

PointKernel self-assigned this Aug 21, 2024

PointKernel added 2 commits August 22, 2024 09:53

Merge remote-tracking branch 'upstream/branch-24.10' into shm-groupby

b21909a

More cleanups with CG

47aee18

PointKernel added 7 commits August 28, 2024 11:39

Use custom cuco

6eb3459

Merge branch 'branch-24.10' into shm-groupby

f9adaad

Merge remote-tracking branch 'upstream/branch-24.10' into shm-groupby

c08e9aa

Cleanups with new key_eq and hash_function

ee5f7fa

Remove the redundant num_sms function

aa4e957

Add missing header + minor cleanup

4fdb4b8

Clean up grid_size and shmem_size utilities

4049aeb

PointKernel added 4 commits October 30, 2024 17:41

Merge remote-tracking branch 'upstream/branch-24.12' into shm-groupby

ac03ce8

Renaming for clarity + add missing func

fef1ca8

Minor fix

8ccd817

Merge remote-tracking branch 'upstream/branch-24.12' into shm-groupby

540503d

PointKernel added the 3 - Ready for Review label and removed the 2 - In Progress label Oct 31, 2024

PointKernel requested review from davidwendt and hyperbolic2346 October 31, 2024 01:50

PointKernel added 2 commits November 4, 2024 11:40

Merge remote-tracking branch 'upstream/branch-24.12' into shm-groupby

5d5e7ff

Update comments

0c315f8

davidwendt approved these changes Nov 6, 2024

cpp/src/groupby/hash/compute_shared_memory_aggs.cu (outdated, resolved)

cpp/src/groupby/hash/single_pass_functors.cuh (outdated, resolved)

PointKernel and others added 5 commits November 6, 2024 13:25

Apply suggestions from code review

7131c9f

Co-authored-by: David Wendt <[email protected]>

Merge remote-tracking branch 'upstream/branch-24.12' into shm-groupby

c520f41

Make compute_shmem_offsets_size constexpr

5c6b33c

Formatting

b05fab4

Merge remote-tracking branch 'upstream/branch-24.12' into shm-groupby

f32bbf8

hyperbolic2346 requested changes Nov 8, 2024

cpp/src/groupby/hash/single_pass_functors.cuh (outdated, resolved)

Merge branch 'branch-24.12' into shm-groupby

6a5d582

github-actions bot assigned hyperbolic2346 Nov 8, 2024

PointKernel commented Nov 8, 2024

cpp/src/groupby/hash/single_pass_functors.cuh (outdated, resolved)

Update cpp/src/groupby/hash/single_pass_functors.cuh

96fbaa9

PointKernel requested a review from hyperbolic2346 November 8, 2024 01:07

hyperbolic2346 approved these changes Nov 8, 2024

vyasr approved these changes Nov 8, 2024

rapids-bot bot merged commit 2e0d2d6 into rapidsai:branch-24.12 Nov 8, 2024
105 checks passed

PointKernel deleted the shm-groupby branch November 8, 2024 21:15

karthikeyann approved these changes Nov 8, 2024

GregoryKimball mentioned this pull request Nov 18, 2024

[FEA] Add shared memory hash map for low-cardinality aggregations #15262

Closed
