Add compound aggregations to cudf::segmented_reduce #12573

Merged

davidwendt
Contributor

Description

Adds mean, variance, and standard deviation aggregation support to cudf::segmented_reduce. These are compound (multi-step) aggregations and are modeled after the same aggregations supported by cudf::reduce. Once this is approved and merged, the visitor pattern for this approach will be reworked for both cudf::reduce and cudf::segmented_reduce as per #10432.
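
As a rough illustration of what "compound" means here, the sketch below computes the three aggregations per segment in plain host-side C++: first the simple reductions (sum, sum of squares, count), then a combine step. It is not the libcudf implementation. `segmented_compound`, `SegmentStats`, the offsets layout, and the `ddof` parameter are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Per-segment results of the compound aggregations (hypothetical type).
struct SegmentStats {
  double mean;
  double variance;
  double stddev;
};

// Hypothetical host-side helper, not a libcudf function.
// Segment i covers values[offsets[i]] up to, but not including, values[offsets[i + 1]].
std::vector<SegmentStats> segmented_compound(std::vector<double> const& values,
                                             std::vector<std::size_t> const& offsets,
                                             std::size_t ddof = 1)
{
  std::vector<SegmentStats> out;
  double const nan = std::numeric_limits<double>::quiet_NaN();
  for (std::size_t i = 0; i + 1 < offsets.size(); ++i) {
    assert(offsets[i] <= offsets[i + 1] && offsets[i + 1] <= values.size());

    // Step 1: simple reductions over the segment.
    double sum   = 0.0;
    double sumsq = 0.0;
    for (std::size_t j = offsets[i]; j < offsets[i + 1]; ++j) {
      sum += values[j];
      sumsq += values[j] * values[j];
    }
    std::size_t const n = offsets[i + 1] - offsets[i];

    // Step 2: combine the partial results. Empty or too-short segments stay NaN
    // in this sketch.
    SegmentStats s{nan, nan, nan};
    if (n > 0) {
      s.mean = sum / static_cast<double>(n);
      if (n > ddof) {
        // Clamp at zero because rounding can make the numerator slightly negative.
        double const var =
          std::max(0.0, (sumsq - sum * s.mean) / static_cast<double>(n - ddof));
        s.variance = var;
        s.stddev   = std::sqrt(var);
      }
    }
    out.push_back(s);
  }
  return out;
}
```

With values {1, 2, 3, 4, 10, 10} and offsets {0, 4, 6, 6}, the first segment has mean 2.5 and sample variance 5/3, the second has mean 10 and variance 0, and the empty third segment yields NaN. The sum-of-squares formula is used because it maps directly onto simple reductions, though it can lose precision when the mean is large relative to the spread. How libcudf reports empty or too-short segments is not covered in this description.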

The source tree for src/reductions has been adjusted to put all segmented-reduce source files into src/reductions/segmented and to remove the segmented_ prefix from those file names.
Also, the segmented-reduce functions have been moved from cudf/detail/reduction_functions.hpp into their own header, cudf/detail/segmented_reduction_functions.hpp. Likewise, the segmented-reduce CUB calls have been moved from cudf/detail/reduction.cuh to the new cudf/detail/segmented_reduction.cuh to help minimize the inclusion of CUB headers.

The sum-of-squares aggregation is also included, since it is a simple reduction that only required the appropriate aggregation class registration and a source file.

Finally, gtests are added for these new types. The compound types only support floating-point outputs.

Follow-on PRs will address the visitor pattern mentioned above as well as additional data types. Discussion of additional aggregations will occur in the reference issue #10432.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@davidwendt davidwendt added the feature request, 2 - In Progress, libcudf, and non-breaking labels Jan 18, 2023
@davidwendt davidwendt self-assigned this Jan 18, 2023
@github-actions github-actions bot added the CMake and conda labels Jan 18, 2023
@davidwendt
Contributor Author

This will target 23.04 once it is ready.

@davidwendt davidwendt requested a review from bdice January 18, 2023 20:11
@@ -229,92 +228,6 @@ std::unique_ptr<scalar> reduce(InputIterator d_in,
return std::unique_ptr<scalar>(result);
}

/**
Contributor Author

These functions were moved to segmented_reduction.cuh

@@ -338,171 +338,5 @@ std::unique_ptr<scalar> merge_sets(
rmm::cuda_stream_view stream,
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource());

/**
Contributor Author

These functions were moved to segmented_reduction_functions.hpp

@codecov

codecov bot commented Jan 18, 2023

Codecov Report

❗ No coverage uploaded for pull request base (branch-23.04@21ef256). Click here to learn what that means.
Patch has no changes to coverable lines.

Additional details and impacted files
@@               Coverage Diff               @@
##             branch-23.04   #12573   +/-   ##
===============================================
  Coverage                ?   85.81%           
===============================================
  Files                   ?      158           
  Lines                   ?    25153           
  Branches                ?        0           
===============================================
  Hits                    ?    21586           
  Misses                  ?     3567           
  Partials                ?        0           


@davidwendt davidwendt changed the base branch from branch-23.02 to branch-23.04 January 24, 2023 19:34
@davidwendt davidwendt requested a review from a team as a code owner January 30, 2023 18:28
Contributor

@robertmaynard robertmaynard left a comment

CMake changes LGTM

Member

@ajschmidt8 ajschmidt8 left a comment

Approving ops-codeowner file changes

Contributor

@bdice bdice left a comment

Reviewed with @davidwendt. A few comments attached. Great work!

Contributor

@bdice bdice left a comment

Holding off for a final review pass.

@davidwendt davidwendt requested a review from bdice February 2, 2023 19:17
Contributor

@bdice bdice left a comment

Great! Thanks @davidwendt.

@davidwendt
Contributor Author

/merge

@rapids-bot rapids-bot bot merged commit 182ee2c into rapidsai:branch-23.04 Feb 3, 2023
@davidwendt davidwendt deleted the reduction-segmented-compound branch February 3, 2023 13:49
rapids-bot bot pushed a commit that referenced this pull request Feb 7, 2023
Reworks some internal source specific to fixed-point types using `cudf::reduce` by removing duplicated code logic. This was found while working on #12573 and #10432. Since the fix has no dependencies, a separate PR is used to minimize code review churn. This should help keep the fixed-point-specific logic consistent when it is added to segmented-reduction.
No functionality has changed, so the existing gtests are adequate.

Authors:
  - David Wendt (https://github.com/davidwendt)

Approvers:
  - Yunsong Wang (https://github.com/PointKernel)
  - Bradley Dice (https://github.com/bdice)

URL: #12652
rapids-bot bot pushed a commit that referenced this pull request Feb 23, 2023
Depends on #12573 

Adds additional support for fixed-point types in `cudf::segmented_reduce` for simple aggregations: sum, product, and sum-of-squares.
Reference: #10432

Authors:
  - David Wendt (https://github.com/davidwendt)

Approvers:
  - Mike Wilson (https://github.com/hyperbolic2346)
  - Nghia Truong (https://github.com/ttnghia)
  - Bradley Dice (https://github.com/bdice)

URL: #12680
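
A minimal sketch of why these aggregations need fixed-point-specific handling, assuming a decimal is stored as an integer plus a base-10 scale: a segmented sum keeps the input scale, while a segmented sum of squares doubles it. The `DecimalColumn` type and both helpers are hypothetical and written in plain host-side C++. They are not libcudf code, and integer overflow is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical decimal column: element i represents unscaled[i] * 10^scale.
struct DecimalColumn {
  std::vector<std::int64_t> unscaled;
  int scale;
};

// Segmented sum: the stored integers add directly and the scale is unchanged.
DecimalColumn segmented_sum(DecimalColumn const& in, std::vector<std::size_t> const& offsets)
{
  DecimalColumn out{{}, in.scale};
  for (std::size_t i = 0; i + 1 < offsets.size(); ++i) {
    assert(offsets[i] <= offsets[i + 1] && offsets[i + 1] <= in.unscaled.size());
    std::int64_t acc = 0;
    for (std::size_t j = offsets[i]; j < offsets[i + 1]; ++j) { acc += in.unscaled[j]; }
    out.unscaled.push_back(acc);
  }
  return out;
}

// Segmented sum of squares: (v * 10^s)^2 == v^2 * 10^(2s), so the scale doubles.
DecimalColumn segmented_sum_of_squares(DecimalColumn const& in,
                                       std::vector<std::size_t> const& offsets)
{
  DecimalColumn out{{}, 2 * in.scale};
  for (std::size_t i = 0; i + 1 < offsets.size(); ++i) {
    assert(offsets[i] <= offsets[i + 1] && offsets[i + 1] <= in.unscaled.size());
    std::int64_t acc = 0;
    for (std::size_t j = offsets[i]; j < offsets[i + 1]; ++j) {
      acc += in.unscaled[j] * in.unscaled[j];
    }
    out.unscaled.push_back(acc);
  }
  return out;
}
```

For the values 1.25, 2.50, 0.75 stored as {125, 250, 75} with scale -2 and offsets {0, 2, 3}, the sums are {375, 75} at scale -2 and the sums of squares are {78125, 5625} at scale -4. A product of n values would carry scale n times the input scale, so segments of different lengths would end up with different scales. How libcudf reconciles that is not described in this thread.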
Labels
3 - Ready for Review, CMake, feature request, libcudf, non-breaking
5 participants