Add compound aggregations to cudf::segmented_reduce #12573
Merged
rapids-bot
merged 40 commits into
rapidsai:branch-23.04
from
davidwendt:reduction-segmented-compound
Feb 3, 2023
a9fdb94
Add compound aggregations to cudf::segmented_reduce
davidwendt d8d89af
Merge branch 'branch-23.02' into reduction-segmented-compound
davidwendt 2d32592
Merge branch 'branch-23.02' into reduction-segmented-compound
davidwendt 94d15f9
Merge branch 'branch-23.02' into reduction-segmented-compound
davidwendt 26d2f0d
add gtests with nulls include/exclude
davidwendt ebc41a1
reduce number of nulls in new gtests
davidwendt 1ecece3
Merge branch 'branch-23.02' into reduction-segmented-compound
davidwendt 96a6643
Merge branch 'branch-23.02' into reduction-segmented-compound
davidwendt 60bdf8b
Merge branch 'reduction-segmented-compound' of github.com:davidwendt/…
davidwendt ca27894
update include statements
davidwendt c1f0939
update doxygen for consistency
davidwendt cdc2f9c
remove unneeded namespace specification
davidwendt 7b42240
Merge branch 'branch-23.04' into reduction-segmented-compound
davidwendt 3064979
add more error gtests
davidwendt 7436879
fix copyright year
davidwendt 105fe62
remove unneeded include
davidwendt 1b31af2
Merge branch 'branch-23.04' into reduction-segmented-compound
davidwendt 5fbad16
Merge branch 'reduction-segmented-compound' of github.com:davidwendt/…
davidwendt c9ad9f5
Merge branch 'branch-23.04' into reduction-segmented-compound
davidwendt fafe40f
Merge branch 'branch-23.04' into reduction-segmented-compound
davidwendt e8d1123
Merge branch 'reduction-segmented-compound' of github.com:davidwendt/…
davidwendt 7a31c4d
Merge branch 'branch-23.04' into reduction-segmented-compound
davidwendt 8bef913
Merge branch 'branch-23.04' into reduction-segmented-compound
davidwendt facc772
Merge branch 'reduction-segmented-compound' of github.com:davidwendt/…
davidwendt d139d83
Merge branch 'branch-23.04' into reduction-segmented-compound
davidwendt d35c475
refactor validity-mask logic into separate source file
davidwendt 8055499
Merge branch 'branch-23.04' into reduction-segmented-compound
davidwendt ea12e69
Merge branch 'branch-23.04' into reduction-segmented-compound
davidwendt bdd0e1d
Merge branch 'branch-23.04' into reduction-segmented-compound
davidwendt 35e2cbe
Merge branch 'branch-23.04' into reduction-segmented-compound
davidwendt d53b536
Merge branch 'reduction-segmented-compound' of github.com:davidwendt/…
davidwendt a500a15
Merge branch 'branch-23.04' into reduction-segmented-compound
davidwendt 46fb462
Merge branch 'reduction-segmented-compound' of github.com:davidwendt/…
davidwendt 700e61e
Merge branch 'branch-23.04' into reduction-segmented-compound
davidwendt af56df5
Merge branch 'branch-23.04' into reduction-segmented-compound
davidwendt 67540f2
Merge branch 'reduction-segmented-compound' of github.com:davidwendt/…
davidwendt af9d778
Merge branch 'branch-23.04' into reduction-segmented-compound
davidwendt 5bedc84
additional refactor for update_validity
davidwendt 04f7205
Merge branch 'branch-23.04' into reduction-segmented-compound
davidwendt 2d2587b
rename update-validity
davidwendt
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,5 @@ | ||
/* | ||
- * Copyright (c) 2019-2022, NVIDIA CORPORATION. | ||
+ * Copyright (c) 2019-2023, NVIDIA CORPORATION. | ||
* | ||
* Licensed under the Apache License, Version 2.0 (the "License"); | ||
* you may not use this file except in compliance with the License. | ||
|
@@ -216,7 +216,7 @@ std::unique_ptr<scalar> mean( | |
std::unique_ptr<scalar> variance( | ||
column_view const& col, | ||
data_type const output_dtype, | ||
- cudf::size_type ddof, | ||
+ size_type ddof, | ||
rmm::cuda_stream_view stream, | ||
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource()); | ||
|
||
|
@@ -239,7 +239,7 @@ std::unique_ptr<scalar> variance( | |
std::unique_ptr<scalar> standard_deviation( | ||
column_view const& col, | ||
data_type const output_dtype, | ||
- cudf::size_type ddof, | ||
+ size_type ddof, | ||
rmm::cuda_stream_view stream, | ||
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource()); | ||
|
||
|
@@ -338,171 +338,5 @@ std::unique_ptr<scalar> merge_sets( | |
rmm::cuda_stream_view stream, | ||
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource()); | ||
|
||
/** | ||
||
* @brief Compute sum of each segment in input column. | ||
* | ||
* If an input segment is empty, the segment result is null. | ||
* | ||
* @throw cudf::logic_error if input column type is not convertible to `output_dtype`. | ||
* @throw cudf::logic_error if `output_dtype` is not an arithmetic type. | ||
* | ||
* @param col Input column to compute sum | ||
* @param offsets Indices to identify segment boundaries | ||
* @param output_dtype Data type of return type and typecast elements of input column | ||
* @param null_handling If `null_policy::INCLUDE`, all elements in a segment must be valid for the | ||
* reduced value to be valid. If `null_policy::EXCLUDE`, the reduced value is valid if any element | ||
* in the segment is valid. | ||
* @param init Initial value of each sum | ||
* @param stream CUDA stream used for device memory operations and kernel launches | ||
* @param mr Device memory resource used to allocate the returned column's device memory | ||
* @return Sums of segments in type `output_dtype` | ||
*/ | ||
std::unique_ptr<column> segmented_sum( | ||
column_view const& col, | ||
device_span<size_type const> offsets, | ||
data_type const output_dtype, | ||
null_policy null_handling, | ||
std::optional<std::reference_wrapper<scalar const>> init, | ||
rmm::cuda_stream_view stream, | ||
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource()); | ||
|
||
/** | ||
* @brief Compute product of each segment in input column. | ||
* | ||
* If an input segment is empty, the segment result is null. | ||
* | ||
* @throw cudf::logic_error if input column type is not convertible to `output_dtype`. | ||
* @throw cudf::logic_error if `output_dtype` is not an arithmetic type. | ||
* | ||
* @param col Input column to compute product | ||
* @param offsets Indices to identify segment boundaries | ||
* @param output_dtype Data type of return type and typecast elements of input column | ||
* @param null_handling If `null_policy::INCLUDE`, all elements in a segment must be valid for the | ||
* reduced value to be valid. If `null_policy::EXCLUDE`, the reduced value is valid if any element | ||
* in the segment is valid. | ||
* @param init Initial value of each product | ||
* @param stream CUDA stream used for device memory operations and kernel launches | ||
* @param mr Device memory resource used to allocate the returned column's device memory | ||
* @return Products of segments in type `output_dtype` | ||
*/ | ||
std::unique_ptr<column> segmented_product( | ||
column_view const& col, | ||
device_span<size_type const> offsets, | ||
data_type const output_dtype, | ||
null_policy null_handling, | ||
std::optional<std::reference_wrapper<scalar const>> init, | ||
rmm::cuda_stream_view stream, | ||
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource()); | ||
|
||
/** | ||
* @brief Compute minimum of each segment in input column. | ||
* | ||
* If an input segment is empty, the segment result is null. | ||
* | ||
* @throw cudf::logic_error if input column type is not convertible to `output_dtype`. | ||
* | ||
* @param col Input column to compute minimum | ||
* @param offsets Indices to identify segment boundaries | ||
* @param output_dtype Data type of return type and typecast elements of input column | ||
* @param null_handling If `null_policy::INCLUDE`, all elements in a segment must be valid for the | ||
* reduced value to be valid. If `null_policy::EXCLUDE`, the reduced value is valid if any element | ||
* in the segment is valid. | ||
* @param init Initial value of each minimum | ||
* @param stream CUDA stream used for device memory operations and kernel launches | ||
* @param mr Device memory resource used to allocate the returned column's device memory | ||
* @return Minimums of segments in type `output_dtype` | ||
*/ | ||
std::unique_ptr<column> segmented_min( | ||
column_view const& col, | ||
device_span<size_type const> offsets, | ||
data_type const output_dtype, | ||
null_policy null_handling, | ||
std::optional<std::reference_wrapper<scalar const>> init, | ||
rmm::cuda_stream_view stream, | ||
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource()); | ||
|
||
/** | ||
* @brief Compute maximum of each segment in input column. | ||
* | ||
* If an input segment is empty, the segment result is null. | ||
* | ||
* @throw cudf::logic_error if input column type is not convertible to `output_dtype`. | ||
* | ||
* @param col Input column to compute maximum | ||
* @param offsets Indices to identify segment boundaries | ||
* @param output_dtype Data type of return type and typecast elements of input column | ||
* @param null_handling If `null_policy::INCLUDE`, all elements in a segment must be valid for the | ||
* reduced value to be valid. If `null_policy::EXCLUDE`, the reduced value is valid if any element | ||
* in the segment is valid. | ||
* @param init Initial value of each maximum | ||
* @param stream CUDA stream used for device memory operations and kernel launches | ||
* @param mr Device memory resource used to allocate the returned column's device memory | ||
* @return Maximums of segments in type `output_dtype` | ||
*/ | ||
std::unique_ptr<column> segmented_max( | ||
column_view const& col, | ||
device_span<size_type const> offsets, | ||
data_type const output_dtype, | ||
null_policy null_handling, | ||
std::optional<std::reference_wrapper<scalar const>> init, | ||
rmm::cuda_stream_view stream, | ||
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource()); | ||
|
||
/** | ||
* @brief Compute if any of the values in the segment are true when cast to bool. | ||
* | ||
* If an input segment is empty, the segment result is null. | ||
* | ||
* @throw cudf::logic_error if input column type is not convertible to bool. | ||
* @throw cudf::logic_error if `output_dtype` is not bool8. | ||
* | ||
* @param col Input column to compute any | ||
* @param offsets Indices to identify segment boundaries | ||
* @param output_dtype Data type of return type and typecast elements of input column | ||
* @param null_handling If `null_policy::INCLUDE`, all elements in a segment must be valid for the | ||
* reduced value to be valid. If `null_policy::EXCLUDE`, the reduced value is valid if any element | ||
* in the segment is valid. | ||
* @param init Initial value of each `any` reduction | ||
* @param stream CUDA stream used for device memory operations and kernel launches | ||
* @param mr Device memory resource used to allocate the returned column's device memory | ||
* @return Column of bool8 for the results of the segments | ||
*/ | ||
std::unique_ptr<column> segmented_any( | ||
column_view const& col, | ||
device_span<size_type const> offsets, | ||
data_type const output_dtype, | ||
null_policy null_handling, | ||
std::optional<std::reference_wrapper<scalar const>> init, | ||
rmm::cuda_stream_view stream, | ||
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource()); | ||
|
||
/** | ||
* @brief Compute if all of the values in the segment are true when cast to bool. | ||
* | ||
* If an input segment is empty, the segment result is null. | ||
* | ||
* @throw cudf::logic_error if input column type is not convertible to bool. | ||
* @throw cudf::logic_error if `output_dtype` is not bool8. | ||
* | ||
* @param col Input column to compute all | ||
* @param offsets Indices to identify segment boundaries | ||
* @param output_dtype Data type of return type and typecast elements of input column | ||
* @param null_handling If `null_policy::INCLUDE`, all elements in a segment must be valid for the | ||
* reduced value to be valid. If `null_policy::EXCLUDE`, the reduced value is valid if any element | ||
* in the segment is valid. | ||
* @param init Initial value of each `all` reduction | ||
* @param stream CUDA stream used for device memory operations and kernel launches | ||
* @param mr Device memory resource used to allocate the returned column's device memory | ||
* @return Column of bool8 for the results of the segments | ||
*/ | ||
std::unique_ptr<column> segmented_all( | ||
column_view const& col, | ||
device_span<size_type const> offsets, | ||
data_type const output_dtype, | ||
null_policy null_handling, | ||
std::optional<std::reference_wrapper<scalar const>> init, | ||
rmm::cuda_stream_view stream, | ||
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource()); | ||
|
||
} // namespace reduction | ||
} // namespace cudf |
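
The doxygen blocks in this hunk describe the same validity rule for every segmented reduction: an empty segment yields null, `null_policy::INCLUDE` requires every element of the segment to be valid, and `null_policy::EXCLUDE` requires at least one valid element. As a rough illustration only, here is a host-side reference for the sum case. It is not the cudf implementation and uses no cudf types: `NullPolicy` and `segmented_sum_reference` are hypothetical names, `std::optional` stands in for nullable rows, and the sketch ignores `output_dtype` and `init`. It also assumes that `offsets` holds one more entry than the number of segments, with segment `i` covering the half-open range `[offsets[i], offsets[i + 1])`; the diff itself only says "Indices to identify segment boundaries".

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for cudf::null_policy; not the library's type.
enum class NullPolicy { INCLUDE, EXCLUDE };

// Host-side reference for a segmented sum (illustration only).
// std::nullopt plays the role of a null row in the input and the output.
// Assumption: offsets has (number of segments + 1) entries and segment i
// covers the half-open range [offsets[i], offsets[i + 1]).
std::vector<std::optional<long long>> segmented_sum_reference(
  std::vector<std::optional<int>> const& values,
  std::vector<std::size_t> const& offsets,
  NullPolicy null_handling)
{
  std::vector<std::optional<long long>> results;
  for (std::size_t i = 0; i + 1 < offsets.size(); ++i) {
    std::size_t const begin = offsets[i];
    std::size_t const end   = offsets[i + 1];
    assert(begin <= end && end <= values.size());

    // Sum only the valid elements and count how many there were.
    long long sum           = 0;
    std::size_t valid_count = 0;
    for (std::size_t j = begin; j < end; ++j) {
      if (values[j].has_value()) {
        sum += *values[j];
        ++valid_count;
      }
    }

    // INCLUDE: valid only if the segment is non-empty and has no nulls.
    // EXCLUDE: valid if at least one element is valid.
    // In both cases an empty segment produces a null result.
    std::size_t const size = end - begin;
    bool is_valid          = false;
    if (null_handling == NullPolicy::INCLUDE) {
      is_valid = (size > 0) && (valid_count == size);
    } else {
      is_valid = valid_count > 0;
    }

    if (is_valid) {
      results.push_back(sum);
    } else {
      results.push_back(std::nullopt);
    }
  }
  return results;
}
```

For values `{1, 2, null, 4, null}` and offsets `{0, 2, 4, 4, 5}`, this sketch gives `{3, null, null, null}` under `INCLUDE` and `{3, 4, null, null}` under `EXCLUDE`: the second segment contains one null, the third is empty, and the fourth contains only a null.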
These functions were moved to segmented_reduction.cuh