Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support min and max operations for structs in rolling window #10332

Merged
merged 29 commits into from
Mar 7, 2022

Conversation

ttnghia
Copy link
Contributor

@ttnghia ttnghia commented Feb 18, 2022

This PR adds support for min and max operations in rolling window for STRUCT type. It also does some minor modifications to the existing code, such as renaming some variables and refining some comments.

Partially addresses #8974.

@ttnghia ttnghia added feature request New feature or request 2 - In Progress Currently a work in progress libcudf Affects libcudf (C++/CUDA) code. Spark Functionality that helps Spark RAPIDS non-breaking Non-breaking change labels Feb 18, 2022
@ttnghia ttnghia self-assigned this Feb 18, 2022
@codecov

This comment was marked as off-topic.

@ttnghia ttnghia added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currently a work in progress labels Feb 25, 2022
@ttnghia ttnghia marked this pull request as ready for review February 25, 2022 18:53
@ttnghia ttnghia requested a review from a team as a code owner February 25, 2022 18:53
@@ -166,31 +178,62 @@ struct DeviceRollingArgMinMax {
size_type end_index,
size_type current_index)
{
using AggOp = typename corresponding_operator<op>::type;
AggOp agg_op;
if constexpr (std::is_same_v<InputType, cudf::string_view>) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The operator() function implementation for struct_view and string_view appear to be completely different as shown by this if constexpr expression. Also, the string_view path does not require the member variable agg_op. Since it's operator() and member variables do not align, perhaps they should not be defined in the same functor? Or maybe some other refactoring is possible?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to share as much common code as possible for the two types, so we can reduce maintenance effort afterward. Yes, their operator() interface seem to be different. I'm totally fine with refactoring them to have different functors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: I have separated the previous functor into 2 separate ones for string and struct.

[&input](size_type idx) { return input.is_valid_nocheck(idx); })
: end_index - start_index;

// Set -1 will help identify null elements while gathering for Min and Max.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the argmin/argmax ops define a sentinel to avoid needing to use a magic number here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, so it may be neglected in the original code. I just fixed both (old and new).

@ttnghia ttnghia requested review from jrhemstad and davidwendt March 7, 2022 20:13
@ttnghia
Copy link
Contributor Author

ttnghia commented Mar 7, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit a584cdc into rapidsai:branch-22.04 Mar 7, 2022
@ttnghia ttnghia deleted the structs_window_ops branch March 7, 2022 21:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team feature request New feature or request libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change Spark Functionality that helps Spark RAPIDS
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants