Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configurable NaN handling in device_row_comparators #10870

Merged
merged 37 commits into from
Jun 3, 2022

Conversation

rwlee
Copy link
Contributor

@rwlee rwlee commented May 17, 2022

Further splitting up #9452 -- split off at the suggestion of @bdice

Related to #10781 and #4760 -- issues and discussions related to NaN comparison behavior.

@rwlee rwlee added 3 - Ready for Review Ready for review by team libcudf Affects libcudf (C++/CUDA) code. improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels May 17, 2022
@rwlee rwlee requested a review from a team as a code owner May 17, 2022 00:17
@rwlee rwlee requested review from PointKernel, jrhemstad and bdice May 17, 2022 00:17
@codecov
Copy link

codecov bot commented May 17, 2022

Codecov Report

❗ No coverage uploaded for pull request base (branch-22.08@f5aa39d). Click here to learn what that means.
The diff coverage is n/a.

❗ Current head e5d6197 differs from pull request most recent head 6da4a4c. Consider uploading reports for the commit 6da4a4c to get more accurate results

@@               Coverage Diff               @@
##             branch-22.08   #10870   +/-   ##
===============================================
  Coverage                ?   86.33%           
===============================================
  Files                   ?      144           
  Lines                   ?    22706           
  Branches                ?        0           
===============================================
  Hits                    ?    19603           
  Misses                  ?     3103           
  Partials                ?        0           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f5aa39d...6da4a4c. Read the comment docs.

@rwlee rwlee requested review from a team as code owners May 31, 2022 23:45
Copy link
Contributor

@jrhemstad jrhemstad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving, but I'd like to see the rename to equal_to as well. Very nice work. This looks very clean.

@rwlee
Copy link
Contributor Author

rwlee commented Jun 3, 2022

Approving, but I'd like to see the rename to equal_to as well. Very nice work. This looks very clean.

Added the rename to equal_to. Not sure why java code owners was requested. Are we good to merge?

@ttnghia ttnghia removed the request for review from a team June 3, 2022 04:52
@ttnghia
Copy link
Contributor

ttnghia commented Jun 3, 2022

Added the rename to equal_to. Not sure why java code owners was requested. Are we good to merge?

Maybe you had java code modified before (?) 😃

@bdice
Copy link
Contributor

bdice commented Jun 3, 2022

Are we good to merge?

I’m taking a final pass of review first thing this morning and will merge after.

@bdice bdice self-requested a review June 3, 2022 11:57
Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work. I have a final pass of comments attached. I'm going to work on applying my own suggestions so that CI can run and pass. At some later point I'd like to overhaul this file to make naming conventions match. We have mismatched names across classes like lhs vs. _lhs for members and check_nulls vs. nullate for Nullate types. There's also some inconsistency about whether check_nulls goes before or after the lhs, rhs parameters.

cudf::data_type(cudf::type_id::INT8), input_table_1.num_rows(), cudf::mask_state::UNALLOCATED);
fixed_width_column_wrapper<int8_t> expected{{1, 1, 0, 1}};

row_comparison(input_table_1,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests would be a lot cleaner if the row_comparison returned a std::unique_ptr<column> (like most cudf algorithms) of bool dtype instead of accepting/mutating a mutable view. Move the make_numeric_column logic into the algorithm implementation rather than creating the object in each test.

Result would look like:

auto const got = row_comparison(input_table_1, input_table_2, column_order, lexicographic::physical_element_comparator{});

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a note, most of the functions I can think of that accept/mutate mutable views are doing so because they're a direct wrapper around a Thrust or CUB call that takes raw iterators, and they usually live in a detail namespace. User-facing algorithms generally allocate and return owning objects like std::unique_ptr<column>.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I overhauled this file in 3a7f118. I need to make sure it compiles and will push again soon.

cpp/tests/table/experimental_row_operator_tests.cu Outdated Show resolved Hide resolved
cpp/tests/table/experimental_row_operator_tests.cu Outdated Show resolved Hide resolved
cpp/tests/table/experimental_row_operator_tests.cu Outdated Show resolved Hide resolved
cpp/include/cudf/table/experimental/row_operators.cuh Outdated Show resolved Hide resolved
cpp/include/cudf/table/experimental/row_operators.cuh Outdated Show resolved Hide resolved
cpp/include/cudf/table/experimental/row_operators.cuh Outdated Show resolved Hide resolved
cpp/include/cudf/table/experimental/row_operators.cuh Outdated Show resolved Hide resolved
cpp/include/cudf/table/experimental/row_operators.cuh Outdated Show resolved Hide resolved
cpp/include/cudf/table/experimental/row_operators.cuh Outdated Show resolved Hide resolved
@rwlee
Copy link
Contributor Author

rwlee commented Jun 3, 2022

@gpucibot merge

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team CMake CMake build issue improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants