-
Notifications
You must be signed in to change notification settings - Fork 915
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reimplement cudf::merge
for nested types without using comparators
#14250
Conversation
|
/ok to test |
cudf::merge
without using comparatorscudf::merge
for nested types without using comparators
/ok to test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great. I have a couple early suggestions on the draft.
/ok to test |
/ok to test |
/ok to test |
/ok to test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome work. I have a few comments.
Co-authored-by: Bradley Dice <[email protected]>
/ok to test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving. I had one comment about whether a comment is really a TODO or not, but otherwise LGTM.
/ok to test |
1 similar comment
/ok to test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking very good. I have a couple of trivial nitpicks.
I'm not done. I've just hit the meat of the change. I'll resume the review shortly.
* `null_order::BEFORE` for all columns. | ||
* @param comparator Physical element relational comparison functor. | ||
*/ | ||
template <bool nested_disable = not has_nested_columns, CUDF_ENABLE_IF(nested_disable)> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting phrasing. I'll remember this for when I need it.
Is nested_disable
used elsewhere?
If not, would CUDF_ENABLE_IF(not has_nested_columns)
be less readable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can change the variable name if you'd like, but I cannot use the class-template has_nested_columns
directly here.
Let me explain why, and my apologies if you already knew this but I misunderstood what you are suggesting.
If I did:
template <CUDF_ENABLE_IF(not has_nested_columns)>
then we'd not be able to trigger SFINAE. This is because has_nested_columns
is a class-template. So, when you write device_comparator<true> d_comp
, the substitution for the enable-if on the constructor will trigger a failure as enable_if::type
will be ill-formed.
As a result of which, we want to defer the SFINAE at the member function level so that it behaves the way it is intended to: as an overload.
{ | ||
size_type const left_size = left_table.num_rows(); | ||
size_type const right_size = right_table.num_rows(); | ||
size_type const total_size = left_size + right_size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another impertinent question: Do we need to protect against the case where the total_size
overflows?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. A couple of minor questions.
Edit: Thanks, that was an interesting read. I've learnt a thing or two from this change.
/ok to test |
/merge |
CUDF_EXPECTS(std::accumulate(tables_to_merge.cbegin(), | ||
tables_to_merge.cend(), | ||
cudf::size_type{0}, | ||
[](auto const& running_sum, auto const& tbl) { | ||
return running_sum + tbl.num_rows(); | ||
}) <= std::numeric_limits<cudf::size_type>::max(), | ||
"Total number of merged rows exceeds row limit"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@divyegala (cc: @mythrocks) I don't think this actually protects against overflow. By the definition of overflow (if it were even defined behavior for signed types, which it's not...), the sum will always be less than the max limit of size_type
when it rolls over. You need to use a larger type and accumulate in std::size_t
precision. Can you file a follow-up PR with this change?
CUDF_EXPECTS(std::accumulate(tables_to_merge.cbegin(), | |
tables_to_merge.cend(), | |
cudf::size_type{0}, | |
[](auto const& running_sum, auto const& tbl) { | |
return running_sum + tbl.num_rows(); | |
}) <= std::numeric_limits<cudf::size_type>::max(), | |
"Total number of merged rows exceeds row limit"); | |
CUDF_EXPECTS(std::accumulate(tables_to_merge.cbegin(), | |
tables_to_merge.cend(), | |
size_t{0}, | |
[](auto const& running_sum, auto const& tbl) { | |
return running_sum + static_cast<size_t>(tbl.num_rows()); | |
}) <= static_cast<size_t>(std::numeric_limits<cudf::size_type>::max()), | |
"Total number of merged rows exceeds row limit"); |
Examples of this pattern:
cudf/cpp/src/strings/copying/concatenate.cu
Line 221 in 2a923df
CUDF_EXPECTS(total_bytes <= static_cast<std::size_t>(std::numeric_limits<size_type>::max()), cudf/cpp/src/copying/scatter.cu
Line 324 in 2a923df
CUDF_EXPECTS(scatter_map.size() <= static_cast<size_t>(std::numeric_limits<size_type>::max()), cudf/cpp/src/scalar/scalar.cpp
Line 70 in 2a923df
string.size() <= static_cast<std::size_t>(std::numeric_limits<cudf::size_type>::max()),
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing that out @bdice! I'll raise a PR
#14250 added a check to ensure `cudf::merge` throws when the total number of merged rows exceed `cudf::size_type` limit, however @bdice pointed out that the check was not correct because the accumulation was still occurring in `cudf::size_type`. This PR computes the accumulation in `std::size_t`. Authors: - Divye Gala (https://github.com/divyegala) Approvers: - Bradley Dice (https://github.com/bdice) - MithunR (https://github.com/mythrocks) - Nghia Truong (https://github.com/ttnghia) URL: #14345
Description
Part of #11844
This PR also uses new experimental comparators for non-nested types by introducing a new device constructor for
cudf::experimental::row::lexicographic::device_row_comparator
. In the case of non-nested types, preprocessing can be skipped so comparators can be created on the fly. This solution helps us avoid creating 3 comparator types becausethrust::merge
can call the operator with indices from either side of the table.Furthermore, the PR reworks
cudf/detail/merge.cuh
by removing any CUDA headers/components to expose a true detail API of the formcudf/detail/merge.hpp
.Benchmark comparison for non-nested types
Compilation time increases from ~6 mins to ~7 mins.
Checklist