Fast path for experimental::row::equality (#12676)
This PR adds a fast path for primitive types, similar to the one in `experimental::row::lexicographic`. Compile-time impact, measured by building from source on bare metal with `./build.sh libcudf tests benchmarks`: `16m43.607s` for the baseline vs. `17m13.987s` for this branch (about 30 seconds slower).

This PR is a part of #12593.
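
The fast path relies on turning a runtime property of the table (does it contain nested columns?) into a compile-time `bool` template parameter, so that the nested-type code is only instantiated when needed. The following is a minimal host-side sketch of that dispatch pattern, not the libcudf implementation: `column_sketch`, `table_sketch`, `has_nested_columns_sketch`, `row_equality_sketch`, and `rows_equal` are hypothetical names made up for illustration, and plain `std::vector` stands in for device columns.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a column: primitive columns use `values`,
// nested columns use `children` (one inner vector per row).
struct column_sketch {
  bool is_nested;
  std::vector<int> values;
  std::vector<std::vector<int>> children;
};
using table_sketch = std::vector<column_sketch>;

// Runtime query, playing the role of a "does this table have nested columns" check.
bool has_nested_columns_sketch(table_sketch const& t)
{
  for (auto const& c : t) {
    if (c.is_nested) return true;
  }
  return false;
}

// The bool template parameter decides at compile time whether the
// nested comparison code is instantiated at all.
template <bool has_nested_columns>
struct row_equality_sketch {
  table_sketch const& table;

  bool operator()(std::size_t i, std::size_t j) const
  {
    for (auto const& col : table) {
      if (col.is_nested) {
        if constexpr (has_nested_columns) {
          if (col.children[i] != col.children[j]) return false;
        } else {
          // Passing `false` for a table with nested columns is a caller error.
          assert(false && "nested column reached the primitive-only comparator");
        }
      } else if (col.values[i] != col.values[j]) {
        return false;
      }
    }
    return true;
  }
};

// Call-site dispatch: one runtime branch selects the instantiation.
bool rows_equal(table_sketch const& t, std::size_t i, std::size_t j)
{
  if (has_nested_columns_sketch(t)) { return row_equality_sketch<true>{t}(i, j); }
  return row_equality_sketch<false>{t}(i, j);
}
```

In this sketch the caller pays for a single branch per call to `rows_equal`, while the `false` instantiation contains no nested comparison code, which is the trade-off the PR description weighs against the extra compile time.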

Algorithms and benchmarks (those that were available are linked) affected by this change:
`experimental::row::equality::self_comparator`
- [x] [`group_nunique`](#12676 (comment))
- [x] [`group_rank_scan`](#12676 (comment))
- [x] [`rank_scan`](#12676 (comment))
- [x] `contains_table`
- [x] [`distinct`](#12676 (comment))
- [x] [`unique`](#12676 (comment))
- [x] [`rank`](#12676 (comment)) 

`experimental::row::equality::two_table_comparator`
- [x] `struct_binary_ops`
- [x] `lists/contains`
- [x] [`contains_scalar`](#12676 (comment)) (This algorithm does not need a primitive type optimization because the enclosing struct already type-dispatches based on nested vs non-nested types)
- [x] `contains_table`
- [x] `one_hot_encode`

Authors:
  - Divye Gala (https://github.com/divyegala)

Approvers:
  - Mike Wilson (https://github.com/hyperbolic2346)
  - Nghia Truong (https://github.com/ttnghia)
  - David Wendt (https://github.com/davidwendt)

URL: #12676
divyegala authored Feb 16, 2023
1 parent d787ff2 commit e4ffcbb
Showing 17 changed files with 589 additions and 244 deletions.
79 changes: 72 additions & 7 deletions cpp/include/cudf/table/experimental/row_operators.cuh
@@ -245,6 +245,16 @@ using optional_dremel_view = thrust::optional<detail::dremel_device_view const>;
* second letter in both words is the first non-equal letter, and `a < b`, thus
* `aac < abb`.
*
* @note The operator overloads in sub-class `element_comparator` are templated via the
* `type_dispatcher` to help select an overload instance for each column in a table.
* So, `cudf::is_nested<Element>` will return `true` if the table has nested-type columns,
* but it will be a runtime error if template parameter `has_nested_columns != true`.
*
* @tparam has_nested_columns compile-time optimization for primitive types.
* This template parameter is to be used by the developer by querying
* `cudf::detail::has_nested_columns(input)`. `true` compiles operator
* overloads for nested types, while `false` only compiles operator
* overloads for primitive types.
* @tparam Nullate A cudf::nullate type describing whether to check for nulls.
* @tparam PhysicalElementComparator A relational comparator functor that compares individual values
* rather than logical elements, defaults to `NaN` aware relational comparator that evaluates `NaN`
@@ -857,6 +867,16 @@ class self_comparator {
*
* `F(i,j)` returns true if and only if row `i` compares lexicographically less than row `j`.
*
* @note The operator overloads in sub-class `element_comparator` are templated via the
* `type_dispatcher` to help select an overload instance for each column in a table.
* So, `cudf::is_nested<Element>` will return `true` if the table has nested-type columns,
* but it will be a runtime error if template parameter `has_nested_columns != true`.
*
* @tparam has_nested_columns compile-time optimization for primitive types.
* This template parameter is to be used by the developer by querying
* `cudf::detail::has_nested_columns(input)`. `true` compiles operator
* overloads for nested types, while `false` only compiles operator
* overloads for primitive types.
* @tparam Nullate A cudf::nullate type describing whether to check for nulls.
* @tparam PhysicalElementComparator A relational comparator functor that compares individual
* values rather than logical elements, defaults to `NaN` aware relational comparator that
@@ -1009,6 +1029,16 @@ class two_table_comparator {
* only if row `i` of the right table compares lexicographically less than row
* `j` of the left table.
*
* @note The operator overloads in sub-class `element_comparator` are templated via the
* `type_dispatcher` to help select an overload instance for each column in a table.
* So, `cudf::is_nested<Element>` will return `true` if the table has nested-type columns,
* but it will be a runtime error if template parameter `has_nested_columns != true`.
*
* @tparam has_nested_columns compile-time optimization for primitive types.
* This template parameter is to be used by the developer by querying
* `cudf::detail::has_nested_columns(input)`. `true` compiles operator
* overloads for nested types, while `false` only compiles operator
* overloads for primitive types.
* @tparam Nullate A cudf::nullate type describing whether to check for nulls.
* @tparam PhysicalElementComparator A relational comparator functor that compares individual
* values rather than logical elements, defaults to `NaN` aware relational comparator that
@@ -1131,11 +1161,22 @@ struct nan_equal_physical_equality_comparator {
* returns false, representing unequal rows. If the rows are compared without mismatched elements,
* the rows are equal.
*
* @note The operator overloads in sub-class `element_comparator` are templated via the
* `type_dispatcher` to help select an overload instance for each column in a table.
* So, `cudf::is_nested<Element>` will return `true` if the table has nested-type columns,
* but it will be a runtime error if template parameter `has_nested_columns != true`.
*
* @tparam has_nested_columns compile-time optimization for primitive types.
* This template parameter is to be used by the developer by querying
* `cudf::detail::has_nested_columns(input)`. `true` compiles operator
* overloads for nested types, while `false` only compiles operator
* overloads for primitive types.
* @tparam Nullate A cudf::nullate type describing whether to check for nulls.
* @tparam PhysicalEqualityComparator A equality comparator functor that compares individual values
* rather than logical elements, defaults to a comparator for which `NaN == NaN`.
*/
template <typename Nullate,
template <bool has_nested_columns,
typename Nullate,
typename PhysicalEqualityComparator = nan_equal_physical_equality_comparator>
class device_row_comparator {
friend class self_comparator; ///< Allow self_comparator to access private members
@@ -1246,14 +1287,14 @@ class device_row_comparator {

template <typename Element,
CUDF_ENABLE_IF(not cudf::is_equality_comparable<Element, Element>() and
not cudf::is_nested<Element>()),
(not has_nested_columns or not cudf::is_nested<Element>())),
typename... Args>
__device__ bool operator()(Args...)
{
CUDF_UNREACHABLE("Attempted to compare elements of uncomparable types.");
}

template <typename Element, CUDF_ENABLE_IF(cudf::is_nested<Element>())>
template <typename Element, CUDF_ENABLE_IF(has_nested_columns and cudf::is_nested<Element>())>
__device__ bool operator()(size_type const lhs_element_index,
size_type const rhs_element_index) const noexcept
{
@@ -1437,6 +1478,16 @@ class self_comparator {
*
* `F(i,j)` returns true if and only if row `i` compares equal to row `j`.
*
* @note The operator overloads in sub-class `element_comparator` are templated via the
* `type_dispatcher` to help select an overload instance for each column in a table.
* So, `cudf::is_nested<Element>` will return `true` if the table has nested-type columns,
* but it will be a runtime error if template parameter `has_nested_columns != true`.
*
* @tparam has_nested_columns compile-time optimization for primitive types.
* This template parameter is to be used by the developer by querying
* `cudf::detail::has_nested_columns(input)`. `true` compiles operator
* overloads for nested types, while `false` only compiles operator
* overloads for primitive types.
* @tparam Nullate A cudf::nullate type describing whether to check for nulls.
* @tparam PhysicalEqualityComparator A equality comparator functor that compares individual
* values rather than logical elements, defaults to a comparator for which `NaN == NaN`.
@@ -1445,13 +1496,15 @@ class self_comparator {
* @param comparator Physical element equality comparison functor.
* @return A binary callable object
*/
template <typename Nullate,
template <bool has_nested_columns,
typename Nullate,
typename PhysicalEqualityComparator = nan_equal_physical_equality_comparator>
auto equal_to(Nullate nullate = {},
null_equality nulls_are_equal = null_equality::EQUAL,
PhysicalEqualityComparator comparator = {}) const noexcept
{
return device_row_comparator{nullate, *d_t, *d_t, nulls_are_equal, comparator};
return device_row_comparator<has_nested_columns, Nullate, PhysicalEqualityComparator>{
nullate, *d_t, *d_t, nulls_are_equal, comparator};
}

private:
@@ -1539,6 +1592,16 @@ class two_table_comparator {
* Similarly, `F(rhs_index_type i, lhs_index_type j)` returns true if and only if row `i` of the
* right table compares equal to row `j` of the left table.
*
* @note The operator overloads in sub-class `element_comparator` are templated via the
* `type_dispatcher` to help select an overload instance for each column in a table.
* So, `cudf::is_nested<Element>` will return `true` if the table has nested-type columns,
* but it will be a runtime error if template parameter `has_nested_columns != true`.
*
* @tparam has_nested_columns compile-time optimization for primitive types.
* This template parameter is to be used by the developer by querying
* `cudf::detail::has_nested_columns(input)`. `true` compiles operator
* overloads for nested types, while `false` only compiles operator
* overloads for primitive types.
* @tparam Nullate A cudf::nullate type describing whether to check for nulls.
* @tparam PhysicalEqualityComparator A equality comparator functor that compares individual
* values rather than logical elements, defaults to a `NaN == NaN` equality comparator.
@@ -1547,14 +1610,16 @@ class two_table_comparator {
* @param comparator Physical element equality comparison functor.
* @return A binary callable object
*/
template <typename Nullate,
template <bool has_nested_columns,
typename Nullate,
typename PhysicalEqualityComparator = nan_equal_physical_equality_comparator>
auto equal_to(Nullate nullate = {},
null_equality nulls_are_equal = null_equality::EQUAL,
PhysicalEqualityComparator comparator = {}) const noexcept
{
return strong_index_comparator_adapter{
device_row_comparator(nullate, *d_left_table, *d_right_table, nulls_are_equal, comparator)};
device_row_comparator<has_nested_columns, Nullate, PhysicalEqualityComparator>(
nullate, *d_left_table, *d_right_table, nulls_are_equal, comparator)};
}

private:
77 changes: 59 additions & 18 deletions cpp/src/binaryop/compiled/struct_binary_ops.cuh
@@ -1,5 +1,5 @@
/*
* Copyright (c) 2022, NVIDIA CORPORATION.
* Copyright (c) 2022-2023, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -106,6 +106,36 @@ void apply_struct_binary_op(mutable_column_view& out,
}
}

template <typename OptionalIteratorType, typename DeviceComparatorType>
struct struct_equality_functor {
struct_equality_functor(OptionalIteratorType optional_iter,
DeviceComparatorType device_comparator,
bool is_lhs_scalar,
bool is_rhs_scalar,
bool preserve_output)
: _optional_iter(optional_iter),
_device_comparator(device_comparator),
_is_lhs_scalar(is_lhs_scalar),
_is_rhs_scalar(is_rhs_scalar),
_preserve_output(preserve_output)
{
}

auto __device__ operator()(size_type i) const noexcept
{
auto const lhs = cudf::experimental::row::lhs_index_type{_is_lhs_scalar ? 0 : i};
auto const rhs = cudf::experimental::row::rhs_index_type{_is_rhs_scalar ? 0 : i};
return _optional_iter[i].has_value() and (_device_comparator(lhs, rhs) == _preserve_output);
}

private:
OptionalIteratorType _optional_iter;
DeviceComparatorType _device_comparator;
bool _is_lhs_scalar;
bool _is_rhs_scalar;
bool _preserve_output;
};

template <typename PhysicalEqualityComparator =
cudf::experimental::row::equality::physical_equality_comparator>
void apply_struct_equality_op(mutable_column_view& out,
@@ -125,26 +155,37 @@ void apply_struct_equality_op(mutable_column_view& out,
auto trhs = table_view{{rhs}};
auto table_comparator =
cudf::experimental::row::equality::two_table_comparator{tlhs, trhs, stream};
auto device_comparator =
table_comparator.equal_to(nullate::DYNAMIC{has_nested_nulls(tlhs) || has_nested_nulls(trhs)},
null_equality::EQUAL,
comparator);

auto outd = column_device_view::create(out, stream);
auto optional_iter =
cudf::detail::make_optional_iterator<bool>(*outd, nullate::DYNAMIC{out.has_nulls()});
thrust::tabulate(rmm::exec_policy(stream),
out.begin<bool>(),
out.end<bool>(),
[optional_iter,
is_lhs_scalar,
is_rhs_scalar,
preserve_output = (op != binary_operator::NOT_EQUAL),
device_comparator] __device__(size_type i) {
auto lhs = cudf::experimental::row::lhs_index_type{is_lhs_scalar ? 0 : i};
auto rhs = cudf::experimental::row::rhs_index_type{is_rhs_scalar ? 0 : i};
return optional_iter[i].has_value() and
(device_comparator(lhs, rhs) == preserve_output);
});

auto const comparator_helper = [&](auto const device_comparator) {
thrust::tabulate(rmm::exec_policy(stream),
out.begin<bool>(),
out.end<bool>(),
struct_equality_functor<decltype(optional_iter), decltype(device_comparator)>(
optional_iter,
device_comparator,
is_lhs_scalar,
is_rhs_scalar,
op != binary_operator::NOT_EQUAL));
};

if (cudf::detail::has_nested_columns(tlhs) or cudf::detail::has_nested_columns(trhs)) {
auto device_comparator = table_comparator.equal_to<true>(
nullate::DYNAMIC{has_nested_nulls(tlhs) || has_nested_nulls(trhs)},
null_equality::EQUAL,
comparator);

comparator_helper(device_comparator);
} else {
auto device_comparator = table_comparator.equal_to<false>(
nullate::DYNAMIC{has_nested_nulls(tlhs) || has_nested_nulls(trhs)},
null_equality::EQUAL,
comparator);

comparator_helper(device_comparator);
}
}
} // namespace cudf::binops::compiled::detail
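
The per-row logic of `struct_equality_functor` is the same as that of the lambda it replaces: broadcast a scalar operand by pinning its row index to 0, require the output row to be valid, and flip the comparison result for `NOT_EQUAL`. A small host-side model of that logic, under stated assumptions, might look as follows. All names here (`equality_functor_sketch`, `int_rows_equal`, `compare_columns`) are hypothetical, a `std::vector<bool>` validity mask stands in for the optional iterator, and a plain loop stands in for `thrust::tabulate`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side model of the functor: same members, same per-row expression.
template <typename Comparator>
struct equality_functor_sketch {
  std::vector<bool> valid;  // stands in for the optional iterator's has_value()
  Comparator compare;       // stands in for the device row comparator
  bool is_lhs_scalar;
  bool is_rhs_scalar;
  bool preserve_output;     // true for EQUAL, false for NOT_EQUAL

  bool operator()(std::size_t i) const
  {
    // A scalar operand has a single row, so every output row reads index 0.
    std::size_t const lhs = is_lhs_scalar ? 0 : i;
    std::size_t const rhs = is_rhs_scalar ? 0 : i;
    return valid[i] && (compare(lhs, rhs) == preserve_output);
  }
};

// Hypothetical comparator over two integer "tables" with one column each.
struct int_rows_equal {
  std::vector<int> lhs;
  std::vector<int> rhs;
  bool operator()(std::size_t l, std::size_t r) const { return lhs[l] == rhs[r]; }
};

// Fills the output row by row, as a tabulate over [0, n) would.
std::vector<bool> compare_columns(std::vector<int> const& lhs,
                                  std::vector<int> const& rhs,
                                  std::vector<bool> const& valid,
                                  bool is_lhs_scalar,
                                  bool is_rhs_scalar,
                                  bool not_equal)
{
  std::size_t const n = is_lhs_scalar ? rhs.size() : lhs.size();
  assert(valid.size() == n);
  equality_functor_sketch<int_rows_equal> f{
    valid, int_rows_equal{lhs, rhs}, is_lhs_scalar, is_rhs_scalar, !not_equal};
  std::vector<bool> out(n);
  for (std::size_t i = 0; i < n; ++i) { out[i] = f(i); }
  return out;
}
```

Note that an invalid output row yields `false` for both `EQUAL` and `NOT_EQUAL` in this model, because the validity check is applied before the comparison result is considered.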