Faster struct row comparator #10164

Merged
merged 58 commits into from
Mar 22, 2022
933c974
First commit
devavret Aug 26, 2021
a1636e5
testing and profiling deep single hierarchy struct
devavret Aug 27, 2021
d59f54c
Merge branch 'branch-22.02' into struct-row-comp
devavret Jan 12, 2022
765dd8d
Merge branch 'branch-22.02' into struct-row-comp
devavret Jan 12, 2022
3d21daf
Make the sandboxed test compile again
devavret Jan 14, 2022
9f32e6b
Update my row_comparator with nullate
devavret Jan 15, 2022
53d3c90
Merge branch 'branch-22.02' into struct-row-comp
devavret Jan 21, 2022
022e2a4
Basic verticalization utility and experimental namespace
devavret Jan 24, 2022
7fef643
clean up most of row operators that I didn't change.
devavret Jan 26, 2022
930d8de
Sliced column test
devavret Jan 27, 2022
0ecc4f8
column order and null precedence support
devavret Jan 28, 2022
ff36d2d
Manually managed stack
devavret Jan 28, 2022
cd0f938
New depth based method to avoid superimpose nulls
devavret Feb 2, 2022
7b8e060
Put sort2 impl in separate TU
devavret Feb 2, 2022
25eb237
Merge branch 'branch-22.04' into struct-row-comp
devavret Feb 2, 2022
d2937cf
Merge branch 'branch-22.04' into struct-row-comp
devavret Feb 10, 2022
d55c9c7
Move verticalization code to row_comparator.cpp
devavret Feb 15, 2022
3bd749e
Owning row lex operator
devavret Feb 22, 2022
613d664
merge fixes
devavret Feb 23, 2022
2ef3ac7
Move struct logic out of main row loop and into element_relational_co…
devavret Feb 24, 2022
5577431
pushing even more logic into element_relational_comparator
devavret Feb 24, 2022
f037bc0
More optimizations.
devavret Feb 24, 2022
8c54a85
review changes
devavret Feb 24, 2022
9d24a87
Checks to ensure tables can be compared
devavret Feb 24, 2022
294b0cf
Another attempt at new API
devavret Mar 2, 2022
a4c799a
Remove stack based struct comparator + cleanups
devavret Mar 7, 2022
ecb2eb0
thrust::pair -> cuda::std::pair
devavret Mar 7, 2022
34a6564
optional device spans
devavret Mar 7, 2022
fa4abb4
Prevent device comparator construction from any table_device_view
devavret Mar 7, 2022
b213210
Nullate default and fix for non nested depth
devavret Mar 7, 2022
6f9bedd
Fix an unsurfaced bug about depth passing
devavret Mar 7, 2022
be69ffa
Switch over sort impl to new comparator
devavret Mar 8, 2022
76d535a
Copyright changes to satiate ci
devavret Mar 8, 2022
78d10fc
Migrate struct sort benchmark to nvbench
devavret Mar 8, 2022
15920ee
Avoid optional::value in favor of *
devavret Mar 8, 2022
d01fc30
throw when trying to sort List
devavret Mar 8, 2022
ac2eb0d
Leftover change for struct sort nvbench
devavret Mar 8, 2022
076c4c1
struct without null pushdown test
devavret Mar 9, 2022
e8a9202
Remove temporary sort2_test
devavret Mar 9, 2022
a4b1167
Remove temporary sort2 files
devavret Mar 9, 2022
62f6914
leftover sort2 in cmake
devavret Mar 9, 2022
8f628ae
cleanup benchmark headers
devavret Mar 9, 2022
dc7d125
Docs
devavret Mar 9, 2022
fa7d940
Merge branch 'branch-22.04' into struct-row-comp
devavret Mar 10, 2022
76c883f
Apply suggestions from code review
devavret Mar 14, 2022
98b253b
rmm pool in benchmark + style fixes
devavret Mar 14, 2022
3255dc5
Merge branch 'branch-22.04' into struct-row-comp
devavret Mar 14, 2022
52e3a35
Review changes
devavret Mar 15, 2022
9470f06
More review changes
devavret Mar 15, 2022
7c897c3
Review changes req by @vyasr
devavret Mar 17, 2022
e0467c7
add a runtime is_relationally_comparable function
devavret Mar 17, 2022
fc1e993
Review changes
devavret Mar 18, 2022
096593f
Review changes
devavret Mar 18, 2022
f539647
Avoid WAR of storing a table_device_view
devavret Mar 18, 2022
01be0bc
Rename struct_linearize to decompose_structs and Improve docs
devavret Mar 18, 2022
de95530
review changes req by @ttnghia
devavret Mar 21, 2022
6c45cd4
Namespace changes and making element comparator private
devavret Mar 21, 2022
9bfd08e
Update cpp/include/cudf/table/experimental/row_operators.cuh
devavret Mar 22, 2022
55 changes: 27 additions & 28 deletions cpp/include/cudf/table/experimental/row_operators.cuh
@@ -259,6 +259,9 @@ class device_row_comparator {
}; // class device_row_comparator

struct preprocessed_table {
using table_device_view_owner =
std::invoke_result_t<decltype(table_device_view::create), table_view, rmm::cuda_stream_view>;

/**
* @brief Preprocess table for use with lexicographical comparison
*
@@ -274,22 +277,29 @@ struct preprocessed_table {
* `null_order::BEFORE` for all columns.
* @param stream The stream to launch kernels and h->d copies on while preprocessing.
*/
preprocessed_table(table_view const& table,
host_span<order const> column_order,
host_span<null_order const> null_precedence,
rmm::cuda_stream_view stream);

// TODO: Should we add a static create method that returns a shared_ptr?
static std::shared_ptr<preprocessed_table> create(table_view const& table,
host_span<order const> column_order,
host_span<null_order const> null_precedence,
rmm::cuda_stream_view stream);
Comment on lines +282 to +285
Contributor

Do we really want to force people to always construct as a shared_ptr?

I suppose if someone is going through the trouble of constructing a preprocessed_table themselves, then that means they are intending to use it in more than one comparator, in which case it will already need to be a shared_ptr.

Contributor Author

Exactly. Why not return a shared_ptr if everything that can accept it needs a shared_ptr?
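The point of this exchange can be sketched in plain C++: if a preprocessed object exists only so that several comparators can reuse it, a factory that returns a `shared_ptr` costs the caller nothing extra. The sketch below is not libcudf code. `preprocessed_keys`, `row_less`, and `row_equal` are hypothetical stand-ins for `preprocessed_table` and the comparators that co-own it, and a `std::vector<int>` stands in for the device-side data.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for preprocessed_table: built once, read-only afterwards.
class preprocessed_keys {
 public:
  // The factory is the only way to obtain an instance, so every instance is
  // already held in a shared_ptr and can be handed to any number of comparators.
  static std::shared_ptr<preprocessed_keys> create(std::vector<int> keys)
  {
    return std::shared_ptr<preprocessed_keys>(new preprocessed_keys(std::move(keys)));
  }

  int key(std::size_t row) const { return _keys[row]; }

 private:
  explicit preprocessed_keys(std::vector<int> keys) : _keys(std::move(keys)) {}

  std::vector<int> _keys;
};

// Hypothetical comparator that co-owns the preprocessed data.
class row_less {
 public:
  explicit row_less(std::shared_ptr<preprocessed_keys> pre) : _pre(std::move(pre)) {}
  bool operator()(std::size_t lhs, std::size_t rhs) const
  {
    return _pre->key(lhs) < _pre->key(rhs);
  }

 private:
  std::shared_ptr<preprocessed_keys> _pre;
};

// A second hypothetical comparator reusing the same preprocessing result.
class row_equal {
 public:
  explicit row_equal(std::shared_ptr<preprocessed_keys> pre) : _pre(std::move(pre)) {}
  bool operator()(std::size_t lhs, std::size_t rhs) const
  {
    return _pre->key(lhs) == _pre->key(rhs);
  }

 private:
  std::shared_ptr<preprocessed_keys> _pre;
};

// Usage: preprocess once, then share the result between two comparators.
inline void shared_preprocessing_example()
{
  auto pre = preprocessed_keys::create({3, 1, 3});
  row_less less{pre};
  row_equal equal{pre};
  assert(pre.use_count() == 3);  // the local handle plus one copy per comparator
  assert(less(1, 0));            // key 1 < key 3
  assert(equal(0, 2));           // both rows hold key 3
}
```

Because the constructor is private, a caller cannot accidentally create a stack instance that no comparator could co-own.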


private:
friend class self_comparator;

preprocessed_table(table_device_view_owner&& table,
rmm::device_uvector<order>&& column_order,
rmm::device_uvector<null_order>&& null_precedence,
rmm::device_uvector<size_type>&& depths)
: _t(std::move(table)),
_column_order(std::move(column_order)),
_null_precedence(std::move(null_precedence)),
_depths(std::move(depths)){};

/**
* @brief Implicit conversion operator to a `table_device_view` of the preprocessed table.
*
* @return table_device_view
*/
operator table_device_view() { return **d_t; }
operator table_device_view() { return *_t; }

/**
* @brief Get a device array containing the desired order of each column in the preprocessed table
@@ -300,8 +310,8 @@ struct preprocessed_table {
*/
[[nodiscard]] std::optional<device_span<order const>> column_order() const
{
return d_column_order.size() ? std::optional<device_span<order const>>(d_column_order)
: std::nullopt;
return _column_order.size() ? std::optional<device_span<order const>>(_column_order)
: std::nullopt;
}

/**
@@ -314,9 +324,8 @@ struct preprocessed_table {
*/
[[nodiscard]] std::optional<device_span<null_order const>> null_precedence() const
{
return d_null_precedence.size()
? std::optional<device_span<null_order const>>(d_null_precedence)
: std::nullopt;
return _null_precedence.size() ? std::optional<device_span<null_order const>>(_null_precedence)
: std::nullopt;
}

/**
@@ -329,24 +338,14 @@ struct preprocessed_table {
*/
[[nodiscard]] std::optional<device_span<int const>> depths() const
{
return d_depths.size() ? std::optional<device_span<int const>>(d_depths) : std::nullopt;
return _depths.size() ? std::optional<device_span<int const>>(_depths) : std::nullopt;
}

/**
* @brief Whether the table has any nullable column
*
*/
[[nodiscard]] bool has_nulls() const { return _has_nulls; }

private:
using table_device_view_owner =
std::invoke_result_t<decltype(table_device_view::create), table_view, rmm::cuda_stream_view>;

std::unique_ptr<table_device_view_owner> d_t;
rmm::device_uvector<order> d_column_order;
rmm::device_uvector<null_order> d_null_precedence;
rmm::device_uvector<size_type> d_depths;
bool _has_nulls;
table_device_view_owner _t;
rmm::device_uvector<order> _column_order;
rmm::device_uvector<null_order> _null_precedence;
rmm::device_uvector<size_type> _depths;
};

/**
@@ -382,7 +381,7 @@ class self_comparator {
host_span<order const> column_order = {},
host_span<null_order const> null_precedence = {},
rmm::cuda_stream_view stream = rmm::cuda_stream_default)
: d_t{std::make_shared<preprocessed_table>(t, column_order, null_precedence, stream)}
: d_t{preprocessed_table::create(t, column_order, null_precedence, stream)}
{
}

26 changes: 12 additions & 14 deletions cpp/src/table/row_operators.cu
@@ -167,26 +167,24 @@ void check_lex_compatibility(table_view const& input)

namespace lex {

preprocessed_table::preprocessed_table(table_view const& t,
host_span<order const> column_order,
host_span<null_order const> null_precedence,
rmm::cuda_stream_view stream)
: d_column_order(0, stream),
d_null_precedence(0, stream),
d_depths(0, stream),
_has_nulls(has_nested_nulls(t))
std::shared_ptr<preprocessed_table> preprocessed_table::create(
table_view const& t,
host_span<order const> column_order,
host_span<null_order const> null_precedence,
rmm::cuda_stream_view stream)
{
check_lex_compatibility(t);

auto [verticalized_lhs, new_column_order, new_null_precedence, verticalized_col_depths] =
struct_linearize(t, column_order, null_precedence);
decompose_structs(t, column_order, null_precedence);

d_t =
std::make_unique<table_device_view_owner>(table_device_view::create(verticalized_lhs, stream));
auto d_t = table_device_view::create(verticalized_lhs, stream);
auto d_column_order = detail::make_device_uvector_async(new_column_order, stream);
auto d_null_precedence = detail::make_device_uvector_async(new_null_precedence, stream);
auto d_depths = detail::make_device_uvector_async(verticalized_col_depths, stream);

d_column_order = detail::make_device_uvector_async(new_column_order, stream);
d_null_precedence = detail::make_device_uvector_async(new_null_precedence, stream);
d_depths = detail::make_device_uvector_async(verticalized_col_depths, stream);
return std::shared_ptr<preprocessed_table>(new preprocessed_table(
std::move(d_t), std::move(d_column_order), std::move(d_null_precedence), std::move(d_depths)));
Comment on lines +200 to +201
Contributor

make_shared

Contributor Author

Can't do it, for the same reason table_device_view couldn't: make_shared needs a public ctor.
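A minimal standalone sketch of the constraint described here, using hypothetical classes `widget` and `gadget` rather than anything from libcudf. `std::make_shared` constructs the object from inside the standard library, which has no access to a private constructor, so the factory falls back to `new` (at the cost of a separate control-block allocation). The second class shows the passkey idiom, a common alternative that this PR does not use: the constructor is public but requires a tag type that only the class itself can name.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Private constructor: only members (and friends) may construct a widget.
class widget {
 public:
  static std::shared_ptr<widget> create(int v)
  {
    // std::make_shared<widget>(v) would not compile here, because the actual
    // construction happens in library code that cannot see the private ctor.
    return std::shared_ptr<widget>(new widget(v));
  }

  int value() const { return _v; }

 private:
  explicit widget(int v) : _v(v) {}

  int _v;
};

// From outside the class, widget is not constructible from an int.
static_assert(!std::is_constructible_v<widget, int>);

// Alternative (an assumption for illustration, not what the PR does):
// the passkey idiom keeps construction restricted while allowing make_shared.
class gadget {
  // Private tag type; code outside gadget cannot name it.
  struct passkey {
    explicit passkey() = default;
  };

 public:
  // Public ctor, but callable only by code that can produce a passkey.
  gadget(passkey, int v) : _v(v) {}

  static std::shared_ptr<gadget> create(int v)
  {
    return std::make_shared<gadget>(passkey{}, v);  // single allocation
  }

  int value() const { return _v; }

 private:
  int _v;
};

// Usage: both factories hand back a shared_ptr to a fully constructed object.
inline void private_ctor_factory_example()
{
  auto w = widget::create(7);
  auto g = gadget::create(9);
  assert(w->value() == 7);
  assert(g->value() == 9);
}
```

Whether the extra allocation from `new` matters depends on how often the object is created; for something built once per table, as here, it is unlikely to be significant.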

}

} // namespace lex