Refactor lists::contains #11019

Merged on Jun 23, 2022 (218 commits)
64e4160
Rename constant
ttnghia Mar 29, 2022
b252c1e
Remove template parameter
ttnghia Mar 29, 2022
06bc976
Remove copydoc in implementation
ttnghia Mar 29, 2022
c409fc5
Add template keyword to avoid compiler warning
ttnghia Mar 30, 2022
580afdf
Remove constructor
ttnghia Mar 30, 2022
7408d9a
Cleanup
ttnghia Mar 30, 2022
7432b5b
Change out_validity from a column to device_uvector
ttnghia Mar 30, 2022
c94b32b
Add const
ttnghia Mar 30, 2022
c6a4218
Remove cudf::detail:: namespace from lists_column_device_view
ttnghia Mar 30, 2022
6c86384
Fix doxygen
ttnghia Mar 30, 2022
ff0a761
Add empty struct support function
ttnghia Mar 30, 2022
d439fe1
Remove cudf:: prefix
ttnghia Mar 30, 2022
5d64986
Refactor output iters
ttnghia Mar 30, 2022
8babe05
Unify functions
ttnghia Mar 30, 2022
1a87fdc
Fix tests
ttnghia Mar 30, 2022
bb45455
Remove variable
ttnghia Mar 30, 2022
47ca5c9
Update doxygen
ttnghia Apr 7, 2022
8b031de
Implement `search_list` functors
ttnghia Apr 8, 2022
8058785
Rewrite dox and change functor name
ttnghia Apr 8, 2022
24aa6a9
Implement `search_functor` specialized for non-struct types
ttnghia Apr 8, 2022
758a3e8
MISC
ttnghia Apr 8, 2022
221cd1c
Implementing `dispatch_index_of`
ttnghia Apr 8, 2022
c81b544
Move `is_struct_type` into `struct_view.hpp`
ttnghia Apr 11, 2022
64b0d63
Fix lambda
ttnghia Apr 11, 2022
98c5a57
Complete a (non-working) implementation for non-struct types
ttnghia Apr 11, 2022
d103876
Fix compile errors
ttnghia Apr 11, 2022
2afad64
Rename function
ttnghia Apr 11, 2022
e2fdccc
Use `scalar_value_accessor` to retrieve scalar key
ttnghia Apr 11, 2022
4f15201
Fix iterators
ttnghia Apr 11, 2022
98f3471
WIP
ttnghia Apr 12, 2022
bf0e148
Fix scalar access
ttnghia Apr 12, 2022
b5772ee
Add overload for `operator()` that can return a pair iterator for fix…
ttnghia Apr 12, 2022
07ed05e
Remove an overload of `search_all_lists`
ttnghia Apr 12, 2022
247d3e7
Compilable
ttnghia Apr 12, 2022
b11149c
Fix has_nulls
ttnghia Apr 12, 2022
399f380
Add overload into optional iterator to support fixed point types
ttnghia Apr 13, 2022
58d3bd6
Use optional iterator
ttnghia Apr 13, 2022
6c65cb2
Misc
ttnghia Apr 13, 2022
47dd0c7
Merge branch 'branch-22.06' into lists_contains_for_structs
ttnghia Apr 13, 2022
6e91f82
Fix compile issues
ttnghia Apr 13, 2022
35ad1bb
Fix tests
ttnghia Apr 13, 2022
63f0b15
Fix all errors
ttnghia Apr 13, 2022
f2e59c9
Rename variable
ttnghia Apr 13, 2022
b32bc8a
Fix offset for sliced child
ttnghia Apr 13, 2022
7aaee96
Implement `search_functor` for struct
ttnghia Apr 13, 2022
317a5ea
Add back `detail::` namespace to `lists_column_device_view`
ttnghia Apr 13, 2022
ba87400
Move back `lists_column_device_view` into `detail::` namespace
ttnghia Apr 13, 2022
36d41a9
Implement structs for search_all_lists
ttnghia Apr 14, 2022
2c460e0
Fix null handling
ttnghia Apr 14, 2022
fb79a7b
Misc
ttnghia Apr 14, 2022
e1ca105
Rename variables
ttnghia Apr 15, 2022
0e28f48
Add `EmptyInputTest`
ttnghia Apr 15, 2022
e6a49fd
Add a simple contains test without null
ttnghia Apr 15, 2022
d0bd6c0
Add `ScalarKeyWithNullLists` test
ttnghia Apr 15, 2022
b71237c
Add `SlicedListsColumn` test
ttnghia Apr 15, 2022
fc9bddd
Add `ScalarKeyNoNullListsWithNullStructs` test
ttnghia Apr 15, 2022
e5c1900
Fix has null condition
ttnghia Apr 15, 2022
88ce647
Add `EmptyInputTest` for key column
ttnghia Apr 15, 2022
fbe0e24
Add comment and fix validity iterator
ttnghia Apr 15, 2022
fa964ee
Add a template parameter to `make_validity_iterator` for scalar
ttnghia Apr 15, 2022
af79f5f
Fix copyright year
ttnghia Apr 15, 2022
a4b3f18
Fix index
ttnghia Apr 16, 2022
1f70fe4
Add more tests
ttnghia Apr 16, 2022
3d582b5
Add `ColumnKeyWithSlicedListsHavingNulls`
ttnghia Apr 16, 2022
7739539
Fix scalar and has_any_null
ttnghia Apr 16, 2022
9d595ca
Enable all tests
ttnghia Apr 16, 2022
79d0d4d
Fix empty lines
ttnghia Apr 18, 2022
d36c688
Add `table_comparator`
ttnghia Apr 18, 2022
0259f44
Adding code to use the new table_comparator
ttnghia Apr 18, 2022
a037528
Cleanup
ttnghia Apr 18, 2022
5494c1a
Enable table_comparator
ttnghia Apr 18, 2022
f8a6720
Fix index
ttnghia Apr 18, 2022
b7ae7be
Cleanup
ttnghia Apr 18, 2022
d1c646e
Use validity iterators
ttnghia Apr 19, 2022
43d2be5
Add a function to return row index of the current list
ttnghia Apr 20, 2022
0a7c5a0
Rewrite much of the code
ttnghia Apr 20, 2022
ae70e54
Cleanup headers
ttnghia Apr 20, 2022
2dbceaf
Fix the tests with throw message
ttnghia Apr 20, 2022
9c0f22e
Complete cleaning up
ttnghia Apr 20, 2022
a1408fc
Remove added code
ttnghia Apr 20, 2022
67bd5a8
Unify functions
ttnghia Apr 20, 2022
bd555e5
Merge branch 'branch-22.06' into lists_contains_for_structs
ttnghia Apr 20, 2022
609fac3
Reverse changes in `struct_view.hpp`
ttnghia Apr 28, 2022
9b9fe84
Enable nested types instead of just struct type
ttnghia Apr 28, 2022
14be4ae
Reverse change in `struct_view.hpp`
ttnghia Apr 28, 2022
3126c67
Rename variable
ttnghia Apr 28, 2022
34dc70d
Change comments
ttnghia Apr 28, 2022
e4db5ef
Fix error
ttnghia Apr 29, 2022
60c3d93
Merge branch 'branch-22.06' into lists_contains_for_structs
ttnghia Apr 29, 2022
b0d6b4e
Add a simple list test
ttnghia Apr 29, 2022
497acc8
Fix comment
ttnghia Apr 29, 2022
5d5d6db
Add list tests
ttnghia May 1, 2022
6869fb1
Cleanup
ttnghia May 2, 2022
198a156
Merge branch 'branch-22.06' into lists_contains_for_structs
ttnghia May 2, 2022
50b8891
Add strong index type.
bdice Apr 16, 2022
b9ed4d7
Revert changes to non-experimental row operators.
bdice Apr 20, 2022
d67f17e
Use enum for strongly typed index.
bdice May 3, 2022
464ed2b
Add two table comparator and adapter.
bdice May 3, 2022
b26b318
Add friends. :)
bdice May 3, 2022
1fd199d
Apply two-table comparator to search algorithms.
bdice May 3, 2022
18bd9f0
Move shared lhs/rhs logic into launch_search.
bdice May 3, 2022
b5b8b39
Improve comments, remove old code.
bdice May 3, 2022
4060b4f
Merge remote-tracking branch 'upstream/branch-22.06' into strong-inde…
bdice May 11, 2022
73c4b27
Move strong typing code into cudf::experimental::row::lexicographic.
bdice May 11, 2022
9cdbe27
Merge remote-tracking branch 'upstream/branch-22.06' into strong-inde…
bdice May 13, 2022
c8a38fe
Improve comment.
bdice May 13, 2022
8b5ef34
Fix docstrings.
bdice May 13, 2022
77f85b4
Enable weak ordering machinery (weak_ordering_comparator_impl) to wra…
bdice May 13, 2022
529e944
Remove template template parameters.
bdice May 13, 2022
fb0e192
Use references.
bdice May 13, 2022
56d99ba
Use Ts const...
bdice May 13, 2022
c5998b7
Move strong typing to cudf::experimental::row.
bdice May 13, 2022
b78d978
Use constexpr.
bdice May 13, 2022
3aea8d4
Use custom iterator class.
bdice May 14, 2022
bbaf360
Use __device__ only.
bdice May 14, 2022
4a1d7aa
Add comment.
bdice May 14, 2022
09c5661
Use symmetry of comparator (now possible with weak ordering) to avoid…
bdice May 14, 2022
290323f
Add constexpr to two_table_device_row_comparator_adapter.
bdice May 14, 2022
4c69edd
Remove forward (always accepts lvalues).
bdice May 14, 2022
ea8c223
Merge branch 'strong-index-type' into strong_typed_index
ttnghia May 16, 2022
857f570
Implement strong typed index for equality comparator
ttnghia May 16, 2022
d9f63f0
Adopt new strong typed index
ttnghia May 16, 2022
12f7a8b
Remove header
ttnghia May 16, 2022
fbd5b90
Indicate reversed signature.
bdice May 16, 2022
3db6484
Move constructor to implementation, add shape compatibility check.
bdice May 16, 2022
3e81b53
Improve docstrings.
bdice May 16, 2022
1834095
Use thrust::iterator_facade.
bdice May 16, 2022
9cb656b
Merge branch 'branch-22.06' into strong_typed_index
ttnghia May 16, 2022
c766bf3
Merge branch 'strong-index-type' into strong_typed_index
ttnghia May 16, 2022
a311bcc
Add type check
ttnghia May 16, 2022
b935835
Change parameter side for comparator
ttnghia May 16, 2022
ff26024
Use const for struct members.
bdice May 17, 2022
f779bff
Slim down the strong index layer by using a templated struct.
bdice May 17, 2022
157abbc
Simplify construction.
bdice May 17, 2022
a2ac19d
Use size_type const where possible.
bdice May 17, 2022
75249e8
Require weakly or strongly typed values for lhs_index and rhs_index.
bdice May 17, 2022
f50faf5
Merge branch 'strong-index-type' into strong_typed_index
ttnghia May 17, 2022
bed1162
Unconstrain template typenames.
bdice May 18, 2022
6930952
Merge branch 'strong-index-type' into strong_typed_index
ttnghia May 18, 2022
2781af1
Remove type check
ttnghia May 18, 2022
8b239d4
Update adapter comparator
ttnghia May 18, 2022
17bd96c
Merge branch 'branch-22.06' into strong_typed_index
ttnghia May 18, 2022
ae77f68
Merge branch 'branch-22.06' into strong_typed_index
ttnghia May 18, 2022
d6b5eb9
Remove deprecated code
ttnghia May 18, 2022
5f39a28
Fix comments
ttnghia May 18, 2022
bf3555c
Fix header check
ttnghia May 18, 2022
df98698
Fix format
ttnghia May 18, 2022
b67d070
Rename variables and fix header
ttnghia May 18, 2022
4429cc4
Merge branch 'branch-22.06' into strong_typed_index
ttnghia May 18, 2022
893db8a
Address review comments
ttnghia May 18, 2022
7e69d3b
Merge branch 'branch-22.06' into strong_typed_index
ttnghia May 18, 2022
c51b053
Fix renaming issue
ttnghia May 18, 2022
934ee73
Switch operands
ttnghia May 18, 2022
cf996f0
Update cpp/src/structs/search/contains.cu
ttnghia May 18, 2022
7747f8d
Merge branch 'strong_typed_index' into lists_contains_for_structs
ttnghia May 18, 2022
6f20dd6
Adopt strong index types
ttnghia May 18, 2022
ecd6742
Add type check
ttnghia May 18, 2022
14194d4
Merge branch 'branch-22.06' into lists_contains_for_structs
ttnghia May 19, 2022
b0558bf
Fix comment
ttnghia May 19, 2022
795506f
Rewrite functions
ttnghia May 19, 2022
261e98c
Add `nested_type_scalar_to_column_view`
ttnghia May 19, 2022
4f7ec7d
Merge branch 'column_utility' into lists_contains_for_structs
ttnghia May 19, 2022
49661b8
Adopt `nested_type_scalar_to_column_view`
ttnghia May 19, 2022
94dee44
Remove header
ttnghia May 19, 2022
b69b1bb
Add comments, and rewrite utility functions
ttnghia May 19, 2022
5f98d5b
Rewrite comments and refactor
ttnghia May 19, 2022
9cc4ade
Merge branch 'branch-22.06' into lists_contains_for_structs
ttnghia May 19, 2022
91447cf
Rename .cuh into .hpp
ttnghia May 19, 2022
ead10f2
Merge branch 'column_utility' into lists_contains_for_structs
ttnghia May 19, 2022
58a231c
Change header extension
ttnghia May 19, 2022
29613bc
Move implementation to cpp file
ttnghia May 19, 2022
99f6036
Merge branch 'column_utility' into lists_contains_for_structs
ttnghia May 19, 2022
a7f9dff
Simplify code
ttnghia May 20, 2022
dd0f226
Materialize a new column from scalar, not `column_view`
ttnghia May 23, 2022
dd1f2ee
Merge branch 'branch-22.08' into lists_contains_for_structs
ttnghia May 23, 2022
758b781
Merge branch 'branch-22.08' into lists_contains_for_structs
ttnghia May 26, 2022
13b57b0
Merge branch 'branch-22.08' into lists_contains_for_structs
ttnghia Jun 1, 2022
f45eb15
Reverse `contains_tests.cpp`
ttnghia Jun 1, 2022
2d6adb0
Remove nested type support
ttnghia Jun 1, 2022
f8d70bf
Add back headers
ttnghia Jun 1, 2022
84617fc
Add back type check
ttnghia Jun 1, 2022
18934d9
Remove variable
ttnghia Jun 1, 2022
ad32b04
Rewrite comments
ttnghia Jun 1, 2022
6b238be
Add doxygen
ttnghia Jun 1, 2022
d27f0ce
Remove wrong docs
ttnghia Jun 1, 2022
7cd394d
Add comments
ttnghia Jun 1, 2022
5f3f120
Remove not-yet-used function
ttnghia Jun 1, 2022
a3226d9
Change comments
ttnghia Jun 1, 2022
c27a4b9
Fix doxygen
ttnghia Jun 1, 2022
2656ca6
Fix comments and use `CUDF_ENABLE_IF`
ttnghia Jun 1, 2022
e40d511
Merge branch 'branch-22.08' into refactor_lists_contains
ttnghia Jun 3, 2022
5365f84
Minor fixes
ttnghia Jun 8, 2022
fb96c44
Merge branch 'branch-22.08' into refactor_lists_contains
ttnghia Jun 8, 2022
18dc23c
Remove reference from iterators
ttnghia Jun 8, 2022
1408792
Rewrite struct `search_index_fn` into functor `search_lists_fn`
ttnghia Jun 8, 2022
6c489bb
Move `search_list` into private
ttnghia Jun 8, 2022
ce18a98
Change `search_list` into static function
ttnghia Jun 8, 2022
ccfaf95
Remove redundant template arguments
ttnghia Jun 8, 2022
635d82f
Use all references in lambda
ttnghia Jun 8, 2022
9399722
Rename variables
ttnghia Jun 8, 2022
821302b
Merge branch 'branch-22.08' into refactor_lists_contains
ttnghia Jun 9, 2022
bd4ff30
WIP
ttnghia Jun 15, 2022
0626f01
Change template argument name
ttnghia Jun 15, 2022
eedd341
Remove input parameter from functions
ttnghia Jun 15, 2022
097c31e
Use CTAD for `search_lists_fn`
ttnghia Jun 15, 2022
110d7d3
Fix tests
ttnghia Jun 16, 2022
217dc0f
Remove `out_validity` array
ttnghia Jun 16, 2022
9626710
Use `thrust::transform` instead of `thrust::tabulate`
ttnghia Jun 16, 2022
364a3db
Attempt to fix compiler error
ttnghia Jun 16, 2022
0215c12
Try to use device lambda
ttnghia Jun 16, 2022
778933a
Move function inside struct
ttnghia Jun 16, 2022
148912a
Move function inside struct
ttnghia Jun 16, 2022
a6019b5
Add a default case for unsupported types
ttnghia Jun 16, 2022
a370480
Further compact code, and remove ref in device function
ttnghia Jun 17, 2022
b112fe8
Merge branch 'branch-22.08' into refactor_lists_contains
ttnghia Jun 21, 2022
9389a08
Merge branch 'branch-22.08' into refactor_lists_contains
ttnghia Jun 22, 2022
cb0c736
Merge branch 'branch-22.08' into refactor_lists_contains
ttnghia Jun 23, 2022
980e0ac
Minor change to SFINAE
ttnghia Jun 23, 2022
Fix error
ttnghia committed Apr 29, 2022
commit e4db5efe5879a8496082c4ba99d6ad1a64c71fc1
33 changes: 22 additions & 11 deletions cpp/src/lists/contains.cu
@@ -16,6 +16,7 @@

 #include <cudf/column/column_factories.hpp>
 #include <cudf/detail/iterator.cuh>
+#include <cudf/detail/utilities/vector_factories.hpp>
 #include <cudf/detail/valid_if.cuh>
 #include <cudf/lists/detail/contains.hpp>
 #include <cudf/lists/list_device_view.cuh>
@@ -205,16 +206,28 @@ auto get_search_keys_device_view_ptr(SearchKeyType const& search_keys,
  * @brief Create a `table_view` having one column that is the search key(s).
  */
 template <typename SearchKeyType>
-auto get_search_keys_table_view(SearchKeyType const& search_keys)
+std::pair<rmm::device_uvector<offset_type>, table_view> get_search_keys_table_view(
+  SearchKeyType const& search_keys, rmm::cuda_stream_view stream)
 {
   if constexpr (std::is_same_v<SearchKeyType, cudf::scalar>) {
-    auto const children = static_cast<struct_scalar const*>(&search_keys)->view();
-    // Create a `column_view` of struct type that have children copied from the input scalar.
-    auto const parent = column_view{
-      data_type{type_id::STRUCT}, 1, nullptr, nullptr, 0, 0, {children.begin(), children.end()}};
-    return table_view{{parent}};
+    if (static_cast<scalar const*>(&search_keys)->type().id() == type_id::STRUCT) {
+      auto const children = static_cast<struct_scalar const*>(&search_keys)->view();
+      // Create a `column_view` of struct type that have the same children as from the input scalar.
+      auto const structs_col = column_view{
+        data_type{type_id::STRUCT}, 1, nullptr, nullptr, 0, 0, {children.begin(), children.end()}};
+      return {rmm::device_uvector<offset_type>{0, stream}, table_view{{structs_col}}};
+    } else {
+      auto const child = static_cast<list_scalar const*>(&search_keys)->view();
+      auto offsets = cudf::detail::make_device_uvector_async<offset_type>(
+        std::vector<offset_type>{0, child.size()}, stream);
+      auto const offsets_cview = column_view(data_type{type_id::INT32}, 2, offsets.data());
+      // Create a `column_view` of list type that have the same child as from the input scalar.
+      auto const lists_col =
+        column_view{data_type{type_id::LIST}, 1, nullptr, nullptr, 0, 0, {offsets_cview, child}};
+      return {std::move(offsets), table_view{{lists_col}}};
+    }
   } else {
-    return table_view{{search_keys}};
+    return {rmm::device_uvector<offset_type>{0, stream}, table_view{{search_keys}}};
   }
 }

@@ -242,8 +255,6 @@ struct dispatch_index_of {
     // operations.
     auto const child = lists.child();

-    CUDF_EXPECTS(!cudf::is_nested(child.type()) || child.type().id() == type_id::STRUCT,
-                 "Nested types except STRUCT are not supported in list search operations.");
     CUDF_EXPECTS(child.type() == search_keys.type(),
                  "Type of search key does not match with type of the list column element type.");
     CUDF_EXPECTS(search_keys.type().id() != type_id::EMPTY, "Type cannot be empty.");
@@ -272,8 +283,8 @@ struct dispatch_index_of {
     if constexpr (cudf::is_nested<Type>()) {  // nested types (list + struct) ======================
       auto const key_validity_iter = cudf::detail::make_validity_iterator<true>(*keys_dv_ptr);
       auto const child_tview = table_view{{child}};
-      auto const keys_tview = get_search_keys_table_view(search_keys);
-      auto const has_any_nulls = has_nested_nulls(child_tview) || has_nested_nulls(keys_tview);
+      [[maybe_unused]] auto const [_, keys_tview] = get_search_keys_table_view(search_keys, stream);
+      auto const has_any_nulls = has_nested_nulls(child_tview) || has_nested_nulls(keys_tview);
       auto const comp =
         cudf::experimental::row::equality::table_comparator(child_tview, keys_tview, stream);
       auto const eq_comp = comp.device_comparator(nullate::DYNAMIC{has_any_nulls});
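
Note on the new return type of `get_search_keys_table_view`: for a list scalar, the function now allocates an offsets buffer and builds a non-owning view over it, so the buffer has to be returned together with the view and kept alive by the caller. The following is a minimal host-only sketch of that owner-plus-view pattern, not the cudf code itself. `offsets_view`, `make_single_list_offsets`, and `list_length` are hypothetical names, and `std::vector` stands in for `rmm::device_uvector`, which is assumed here to keep its data pointer across a move in the same way.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a non-owning view such as a column_view.
struct offsets_view {
  int const* data;
  std::size_t size;
};

// Hypothetical stand-in for the list-scalar branch: builds the offsets {0, n}
// for a single list of n elements and returns the owning buffer with the view.
std::pair<std::vector<int>, offsets_view> make_single_list_offsets(int n)
{
  std::vector<int> offsets{0, n};
  offsets_view const view{offsets.data(), offsets.size()};
  // Move-constructing a std::vector transfers its heap buffer, so view.data
  // stays valid for as long as the returned vector is alive.
  return {std::move(offsets), view};
}

// Reads through the view; only safe while the owning vector is alive.
int list_length(offsets_view const& v) { return v.data[v.size - 1] - v.data[0]; }
```

A caller binds both halves of the pair, as the diff does with `[[maybe_unused]] auto const [_, keys_tview]`, so the owner lives until the end of the enclosing scope even when only the view is used.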