Add struct type support for drop_list_duplicates #9202

Merged
24 commits
aa92eb4
Update doxygen
ttnghia Sep 3, 2021
669a8a4
Change preconditioning
ttnghia Sep 3, 2021
206b300
Rewrite tests
ttnghia Sep 8, 2021
82f31cf
Implement `has_negative_nans_fn` for structs
ttnghia Sep 8, 2021
dca7b74
Implement `replace_negative_nans_fn` for structs
ttnghia Sep 8, 2021
75b246f
Implementation is working
ttnghia Sep 9, 2021
2a7efce
Add test
ttnghia Sep 9, 2021
b6b0c45
Merge branch 'branch-21.10' into drop_list_duplicates_for_structs
ttnghia Sep 9, 2021
edca8d5
Add tests for structs
ttnghia Sep 12, 2021
73f4823
Fix test for structs
ttnghia Sep 12, 2021
d00c192
Rename structs
ttnghia Sep 13, 2021
5c56282
Access children of structs column by sliced child
ttnghia Sep 14, 2021
a27e186
Rewrite doxygen, rename variable, and various other small changes
ttnghia Sep 14, 2021
9d54708
Add sliced input test
ttnghia Sep 14, 2021
064c958
Apply upstream `gather.cuh`
ttnghia Sep 14, 2021
c89b40b
Fix offsets with non-zero base
ttnghia Sep 14, 2021
d8776ad
Merge branch 'branch-21.10' into drop_list_duplicates_for_structs
ttnghia Sep 17, 2021
515474f
Reverse change for default case of `has_negative_nans_dispatch`
ttnghia Sep 17, 2021
6af5d29
Update tests
ttnghia Sep 17, 2021
a811a6b
Merge branch 'branch-21.10' into drop_list_duplicates_for_structs
ttnghia Sep 17, 2021
f8767f3
Address review comments
ttnghia Sep 20, 2021
ad15972
Merge branch 'branch-21.10' into drop_list_duplicates_for_structs
ttnghia Sep 20, 2021
8a2e993
Remove debug printing
ttnghia Sep 20, 2021
2296f1a
Add constructors for the functors and add comments for `has_negative_…
ttnghia Sep 20, 2021
28 changes: 14 additions & 14 deletions cpp/include/cudf/lists/drop_list_duplicates.hpp
@@ -28,32 +28,32 @@ namespace lists {
  */
 
 /**
- * @brief Create a new lists column by removing duplicated entries from each list element in the
- * given lists column
+ * @brief Create a new lists column by extracting unique entries from list elements in the given
+ * lists column.
  *
- * @throw cudf::logic_error if any row (list element) in the input column is a nested type.
- *
- * Given an `input` lists_column_view, the list elements in the column are copied to an output lists
+ * Given an input lists column, the list elements in the column are copied to an output lists
  * column such that their duplicated entries are dropped out to keep only the unique ones. The
  * order of those entries within each list are not guaranteed to be preserved as in the input. In
  * the current implementation, entries in the output lists are sorted by ascending order (nulls
  * last), but this is not guaranteed in future implementation.
  *
- * @param lists_column The input lists_column_view
- * @param nulls_equal Flag to specify whether null entries should be considered equal
- * @param nans_equal Flag to specify whether NaN entries should be considered as equal value (only
- * applicable for floating point data column)
- * @param mr Device resource used to allocate memory
+ * @throw cudf::logic_error if the child column of the input lists column contains nested type other
+ * than struct.
+ *
+ * @param lists_column The input lists column to extract lists with unique entries.
+ * @param nulls_equal Flag to specify whether null entries should be considered equal.
+ * @param nans_equal Flag to specify whether NaN entries should be considered as equal value (only
+ * applicable for floating point data column).
+ * @param mr Device resource used to allocate memory.
  *
  * @code{.pseudo}
- * lists_column = { {1, 1, 2, 1, 3}, {4}, NULL, {}, {NULL, NULL, NULL, 5, 6, 6, 6, 5} }
+ * input = { {1, 1, 2, 1, 3}, {4}, NULL, {}, {NULL, NULL, NULL, 5, 6, 6, 6, 5} }
  * output = { {1, 2, 3}, {4}, NULL, {}, {5, 6, NULL} }
  *
- * Note that permuting the entries of each list in this output also produces another valid
- * output.
+ * Note that permuting the entries of each list in this output also produces another valid output.
  * @endcode
  *
- * @return A list column with list elements having unique entries
+ * @return A lists column with list elements having unique entries.
  */
 std::unique_ptr<column> drop_list_duplicates(
   lists_column_view const& lists_column,
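
The semantics described in the doxygen comment can be illustrated with a small host-side model. This is only a sketch under stated assumptions: it is not the libcudf implementation, it runs on the CPU, and it covers only lists of nullable `double` entries rather than structs. The names `entry`, `list_row`, `unique_entries` and `drop_list_duplicates_model` are hypothetical, and plain `bool` flags stand in for the `nulls_equal` and `nans_equal` parameters. The output is sorted ascending with nulls last to mirror the behaviour the comment attributes to the current implementation; placing NaN after ordinary numbers is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cmath>
#include <optional>
#include <vector>

// Host-side model only: these aliases and functions are hypothetical, not libcudf API.
using entry    = std::optional<double>;              // std::nullopt models a null entry
using list_row = std::optional<std::vector<entry>>;  // std::nullopt models a null list

// Keep the first occurrence of each distinct entry in one list, then sort the result.
std::vector<entry> unique_entries(std::vector<entry> const& in, bool nulls_equal, bool nans_equal)
{
  std::vector<entry> out;
  for (auto const& e : in) {
    bool const duplicate = std::any_of(out.begin(), out.end(), [&](entry const& kept) {
      // Two nulls match only when nulls_equal is set; a null never matches a value.
      if (!e || !kept) { return !e && !kept && nulls_equal; }
      // Two NaNs match only when nans_equal is set; a NaN never matches a number.
      if (std::isnan(*e) || std::isnan(*kept)) {
        return std::isnan(*e) && std::isnan(*kept) && nans_equal;
      }
      return *e == *kept;
    });
    if (!duplicate) { out.push_back(e); }
  }
  // Sort ascending: ordinary numbers first, then NaN (assumption), then nulls last.
  auto const rank = [](entry const& x) { return !x ? 2 : (std::isnan(*x) ? 1 : 0); };
  std::stable_sort(out.begin(), out.end(), [&](entry const& a, entry const& b) {
    if (rank(a) != rank(b)) { return rank(a) < rank(b); }
    return rank(a) == 0 && *a < *b;
  });
  return out;
}

// Apply unique_entries to every non-null list; null lists stay null, empty lists stay empty.
std::vector<list_row> drop_list_duplicates_model(std::vector<list_row> const& lists,
                                                 bool nulls_equal,
                                                 bool nans_equal)
{
  std::vector<list_row> out;
  out.reserve(lists.size());
  for (auto const& row : lists) {
    if (!row) {
      out.push_back(std::nullopt);
    } else {
      out.push_back(unique_entries(*row, nulls_equal, nans_equal));
    }
  }
  return out;
}
```

Calling `drop_list_duplicates_model` with `nulls_equal = true` on the input from the `@code{.pseudo}` example reproduces the `output` shown there. The quadratic duplicate check keeps the sketch short and is not meant to reflect how a GPU implementation would be organized.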