Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

extract_list_elements() with column_view indices #9367

Merged

Conversation

mythrocks
Copy link
Contributor

@mythrocks mythrocks commented Oct 4, 2021

Fixes #9172.

Adds an overload of extract_list_element() where the indices
may be specified as a column_view.

This function returns a list element from a potentially different index,
for each list row. The semantics of the scalar-index version of the function
are retained. i.e.:

  1. The index is 0-based.
  2. if (list_row == null) return null;
  3. if (index > list_row.size()) return null;
  4. if (index == null) return null;
  5. if (index < 0 && -index <= length) return list_row[length + index];

This commit also reworks extract_list_element(list, size_type, stream),
to use segmented_gather(), as per the advice in #9214.

Fixes rapidsai#9172.

Adds an overload of `extract_list_element()` where the indices
may be specified as a `column_view`.

This function returns a list element from a potentially different index,
for each list row. The semantics of the scalar-index version of the function
are retained. i.e.:

0. The index is 0-based.
1. `if (list_row == null) return null;`
2. `if (index > list_row.size()) return null;`
3. `if (index == null) return null;`
4. `if (index < 0 || -index <= length) return list_row[length + index];`

This commit also reworks `extract_list_element(list, size_type, stream)`,
to use `segmented_gather()`.
@mythrocks mythrocks self-assigned this Oct 4, 2021
@mythrocks mythrocks requested a review from a team as a code owner October 4, 2021 23:21
@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Oct 4, 2021
@mythrocks mythrocks added feature request New feature or request non-breaking Non-breaking change and removed libcudf Affects libcudf (C++/CUDA) code. labels Oct 4, 2021
@codecov
Copy link

codecov bot commented Oct 5, 2021

Codecov Report

Merging #9367 (61a39db) into branch-21.12 (ab4bfaa) will decrease coverage by 0.04%.
The diff coverage is 0.00%.

❗ Current head 61a39db differs from pull request most recent head e0a5689. Consider uploading reports for the commit e0a5689 to get more accurate results
Impacted file tree graph

@@               Coverage Diff                @@
##           branch-21.12    #9367      +/-   ##
================================================
- Coverage         10.79%   10.74%   -0.05%     
================================================
  Files               116      116              
  Lines             18869    19503     +634     
================================================
+ Hits               2036     2096      +60     
- Misses            16833    17407     +574     
Impacted Files Coverage Δ
python/cudf/cudf/__init__.py 0.00% <0.00%> (ø)
python/cudf/cudf/_lib/__init__.py 0.00% <ø> (ø)
python/cudf/cudf/_lib/strings/__init__.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/_base_index.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/categorical.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/column.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/datetime.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/lists.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/numerical.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/string.py 0.00% <0.00%> (ø)
... and 67 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fb18491...e0a5689. Read the comment docs.

@harrism
Copy link
Member

harrism commented Oct 5, 2021

if (index < 0 || -index <= length) return list_row[length + index];

Are you testing me? In order for the second clause to be evaluated, index >= 0 must be true. If so, and length must be non-negative, then the second clause, -index <= length, must always be true, right? So this is equivalent to

if (index < 0) return list_row[length + index]

Can you state #4 in English for clarity?

Edit: Oh, I think you mean && instead of ||?

cpp/src/lists/extract.cu Outdated Show resolved Hide resolved
cpp/src/lists/extract.cu Outdated Show resolved Hide resolved
@mythrocks
Copy link
Contributor Author

mythrocks commented Oct 5, 2021

Edit: Oh, I think you mean && instead of ||?

@harrism: Ah, thank you for catching that. I think I meant:

if (index < 0 && -index <= length) return list_row[length + index];

I will update the description in this PR, and in #9172.

@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Oct 5, 2021
@mythrocks mythrocks requested a review from davidwendt October 5, 2021 22:30
Copy link
Member

@harrism harrism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Mostly doc improvements.

cpp/include/cudf/lists/extract.hpp Outdated Show resolved Hide resolved
cpp/include/cudf/lists/extract.hpp Outdated Show resolved Hide resolved
cpp/include/cudf/lists/extract.hpp Outdated Show resolved Hide resolved
cpp/src/lists/extract.cu Outdated Show resolved Hide resolved
cpp/src/lists/extract.cu Outdated Show resolved Hide resolved
@mythrocks mythrocks requested a review from harrism October 7, 2021 20:51
cpp/src/lists/extract.cu Outdated Show resolved Hide resolved
@mythrocks mythrocks requested a review from ttnghia October 12, 2021 16:54
@mythrocks
Copy link
Contributor Author

Rerun tests

@mythrocks
Copy link
Contributor Author

Thank you for the reviews, and thought-provoking discussion, all. I'm merging this now.

@mythrocks
Copy link
Contributor Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 794863c into rapidsai:branch-21.12 Oct 13, 2021
@mythrocks mythrocks deleted the extract-list-element-redux branch October 13, 2021 03:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request New feature or request libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] Implement extract_list_element() with column_view indices
4 participants