Replace unnecessary uses of `UNKNOWN_NULL_COUNT` #13102

vyasr · 2023-04-10T16:15:05Z

Description

This PR replaces uses of `cudf::UNKNOWN_NULL_COUNT` where the null count is either already known or trivially computed.

Contributes to #11968

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.

bdice · 2023-04-10T16:49:33Z

It'd be nice to know if we're eliminating many "real-world" calls to the null_count kernel by supplying these values. Not sure how to measure that well, but perhaps benchmarks would show it?

vyasr · 2023-04-10T17:51:52Z

> It'd be nice to know if we're eliminating many "real-world" calls to the null_count kernel by supplying these values. Not sure how to measure that well, but perhaps benchmarks would show it?

I agree on both counts: it's an interesting question, and a hard one to answer. A good starting point would be defining what really constitutes a real-world use case. Also, the changes in this PR come in two flavors: 1) precomputing a null count, in which case we're doing more work up front under the assumption that it will eventually be necessary, and 2) propagating a known null count, which is a strict reduction in work. The latter is more common in this PR and is the case where we'd hope to see fewer kernel launches, but depending on the workflow being benchmarked, that effect might be washed out by changes of the first type in instances where the null count is not actually used. Hard to say without asking a pretty precise question, I suspect.
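To make the two flavors concrete, here is a minimal host-side sketch. It is a toy model under stated assumptions, not libcudf's API: `toy_column`, `count_nulls`, `make_column_eager`, and `copy_with_same_mask` are hypothetical names, an empty `std::optional` stands in for `UNKNOWN_NULL_COUNT`, and `count_passes` stands in for launches of the null-count kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Count unset validity bits among the first `size` rows (bit i set => row i valid).
std::size_t count_nulls(const std::vector<std::uint32_t>& validity, std::size_t size) {
  assert(validity.size() * 32 >= size);  // mask must cover every row
  std::size_t nulls = 0;
  for (std::size_t i = 0; i < size; ++i) {
    const bool valid = ((validity[i / 32] >> (i % 32)) & 1u) != 0;
    if (!valid) { ++nulls; }
  }
  return nulls;
}

// Hypothetical stand-in for a column; NOT cudf::column.
struct toy_column {
  std::vector<std::uint32_t> validity;
  std::size_t size = 0;
  // Empty optional plays the role of an unknown null count.
  mutable std::optional<std::size_t> cached_null_count;
  // Number of full mask scans performed (stand-in for kernel launches).
  mutable int count_passes = 0;

  // Lazily compute the null count the first time it is requested.
  std::size_t null_count() const {
    if (!cached_null_count) {
      cached_null_count = count_nulls(validity, size);
      ++count_passes;
    }
    return *cached_null_count;
  }
};

// Flavor 1: precompute. Pays for one scan up front, whether or not
// anyone later asks for the null count.
toy_column make_column_eager(std::vector<std::uint32_t> validity, std::size_t size) {
  toy_column out{std::move(validity), size};
  out.null_count();
  return out;
}

// Flavor 2: propagate. The output reuses the input's mask, so whatever
// count the input already knows carries over without any scan.
toy_column copy_with_same_mask(const toy_column& in) {
  return toy_column{in.validity, in.size, in.cached_null_count};
}
```

In this model, a column built with `make_column_eager` records one scan at construction, while a column produced by `copy_with_same_mask` from it answers `null_count()` with zero scans. That mirrors the point above: the second flavor only removes work, whereas the first moves work earlier and is wasted if the count is never read.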

vyasr · 2023-04-10T23:32:35Z

/merge

vyasr added 9 commits April 7, 2023 13:18

Set null count in binops

ae8070c

Spoof value in pack

aa7e81b

Use arrow's known null count in from_arrow

3a8bb63

Compute the null count when concatenating

860c8da

One more concatenate

b23dde6

Update row operators

f285e84

Revert concat changes. Should work, so failures probably indicate some issue in parquet data

a066f50

transform

f6faff5

dremel

c3c6567

vyasr added the 3 - Ready for Review, libcudf, improvement, and non-breaking labels Apr 10, 2023

vyasr requested a review from a team as a code owner April 10, 2023 16:15

vyasr self-assigned this Apr 10, 2023

vyasr requested review from karthikeyann and davidwendt April 10, 2023 16:15

Merge branch 'branch-23.06' into feat/set_null_count_no_unknown

250c46f

vyasr added this to the Enable streams milestone Apr 10, 2023

davidwendt reviewed Apr 10, 2023

cpp/src/binaryop/compiled/binary_ops.cu (outdated, resolved)

Update cpp/src/binaryop/compiled/binary_ops.cu

09daff6

Co-authored-by: David Wendt <[email protected]>

ttnghia reviewed Apr 10, 2023

cpp/src/binaryop/compiled/binary_ops.cu (outdated, resolved)

Fix count in binop and change C-style cast to static_cast

6096427

ttnghia reviewed Apr 10, 2023

cpp/src/binaryop/compiled/binary_ops.cu (outdated, resolved)

Switch to reinterpret_cast

2a65edc

davidwendt approved these changes Apr 10, 2023

ttnghia approved these changes Apr 10, 2023

Merge branch 'branch-23.06' into feat/set_null_count_no_unknown

ba45113

ttnghia mentioned this pull request Apr 10, 2023

Support structs of lists in row lexicographic comparator #13005

Merged

rapids-bot bot merged commit cab6522 into rapidsai:branch-23.06 Apr 10, 2023

vyasr deleted the feat/set_null_count_no_unknown branch April 10, 2023 23:32
