Convert `rank` to use experimental row comparators #12481
Codecov Report
Additional details and impacted files
@@ Coverage Diff @@
## branch-23.04 #12481 +/- ##
===============================================
Coverage ? 85.81%
===============================================
Files ? 158
Lines ? 25153
Branches ? 0
===============================================
Hits ? 21586
Misses ? 3567
Partials ? 0
Looks good overall. The actual comparator changes were fairly small, so the bulk of it is benchmarks and tests.
Can you post a benchmark result comparing the throughput (GB/s) of ranking lists of ints/floats to ranking plain ints or floats?
Throughput: (GB/s)
It surprises me that lists of depth 4 achieve a higher throughput than lists of depth 1. Can you verify that result or comment on why you think that is happening? Are the "size_bytes" including list offsets, or just list contents (the leaf column of integers)?
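The explanation given in the PR description (the same leaf-element budget spread over fewer rows as depth grows) can be illustrated with a rough sketch. This is not the benchmark's data generator: the helper `rows_for_budget` and the uniform fan-out of 4 children per list are assumptions made purely for illustration.

```python
def rows_for_budget(leaf_elements: int, depth: int, avg_list_len: int) -> int:
    """Approximate row count of a list column nested `depth` levels deep.

    Hypothetical model: every list at every level holds exactly
    `avg_list_len` children, so one row owns avg_list_len ** depth leaves.
    depth=0 stands for a plain (non-list) column.
    """
    leaves_per_row = avg_list_len ** depth
    return max(1, leaf_elements // leaves_per_row)


leaf_budget = 1_000_000  # fixed number of leaf integers (assumed)
fan_out = 4              # assumed children per list at each level

for depth in (0, 1, 4):
    print(f"depth={depth}: {rows_for_budget(leaf_budget, depth, fan_out)} rows")
# depth=0: 1000000 rows
# depth=1: 250000 rows
# depth=4: 3906 rows
```

Under this model the byte count in the throughput numerator stays roughly constant while the number of rows to rank shrinks quickly with depth, which is one way a higher GB/s figure at `depth=4` could arise.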
All looks good -- I'll approve after we make a small tweak to the benchmark axes.
Looks good to me!
@divyegala Can you please tag the meta-issue #11844 in the description for all PRs that switch code to use experimental comparators? I edited the description for this PR.
LGTM
/merge
Description
Converts the `rank` function to use experimental row comparators, which support list and struct types. Part of #11844.

Throughput benchmarks are available below. It seems that when `size_bytes` is constrained, the generator produces fewer rows in `list` types as depth increases. That is why `depth=4` has a higher throughput than `depth=1`: the number of leaf elements generated is the same, but with far fewer rows.

Checklist