Convert `rank` to use to experimental row comparators #12481

divyegala · 2023-01-05T20:06:02Z

Description

Converts the rank function to use experimental row comparators, which support list and struct types. Part of #11844.
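
For readers unfamiliar with what ranking list rows means, here is a minimal host-side sketch. It is not libcudf's implementation and uses none of its APIs. It assumes list rows compare lexicographically, with a shorter list ordering before a longer list that it prefixes, and that the "first" method gives ties distinct 1-based ranks in order of appearance. The names `list_row`, `list_less`, and `rank_first` are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical host-side stand-in for one row of a LIST<INT32> column.
using list_row = std::vector<int>;

// Lexicographic "less than" for two list rows: compare element by element,
// and treat a shorter list as smaller when it is a prefix of the longer one.
inline bool list_less(list_row const& lhs, list_row const& rhs)
{
  return std::lexicographical_compare(lhs.begin(), lhs.end(), rhs.begin(), rhs.end());
}

// Rank rows in ascending order with the "first" method (an assumption about
// the intended semantics): ties get distinct ranks in order of appearance.
// Ranks are 1-based.
inline std::vector<std::size_t> rank_first(std::vector<list_row> const& rows)
{
  std::vector<std::size_t> order(rows.size());
  std::iota(order.begin(), order.end(), std::size_t{0});

  // A stable sort keeps equal rows in their original relative order.
  std::stable_sort(order.begin(), order.end(), [&](std::size_t a, std::size_t b) {
    return list_less(rows[a], rows[b]);
  });

  // Scatter sorted positions back to the original row indices.
  std::vector<std::size_t> ranks(rows.size());
  for (std::size_t pos = 0; pos < order.size(); ++pos) {
    ranks[order[pos]] = pos + 1;
  }
  return ranks;
}
```

Calling `rank_first` on the rows `{2, 3}`, `{1}`, `{2, 3}`, `{1, 5}`, `{}` ranks the empty list first and gives the two equal `{2, 3}` rows consecutive ranks in their original order.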

Throughput benchmarks are available below. When size_bytes is constrained, the data generator appears to produce fewer rows for list columns as depth increases. That is why depth=4 shows a higher throughput than depth=1: the number of leaf elements generated is the same, but they are spread across far fewer rows.
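
The row-count effect described above can be illustrated with a small arithmetic sketch. It assumes every list at every nesting level holds exactly `list_len` children. That is a simplification for illustration only, since this PR does not describe the generator's actual length distribution. The function name `top_level_rows` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Number of top-level rows in a nested list column when every list at every
// nesting level holds exactly `list_len` children (a simplifying assumption,
// not the real benchmark generator's behavior).
inline std::int64_t top_level_rows(std::int64_t leaf_elements, int depth, std::int64_t list_len)
{
  assert(depth >= 0 && list_len > 0);
  std::int64_t rows = leaf_elements;
  for (int level = 0; level < depth; ++level) {
    rows /= list_len;  // each nesting level groups `list_len` children into one parent
  }
  return rows;
}
```

Under this assumption, 1048576 leaf elements with lists of length 4 give 262144 rows at depth 1 but only 4096 rows at depth 4. The same number of leaf bytes is ranked with far fewer row comparisons, which is consistent with the higher GB/s figure.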

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.

codecov · 2023-01-05T22:44:24Z

Codecov Report

❗ No coverage uploaded for pull request base (branch-23.04@291c751).
Patch has no changes to coverable lines.

Additional details and impacted files

@@               Coverage Diff               @@
##             branch-23.04   #12481   +/-   ##
===============================================
  Coverage                ?   85.81%           
===============================================
  Files                   ?      158           
  Lines                   ?    25153           
  Branches                ?        0           
===============================================
  Hits                    ?    21586           
  Misses                  ?     3567           
  Partials                ?        0



bdice

Looks good overall. The actual comparator changes were fairly small, so the bulk of it is benchmarks and tests.

Can you post a benchmark result comparing the throughput (GB/s) of ranking lists of ints/floats to ranking plain ints or floats?

cpp/benchmarks/sort/nested_types_common.hpp

cpp/benchmarks/sort/rank_lists.cpp

cpp/tests/sort/rank_test.cpp

divyegala · 2023-02-03T20:44:18Z

Throughput (GB/s):
int, rank_method::FIRST, null_frequency=0.2:

   nulls_type  size_bytes  throughput
0    no_nulls        4096    0.067143
1    no_nulls       16384    0.270018
2    no_nulls      131072    1.008052
3    no_nulls     1048576    7.954703
4    no_nulls     8388608   17.807422
5    no_nulls    67108864   17.511224
6    no_nulls   268435456   17.633304
7       nulls        4096    0.041735
8       nulls       16384    0.092360
9       nulls      131072    0.522772
10      nulls     1048576    2.862868
11      nulls     8388608    1.865691
12      nulls    67108864    1.905172
13      nulls   268435456    1.883955

List<int>, rank_method::FIRST:

    null_frequency  depth  size_bytes  throughput
0              0.0      1        1024    0.001656
1              0.0      1      262144    0.119881
2              0.0      1    16777216    0.322392
3              0.0      1   268435456    0.219398
4              0.0      4        1024    0.001183
5              0.0      4      262144    0.102692
6              0.0      4    16777216    1.481422
7              0.0      4   268435456    1.318262
8              0.2      1        1024    0.001730
9              0.2      1      262144    0.118890
10             0.2      1    16777216    0.439320
11             0.2      1   268435456    0.313680
12             0.2      4        1024    0.001368
13             0.2      4      262144    0.123197
14             0.2      4    16777216    1.981596
15             0.2      4   268435456    2.129115
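
To put the two tables side by side, the sketch below computes how many times slower list ranking is than plain int ranking at the largest size. The constants are copied from the 268435456-byte rows above (int with no nulls, and List<int> with null_frequency 0.0). The helper name `slowdown` is hypothetical.

```cpp
#include <cassert>

// Throughputs in GB/s, copied from the 268435456-byte rows of the tables above.
constexpr double int_no_nulls_gbps = 17.633304;
constexpr double list_depth1_gbps  = 0.219398;
constexpr double list_depth4_gbps  = 1.318262;

// Ratio of a baseline throughput to another throughput in the same unit.
inline double slowdown(double baseline_gbps, double other_gbps)
{
  assert(other_gbps > 0.0);
  return baseline_gbps / other_gbps;
}
```

With these figures, `slowdown(int_no_nulls_gbps, list_depth1_gbps)` is roughly 80 and `slowdown(int_no_nulls_gbps, list_depth4_gbps)` is roughly 13.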


bdice · 2023-02-03T22:03:24Z

It surprises me that lists of depth 4 achieve a higher throughput than lists of depth 1. Can you verify that result or comment on why you think that is happening? Does "size_bytes" include the list offsets, or just the list contents (the leaf column of integers)?
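
To make the offsets question concrete, here is a minimal sketch of the two byte counts being distinguished. It assumes an Arrow-style layout in which each list level stores one 32-bit offset per list plus one, and the leaf column holds 4-byte integers. Whether the benchmark's size_bytes counts the offsets is not stated in this thread. The names `list_column_bytes` and `column_bytes` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Byte totals for a nested LIST<...<INT32>> column, split into leaf data and
// offsets. Assumes 32-bit offsets and 4-byte leaf integers.
struct list_column_bytes {
  std::int64_t leaf;
  std::int64_t offsets;
};

// `lists_per_level[0]` is the top-level row count; each later entry is the
// number of lists at the next nesting level down.
inline list_column_bytes column_bytes(std::vector<std::int64_t> const& lists_per_level,
                                      std::int64_t leaf_elements)
{
  std::int64_t offsets = 0;
  for (auto lists : lists_per_level) {
    assert(lists >= 0);
    offsets += (lists + 1) * 4;  // one offset per list, plus the trailing end offset
  }
  return {leaf_elements * 4, offsets};
}
```

For example, 4 rows holding 16 integers in total occupy 64 leaf bytes and 20 offset bytes, so the two possible definitions of size_bytes differ noticeably when lists are short.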

bdice

All looks good -- I'll approve after we make a small tweak to the benchmark axes.

cpp/benchmarks/sort/rank_lists.cpp

bdice

Looks good to me!

bdice · 2023-02-06T17:42:03Z

@divyegala Can you please tag the meta-issue #11844 in the description for all PRs that switch code to use experimental comparators? I edited the description for this PR.

PointKernel

LGTM

divyegala · 2023-02-06T23:47:47Z

/merge


update to experimental row comparator

c4c7cf2

divyegala added the `feature request` (New feature or request) and `non-breaking` (Non-breaking change) labels Jan 5, 2023

github-actions bot added the `libcudf` (Affects libcudf (C++/CUDA) code.) label Jan 5, 2023

working through tests

2723469

GregoryKimball assigned divyegala Jan 17, 2023

divyegala added 7 commits January 24, 2023 14:28

Merge remote-tracking branch 'upstream/branch-23.02' into rank-row-comparator

4ef330c

lists tests passing

72cdb6e

formatting

5253196

adding tests for structs

2dd66c0

Merge remote-tracking branch 'upstream/branch-23.02' into rank-row-comparator

0baac12

all tests passed

1e86c5d

Merge remote-tracking branch 'upstream/branch-23.02' into rank-row-comparator

20197e8

divyegala marked this pull request as ready for review January 31, 2023 18:25

divyegala requested a review from a team as a code owner January 31, 2023 18:25

divyegala requested review from bdice and nvdbaranec January 31, 2023 18:25

divyegala added 3 commits January 31, 2023 15:50

add benchmarks

b017f54

Merge remote-tracking branch 'upstream/branch-23.04' into rank-row-comparator

5f2aaa5

fix bad merge

7a86326

divyegala requested review from a team as code owners February 1, 2023 00:09

divyegala requested review from vyasr and removed request for a team February 1, 2023 00:09

github-actions bot added the `ci` and `CMake` (CMake build issue) labels Feb 1, 2023

divyegala changed the base branch from branch-23.02 to branch-23.04 February 1, 2023 00:09

divyegala removed request for a team and vyasr February 1, 2023 00:10

formatting

095ef4b

github-actions bot removed the `ci`, `Java` (Affects Java cuDF API.), and `Python` (Affects Python cuDF API.) labels Feb 1, 2023

bdice reviewed Feb 1, 2023

cpp/benchmarks/sort/nested_types_common.hpp (outdated)

cpp/benchmarks/sort/rank_lists.cpp (outdated)

cpp/tests/sort/rank_test.cpp (outdated)

cpp/tests/sort/rank_test.cpp (outdated)

divyegala added 3 commits February 3, 2023 12:48

review comments

8a1bedf

Merge remote-tracking branch 'upstream/branch-23.04' into rank-row-comparator

bdc2ee1

copyright year

3c6903b

bdice reviewed Feb 3, 2023

cpp/benchmarks/sort/rank_lists.cpp (outdated)

cpp/benchmarks/sort/rank_lists.cpp (outdated)

address review

19075e6

bdice approved these changes Feb 6, 2023

bdice changed the title from "rank to use to experimental row comparators" to "Convert rank to use to experimental row comparators" Feb 6, 2023

bdice mentioned this pull request Feb 6, 2023

[FEA] Implement full support for nested types #11844

Closed

PointKernel approved these changes Feb 6, 2023

Merge branch 'branch-23.04' into rank-row-comparator

0093b32

jjacobelli and others added 2 commits February 7, 2023 17:33

Merge branch 'branch-23.04' into rank-row-comparator

91dce1f

add make to dependencies of conda-java-tests CI

945e89f

ajschmidt8 approved these changes Feb 7, 2023

divyegala added 2 commits February 7, 2023 10:18

Revert "add make to dependencies of conda-java-tests CI"

7b512b9

This reverts commit 945e89f.

Merge branch 'branch-23.04' into rank-row-comparator

df16780

rapids-bot bot merged commit b87b64f into rapidsai:branch-23.04 Feb 8, 2023
