Implement groupby apply with JIT #11452

bwyogatama · 2022-08-03T19:44:28Z

Description

Experimental cuDF Groupby Apply JIT pipeline.
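For readers unfamiliar with the operation, here is a plain-Python sketch of what a groupby apply with a reduction-style function computes. It is not the cuDF implementation and does not run on the GPU; `groupby_apply` and `value_range` are hypothetical names used only for illustration. With this feature, the corresponding cuDF call would presumably look like `df.groupby("key").apply(func, engine="jit")`.

```python
from collections import defaultdict


def groupby_apply(keys, values, func):
    """Reference semantics only: call func once per group, one scalar per key."""
    groups = defaultdict(list)
    for k, v in zip(keys, values):
        groups[k].append(v)
    # Sort by key so the output order is deterministic.
    return {k: func(vs) for k, vs in sorted(groups.items())}


def value_range(vs):
    # A reduction-style UDF: combines the max and min of one group.
    return max(vs) - min(vs)


keys = ["a", "b", "a", "b", "a"]
values = [1, 10, 4, 30, 2]
result = groupby_apply(keys, values, value_range)
print(result)
```

The JIT pipeline aims to produce the same per-group result, but by compiling the user function into a GPU kernel rather than looping over groups in Python.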

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.

PointKernel · 2022-08-03T20:59:30Z

@bwyogatama We should move the function.cu file to a place like cudf/cpp/src/udf/kernels.cu (just the first thing that came to mind, feel free to suggest a more appropriate place) so that this PR is exposed to cpp reviewers.

This CUDA source must be compiled to a PTX file before the Python code can use it. I'm not sure what the best way to do this in cudf would be. @robertmaynard Any pointers are highly appreciated.

vyasr · 2022-08-04T16:02:15Z

If we want to compile to PTX I think the cleanest CMake solution is to create an object library with just this file and then set CUDA_PTX_COMPILATION.

Also we should probably discuss reusing atomics etc from libcudf, although I'm not yet sure how we want that to look since I don't think we want to make those part of the libcudf public API.

robertmaynard · 2022-08-04T18:27:02Z

> If we want to compile to PTX I think the cleanest CMake solution is to create an object library with just this file and then set CUDA_PTX_COMPILATION.
>
> Also we should probably discuss reusing atomics etc from libcudf, although I'm not yet sure how we want that to look since I don't think we want to make those part of the libcudf public API.

This is correct, but you will need to iterate over the values of CMAKE_CUDA_ARCHITECTURES and create a new object library for each value since PTX compilation only supports a single arch.

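# Example source file to compile to PTX.
set(ptx_src "src/a.cu")
# PTX compilation supports a single arch per target, so create one object library per arch.
foreach(arch IN LISTS CMAKE_CUDA_ARCHITECTURES)
    add_library(ptx_example_${arch} OBJECT ${ptx_src})
    set_target_properties(ptx_example_${arch}
        PROPERTIES CUDA_ARCHITECTURES ${arch}
                   CUDA_PTX_COMPILATION ON
    )
endforeach()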

We need to update the install rules to also ship the ptx output
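Shipping one PTX file per architecture implies that the Python side has to pick a file at runtime. Because PTX built for an older architecture can be JIT-compiled for a newer device, one plausible rule is to take the highest shipped architecture that does not exceed the device's compute capability. The sketch below assumes a hypothetical naming scheme (`function_<arch>.ptx`) and a hypothetical helper `select_ptx`; neither is taken from this PR.

```python
import re


def select_ptx(available_files, device_cc):
    """Return the PTX file for the highest arch <= device_cc.

    device_cc is the compute capability as an integer, e.g. 75 for 7.5.
    The "_<arch>.ptx" suffix is an assumed naming convention.
    """
    best = None
    for name in available_files:
        match = re.search(r"_(\d+)\.ptx$", name)
        if match is None:
            continue  # Skip files that do not follow the assumed convention.
        arch = int(match.group(1))
        if arch <= device_cc and (best is None or arch > best[0]):
            best = (arch, name)
    if best is None:
        raise RuntimeError(
            f"no PTX file is compatible with compute capability {device_cc}"
        )
    return best[1]


files = ["function_60.ptx", "function_70.ptx", "function_80.ptx"]
chosen = select_ptx(files, 75)
print(chosen)
```

Raising an error when no file qualifies keeps an unsupported device from silently loading incompatible PTX.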

brandon-b-miller · 2022-08-05T14:00:28Z

> We should move the function.cu file to a place like cudf/cpp/src/udf/kernels.cu (just the first thing came into my mind, feel free to suggest the proper place) to make this PR exposed to cpp reviewers.

We went back and forth on this. We considered leaving it somewhere in the python area since it serves only the python API. I think it's a good idea that it is owned by the c++ code owners though.

> We need to update the install rules to also ship the ptx output

We're going to need these PTX files to come as part of the conda packages as well. Both of these issues need to be solved in #11319 as well which relies on the same pattern, so if we can solve them here that's great! :)

quasiben · 2022-08-08T14:02:44Z

This is super cool to see! Can you also post one of the early benchmark plots comparing performance?

bdice

Initial round of feedback attached. I offered a few specific solutions to a few challenges with re-using boilerplate code.

python/cudf/cudf/core/groupby/groupby.py

python/cudf/cudf/core/udf/function.cu

python/cudf/cudf/core/udf/groupby_function.py

github-actions · 2022-09-18T19:02:54Z

This PR has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this PR if it is no longer required. Otherwise, please respond with a comment indicating any updates. This PR will be labeled inactive-90d if there is no activity in the next 60 days.

PointKernel · 2023-01-27T16:14:54Z

/ok to test

vyasr

Temporarily blocking since there's ongoing discussion about my question about needing an atomic.

PointKernel · 2023-01-27T18:01:31Z

/ok to test

python/cudf/udf_cpp/groupby/function.cu

PointKernel · 2023-01-27T18:17:50Z

/ok to test

PointKernel · 2023-01-27T19:46:45Z

/ok to test

robertmaynard

CMake changes LGTM

The discussion in #11452 (comment) is wrapped, so I think we can unblock now.

shwina · 2023-01-27T22:09:03Z

/merge

brandon-b-miller · 2023-01-28T00:04:28Z

FANTASTIC work and great job @bwyogatama !

bwyogatama · 2023-01-29T23:31:42Z

Thank you so much @brandon-b-miller and @PointKernel for helping with this PR! This PR would not be possible without the help of you two!
Thank you as well to all the reviewers for their reviews, which improved the code: @vyasr @bdice @wence- @shwina @robertmaynard @jrhemstad @davidwendt @ajschmidt8!
And also special thanks to @gmarkall and @GregoryKimball for all the help during my internship!

I am super glad that this feature can finally be merged and I am looking forward to seeing how this feature would expand in the future!

This PR enables doctests for some GroupBy methods that are not currently tested due to not meeting the inclusion criteria in our doctest class. This includes enabling tests for `GroupBy.apply` with `engine='jit'`. This came up during #11452.

Authors:
- https://github.com/brandon-b-miller

Approvers:
- GALI PREM SAGAR (https://github.com/galipremsagar)

URL: #12658

Prior to #11452, cuDF Python did not require CUDA for compilation. When libcudf is not found by CMake, however, it triggers a compilation of the C++ library, which does require CUDA. To support this behavior, we included some extra logic in cuDF's CMake to ensure that the appropriate CUDA architectures are targeted (respecting the extra options like `RAPIDS` and `NATIVE` that `rapids-cmake` offers). With the merge of #11452, this conditional is now redundant because cuDF requires CUDA compilation unconditionally, so we can remove the extra code.

Authors:
- Vyas Ramasubramani (https://github.com/vyasr)
- AJ Schmidt (https://github.com/ajschmidt8)

Approvers:
- Bradley Dice (https://github.com/bdice)

URL: #12758

With the merge of #11452, we have the machinery to build and deploy PTX libraries of shim functions as part of cuDF's build process. There is therefore no reason to keep the `strings_udf` code separate anymore. This PR removes the separate package and all of its related CI plumbing, and supports the strings feature by default, just like GroupBy.

Authors:
- https://github.com/brandon-b-miller
- Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
- Bradley Dice (https://github.com/bdice)
- Vyas Ramasubramani (https://github.com/vyasr)
- AJ Schmidt (https://github.com/ajschmidt8)

URL: #12669

Groupby Apply with JIT (First Commit)

8db918f

github-actions bot added the Python (Affects Python cuDF API) label Aug 3, 2022

bwyogatama added the 2 - In Progress (Currently a work in progress), non-breaking (Non-breaking change), and feature request (New feature or request) labels Aug 3, 2022

Fix error in Pytest

2d6b4c9

JIT Caching Support

fd8680e

bwyogatama added 2 commits August 5, 2022 17:08

Add IdxMax and IdxMin

9220658

Add IdxMax and IdxMin

f4bc7c4

Dynamic Launch Parameter

8659149

bdice changed the title from "Groupby Apply with JIT (First Commit)" to "Implement groupby apply with JIT" Aug 16, 2022

bdice requested changes Aug 16, 2022

Code cleanup #1

b7ede43

github-actions bot added the inactive-30d label Sep 18, 2022

GregoryKimball added this to the UDF Enhancements milestone Sep 21, 2022

vyasr added 7 commits September 23, 2022 15:50

Merge remote-tracking branch 'origin/branch-22.10' into fea-groupby-apply-jit

1dbbb77

Add support for building the JIT functions with the rest of the build.

f98fc63

Make engine name consistent with tests

11edd37

Generalize compiled PTX selection for CUDA arch.

1e12416

Cleanup of strings_udf PTX detection

d348fb8

Fix tests with some hacks so that we can start validating.

795e580

Standardize the engine argument handling so that we get clear errors.

0ce0a90

github-actions bot added the CMake (CMake build issue) label Sep 23, 2022

Update style.

3493d49

PointKernel added 3 commits January 27, 2023 11:08

Compute blockstd via blockvar

0b407c8

Merge branch 'fea-groupby-apply-jit' of github.com:bwyogatama/cudf into groupby

568ab97

Merge remote-tracking branch 'upstream/branch-23.02' into groupby

83f8d88

PointKernel requested a review from vyasr January 27, 2023 16:19

vyasr approved these changes Jan 27, 2023

vyasr previously requested changes Jan 27, 2023

Use atomic operations to avoid concurrent writes

dbd5eeb

PointKernel requested a review from vyasr January 27, 2023 18:02

wence- reviewed Jan 27, 2023

python/cudf/udf_cpp/groupby/function.cu (outdated, resolved)

Use int64_t atomic ref

eaa8ff7

robertmaynard approved these changes Jan 27, 2023

shwina added the 5 - Ready to Merge (Testing and reviews complete, ready to merge) label Jan 27, 2023

rapids-bot bot merged commit 7695850 into rapidsai:branch-23.02 Jan 27, 2023

This was referenced Jan 31, 2023

Enable doctests for GroupBy methods #12658

Merged

Move strings_udf code into cuDF #12669

Merged

vyasr mentioned this pull request Feb 11, 2023

Remove now redundant cuda initialization #12758

Merged

3 tasks

brandon-b-miller mentioned this pull request Apr 10, 2023

[FEA] Attempt to JIT GroupBy.apply functions by default and fall back to iterative algorithm #13103

Closed

This was referenced Aug 8, 2023

[BUG] JIT Groupby Apply idxmax/idxmin reductions return incorrect values when the data is all NaN #13832

Open

Return a Series from JIT GroupBy apply, rather than a DataFrame #13820

Merged
