Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement groupby apply with JIT #11452

Merged
merged 157 commits into from
Jan 27, 2023

Conversation

bwyogatama
Copy link
Contributor

@bwyogatama bwyogatama commented Aug 3, 2022

Description

Experimental cuDF Groupby Apply JIT pipeline.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@github-actions github-actions bot added the Python Affects Python cuDF API. label Aug 3, 2022
@bwyogatama bwyogatama added 2 - In Progress Currently a work in progress non-breaking Non-breaking change feature request New feature or request labels Aug 3, 2022
@PointKernel
Copy link
Member

@bwyogatama We should move the function.cu file to a place like cudf/cpp/src/udf/kernels.cu (just the first thing came into my mind, feel free to suggest the proper place) to make this PR exposed to cpp reviewers.

This CUDA source is required to be compiled to a ptx file in order to be used by the python code. Not sure what would be the best way to do this in cudf. @robertmaynard Any pointers are highly appreciated.

@vyasr
Copy link
Contributor

vyasr commented Aug 4, 2022

If we want to compile to PTX I think the cleanest CMake solution is to create an object library with just this file and then set CUDA_PTX_COMPILATION.

Also we should probably discuss reusing atomics etc from libcudf, although I'm not yet sure how we want that to look since I don't think we want to make those part of the libcudf public API.

@robertmaynard
Copy link
Contributor

If we want to compile to PTX I think the cleanest CMake solution is to create an object library with just this file and then set CUDA_PTX_COMPILATION.

Also we should probably discuss reusing atomics etc from libcudf, although I'm not yet sure how we want that to look since I don't think we want to make those part of the libcudf public API.

This is correct, but you will need to iterate over the values of CMAKE_CUDA_ARCHITECTURES and create a new object library for each value since PTX compilation only supports a single arch.

set(ptx_src "src/a.cu")
foreach(arch IN LISTS CMAKE_CUDA_ARCHITECTURES)
    add_library(ptx_example_${arch} OBJECT ${ptx_src})
    set_target_properties(ptx_example_${arch}
        PROPERTIES CUDA_ARCHITECTURES ${arch}
                   CUDA_PTX_COMPILATION ON
       )
endforeach()

We need to update the install rules to also ship the ptx output

@brandon-b-miller
Copy link
Contributor

We should move the function.cu file to a place like cudf/cpp/src/udf/kernels.cu (just the first thing came into my mind, feel free to suggest the proper place) to make this PR exposed to cpp reviewers.

We went back and forth on this. We considered leaving it somewhere in the python area since it serves only the python API. I think it's a good idea that it is owned by the c++ code owners though.

We need to update the install rules to also ship the ptx output

We're going to need these PTX files to come as part of the conda packages as well. Both of these issues need to be solved in #11319 as well which relies on the same pattern, so if we can solve them here that's great! :)

@quasiben
Copy link
Member

quasiben commented Aug 8, 2022

This is super cool to see! Can you also post one of the early benchmarks plot comparing performance?

@bdice bdice changed the title Groupby Apply with JIT (First Commit) Implement groupby apply with JIT Aug 16, 2022
Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial round of feedback attached. I offered a few specific solutions to a few challenges with re-using boilerplate code.

python/cudf/cudf/core/groupby/groupby.py Outdated Show resolved Hide resolved
python/cudf/cudf/core/groupby/groupby.py Outdated Show resolved Hide resolved
python/cudf/cudf/core/groupby/groupby.py Outdated Show resolved Hide resolved
python/cudf/cudf/core/groupby/groupby.py Outdated Show resolved Hide resolved
python/cudf/cudf/core/udf/function.cu Outdated Show resolved Hide resolved
python/cudf/cudf/core/udf/groupby_function.py Outdated Show resolved Hide resolved
python/cudf/cudf/core/udf/groupby_function.py Outdated Show resolved Hide resolved
python/cudf/cudf/core/udf/groupby_function.py Outdated Show resolved Hide resolved
python/cudf/cudf/core/udf/groupby_function.py Outdated Show resolved Hide resolved
python/cudf/cudf/core/udf/groupby_function.py Outdated Show resolved Hide resolved
@github-actions
Copy link

This PR has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this PR if it is no longer required. Otherwise, please respond with a comment indicating any updates. This PR will be labeled inactive-90d if there is no activity in the next 60 days.

@github-actions github-actions bot added the CMake CMake build issue label Sep 23, 2022
@PointKernel
Copy link
Member

/ok to test

@PointKernel PointKernel requested a review from vyasr January 27, 2023 16:19
vyasr
vyasr previously requested changes Jan 27, 2023
Copy link
Contributor

@vyasr vyasr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Temporarily blocking since there's ongoing discussion about my question about needing an atomic.

@PointKernel
Copy link
Member

/ok to test

@PointKernel PointKernel requested a review from vyasr January 27, 2023 18:02
@PointKernel
Copy link
Member

/ok to test

1 similar comment
@PointKernel
Copy link
Member

/ok to test

Copy link
Contributor

@robertmaynard robertmaynard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CMake changes LGTM

@shwina shwina dismissed vyasr’s stale review January 27, 2023 22:07

The discussion in #11452 (comment) is wrapped, so I think we can unblock now.

@shwina
Copy link
Contributor

shwina commented Jan 27, 2023

/merge

@shwina shwina added the 5 - Ready to Merge Testing and reviews complete, ready to merge label Jan 27, 2023
@rapids-bot rapids-bot bot merged commit 7695850 into rapidsai:branch-23.02 Jan 27, 2023
@brandon-b-miller
Copy link
Contributor

FANTASTIC work and great job @bwyogatama !

@bwyogatama
Copy link
Contributor Author

Thank you so much @brandon-b-miller and @PointKernel for helping with this PR! This PR would not be possible without the help of you two!
Thank you as well for all the reviewers on their reviews to improve the code @vyasr @bdice @wence- @shwina @robertmaynard @jrhemstad @davidwendt @ajschmidt8 !
And also special thanks to @gmarkall and @GregoryKimball for all the helps during my internship!

I am super glad that this feature can finally be merged and I am looking forward to seeing how this feature would expand in the future!

rapids-bot bot pushed a commit that referenced this pull request Feb 8, 2023
This PR enables doctests for some GroupBy methods that are not currently tested due to not meeting the inclusion criteria in our doctest class. This includes enabling tests for `GroupBy.apply` with `engine='jit'`. 

came up during #11452

Authors:
  - https://github.com/brandon-b-miller

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

URL: #12658
rapids-bot bot pushed a commit that referenced this pull request Feb 21, 2023
Prior to #11452 cuDF Python did not require CUDA for compilation. When libcudf was is found by CMake, however, it triggers a compilation of the C++ library, which does require CUDA for compilation. In order to support this behavior, we included some extra logic in cuDF's CMake to ensure that the appropriate CUDA architectures are compiled for (respecting the extra options like `RAPIDS` and `NATIVE` that `rapids-cmake` offers). However, with the merge of #11452 this conditional is now redundant because cuDF requires CUDA compilation unconditionally, so we can remove the extra code.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - AJ Schmidt (https://github.com/ajschmidt8)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #12758
rapids-bot bot pushed a commit that referenced this pull request Feb 22, 2023
With the merge of #11452 we have the machinery to build and deploy PTX libraries of shim functions as part of cuDF's build process. With this there is no reason to keep the `strings_udf` code separate anymore. This PR removes the separate package and all of it's related CI plumbing as well as supports the strings feature by default, just like GroupBy.

Authors:
  - https://github.com/brandon-b-miller
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Vyas Ramasubramani (https://github.com/vyasr)
  - AJ Schmidt (https://github.com/ajschmidt8)

URL: #12669
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team 5 - Ready to Merge Testing and reviews complete, ready to merge CMake CMake build issue feature request New feature or request non-breaking Non-breaking change Python Affects Python cuDF API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.