Merge branch-22.10 into branch-22.12 #11801

Merged

Conversation

davidwendt
Contributor

Description

Fixes the merge conflict found in #11789. The conflict was in cudf_dev_cuda11.5.yml and has been resolved there.
Followed the instructions at https://docs.rapids.ai/maintainers/gpuci/#forward-mergers to make this PR.

robertmaynard and others added 7 commits September 27, 2022 16:21
…11751)

With rapids-cmake now requiring CMake 3.23.1, update consumers to correctly express this requirement.

Authors:
  - Robert Maynard (https://github.com/robertmaynard)

Approvers:
  - Nghia Truong (https://github.com/ttnghia)
  - GALI PREM SAGAR (https://github.com/galipremsagar)
  - Ray Douglass (https://github.com/raydouglass)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: rapidsai#11751
…talls (rapidsai#11565)

After dask/dask#9367 was fixed upstream in dask, we had to bump the minimum version of dask to 2022.8.0 so that builds correctly fetch nightly packages (if the channel exists) or stable packages (if the `dask/dev` label doesn't exist). Without this fix, conda builds always picked up `2022.7.1`, and/or an environment could end up with a mix of nightly and stable packages.

This PR also does some cleanup and makes the `build.sh` script easier to maintain.
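
As a rough illustration of why the minimum-version bump matters, the sketch below compares calendar-style version strings and picks a package flavor. It is not taken from `build.sh`; the helper names (`parse_version`, `satisfies_minimum`, `pick_flavor`) are hypothetical, and only the version numbers come from the description above.

```python
# Illustrative only: these helpers are hypothetical and not part of build.sh.
MIN_DASK_VERSION = "2022.8.0"  # minimum version named in the description


def parse_version(version: str) -> tuple:
    """Turn a version such as '2022.8.0' into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split("."))


def satisfies_minimum(candidate: str, minimum: str = MIN_DASK_VERSION) -> bool:
    """Return True if the candidate version is at least the minimum."""
    return parse_version(candidate) >= parse_version(minimum)


def pick_flavor(dev_label_exists: bool) -> str:
    """Prefer nightly packages when the dev label exists, else fall back to stable."""
    return "nightly" if dev_label_exists else "stable"
```

With a floor of 2022.8.0, a solver can no longer settle on `2022.7.1`, which is the situation the description reports.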

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Charles Blackmon-Luca (https://github.com/charlesbluca)

URL: rapidsai#11565
…apidsai#11576)

This PR exposes an option to use Dask-CUDA's explicit-comms shuffle for the primary shuffle-based `dask_cudf.DataFrame` methods: `shuffle`, `sort_values`, and `set_index`. Although "explicit-comms" is still experimental, the explicit-shuffle algorithm is known to consistently outperform the "task"-based shuffle.

As far as I can tell, it is not currently possible to use an "explicit-comms" shuffle in `dask_cudf` without directly importing the function from Dask-CUDA (@madsbk - please do correct me if I am mistaken).  In order to simplify benchmarking, and to utilize the optimized shuffle within high-cardinality groupby code, I propose that we make it easier to access the explicit shuffle.
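
One way to picture "exposing an option" is a small resolver that validates the requested shuffle algorithm and falls back to a default. This is a minimal sketch under assumptions: `resolve_shuffle` and `SUPPORTED_SHUFFLES` are hypothetical names for illustration and are not the `dask_cudf` or Dask-CUDA API. The option strings are `"explicit-comms"` from the description and `"tasks"`, Dask's name for the task-based shuffle.

```python
from typing import Optional

# Hypothetical set of accepted option names (assumption for this sketch).
SUPPORTED_SHUFFLES = {"tasks", "explicit-comms"}


def resolve_shuffle(shuffle: Optional[str] = None, default: str = "tasks") -> str:
    """Pick a shuffle algorithm name, validating any user-supplied value."""
    choice = shuffle or default
    if choice not in SUPPORTED_SHUFFLES:
        raise ValueError(
            f"Unsupported shuffle {choice!r}; "
            f"expected one of {sorted(SUPPORTED_SHUFFLES)}"
        )
    return choice
```

A method such as `shuffle`, `sort_values`, or `set_index` could call a resolver like this once and then dispatch to the matching implementation, so callers would not need to import the explicit-comms function themselves.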

Authors:
  - Richard (Rick) Zamora (https://github.com/rjzamora)

Approvers:
  - Lawrence Mitchell (https://github.com/wence-)
  - Benjamin Zaitlen (https://github.com/quasiben)

URL: rapidsai#11576
…y` and update guide to UDFs notebook (rapidsai#11733)

This PR updates several cuDF docstrings with examples of how to use strings inside UDFs, along with some caveats. It also adds a section with details and examples to our guide to UDFs IPython notebook.

Authors:
  - https://github.com/brandon-b-miller

Approvers:
  - Ashwin Srinath (https://github.com/shwina)
  - Lawrence Mitchell (https://github.com/wence-)

URL: rapidsai#11733
…ltiple levels. (rapidsai#11779)

`row_bit_count` keeps track of a stack of "branches", each of which represents a span of rows to be included in the computed size. As you traverse a hierarchy of lists, that span of rows is maintained as a stack. The code that handled jumping out from the bottom of a stack to a new column wrongly assumed that the jump was only one level up.
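
The idea can be modeled in a few lines of Python. This is a simplified sketch and not the libcudf implementation: `FlatColumn` and `spans_for_row` are hypothetical names, columns are assumed to be flattened depth-first with an explicit `depth`, and list columns carry plain Python offsets. The point it shows is that when the next column is shallower, the stack must be unwound to that column's depth, which can mean popping several entries rather than one.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FlatColumn:
    """Hypothetical flattened-column record used only for this sketch."""
    name: str
    depth: int                           # nesting depth in the hierarchy
    offsets: Optional[List[int]] = None  # set only for list columns


def spans_for_row(columns: List[FlatColumn], row: int) -> List[Tuple[str, int, int]]:
    """Return the (name, start, end) element span each column contributes for one row."""
    stack = [(row, row + 1)]  # top of stack is the span for the current depth
    result = []
    for col in columns:
        # Leaving a nested branch may cross several levels at once, so pop
        # until the stack matches this column's depth (not just one entry).
        while len(stack) > col.depth + 1:
            stack.pop()
        start, end = stack[-1]
        result.append((col.name, start, end))
        if col.offsets is not None:
            # Children of a list column see the span selected by its offsets.
            stack.append((col.offsets[start], col.offsets[end]))
    return result


columns = [
    FlatColumn("A", 0, offsets=[0, 2, 3]),           # list<list<int>>
    FlatColumn("A.inner", 1, offsets=[0, 1, 3, 6]),  # inner list
    FlatColumn("A.leaf", 2),                         # int leaf
    FlatColumn("B", 0),                              # sibling top-level column
]
print(spans_for_row(columns, 1))
```

In this example, column `B` follows a leaf two levels down. Popping only one entry would leave the inner list's span on top of the stack, and `B` would be measured over the wrong rows.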

Authors:
  - https://github.com/nvdbaranec

Approvers:
  - Nghia Truong (https://github.com/ttnghia)
  - Mike Wilson (https://github.com/hyperbolic2346)
  - Alessandro Bellina (https://github.com/abellina)

URL: rapidsai#11779
@davidwendt davidwendt added 3 - Ready for Review Ready for review by team git improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Sep 28, 2022
@davidwendt davidwendt requested review from a team as code owners September 28, 2022 12:40
@github-actions github-actions bot added CMake CMake build issue conda Java Affects Java cuDF API. Python Affects Python cuDF API. libcudf Affects libcudf (C++/CUDA) code. labels Sep 28, 2022
@galipremsagar galipremsagar added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 3 - Ready for Review Ready for review by team labels Sep 28, 2022
@codecov
codecov bot commented Sep 28, 2022

Codecov Report

❗ No coverage uploaded for pull request base (branch-22.12@b8ab576).
Patch has no changes to coverable lines.

Additional details and impacted files
@@               Coverage Diff               @@
##             branch-22.12   #11801   +/-   ##
===============================================
  Coverage                ?   87.40%           
===============================================
  Files                   ?      133           
  Lines                   ?    21833           
  Branches                ?        0           
===============================================
  Hits                    ?    19084           
  Misses                  ?     2749           
  Partials                ?        0           

@raydouglass raydouglass merged commit 017d85f into rapidsai:branch-22.12 Sep 28, 2022
@davidwendt davidwendt deleted the branch-22.12-merge-22.10 branch September 28, 2022 18:05
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge CMake CMake build issue improvement Improvement / enhancement to an existing function Java Affects Java cuDF API. libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change Python Affects Python cuDF API.
8 participants