Merge latest 24.04 #14893

Merged

vyasr
Contributor

@vyasr vyasr commented Jan 26, 2024

Description

I've opened this as a PR rather than just pushing the changes so that we can run the test suite for comparison and make sure that I didn't mess up anything significant in the merge.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

brandon-b-miller and others added 30 commits January 17, 2024 19:17
This PR fixes an issue where cuDF fails to import on machines with no NVIDIA GPU present. 

cc @shwina

Authors:
  - https://github.com/brandon-b-miller
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Ashwin Srinath (https://github.com/shwina)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: rapidsai#13690
…s to cuIO code (rapidsai#14665)

This refactors the ORC reader, moving ORC code around to facilitate the upcoming support for chunked reading of the input files.

This PR adds no new functionality or implementation. It only moves existing code around, apart from fixing a few small issues in the related ORC/cuIO code.

Authors:
  - Nghia Truong (https://github.com/ttnghia)
  - Vukasin Milovanovic (https://github.com/vuule)

Approvers:
  - Robert Maynard (https://github.com/robertmaynard)
  - Vukasin Milovanovic (https://github.com/vuule)
  - Bradley Dice (https://github.com/bdice)

URL: rapidsai#14665
This PR leverages [Breathe](https://breathe.readthedocs.io/en/latest/) to pull the cudf C++ API documentation into the Python Sphinx docs build, generating a single unified build of the documentation that supports cross-linking between language libraries and also simplifies cross-linking from other libraries that wish to link here.

This PR also revealed numerous other issues with our doxygen docs. I've submitted those as separate patches to control the diff here, but it's worth noting that Sphinx is much louder with warnings than doxygen and will help us avoid many more issues with broken documentation than doxygen alone could.

Resolves rapidsai#11481

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Ashwin Srinath (https://github.com/shwina)
  - Karthikeyan (https://github.com/karthikeyann)
  - David Wendt (https://github.com/davidwendt)

URL: rapidsai#13846
Reduced time from 90s to 25s on local system. Very few tests are impacted, and there should be no impact on code coverage.

Authors:
  - Vukasin Milovanovic (https://github.com/vuule)
  - Nghia Truong (https://github.com/ttnghia)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Karthikeyan (https://github.com/karthikeyann)
  - Mike Wilson (https://github.com/hyperbolic2346)
  - Nghia Truong (https://github.com/ttnghia)

URL: rapidsai#14750
Implements `cudf.MultiIndex.from_arrays`

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)

Approvers:
  - Michael Wang (https://github.com/isVoid)

URL: rapidsai#14740
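
For context on `from_arrays`: in the pandas API of the same name, the constructor takes one array per index level and pairs the arrays up row by row. The following is a minimal pure-Python sketch of that pairing only. `tuples_from_arrays` is a hypothetical helper written for illustration; it is not cuDF's implementation and does not use cuDF at all.

```python
from __future__ import annotations

from typing import Hashable, Sequence


def tuples_from_arrays(arrays: Sequence[Sequence[Hashable]]) -> list[tuple]:
    """Combine one array per level into one tuple per row (hypothetical helper)."""
    if not arrays:
        raise ValueError("must pass a non-empty list of arrays")
    length = len(arrays[0])
    # Every level must describe the same number of rows.
    if any(len(a) != length for a in arrays):
        raise ValueError("all arrays must be the same length")
    # zip(*arrays) transposes "one array per level" into "one tuple per row".
    return list(zip(*arrays))


rows = tuples_from_arrays([[1, 1, 2, 2], ["red", "blue", "red", "blue"]])
print(rows)  # [(1, 'red'), (1, 'blue'), (2, 'red'), (2, 'blue')]
```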
…ai#14755)

In the spirit of reducing redundant methods: `_from_columns` just calls `_from_data` (which I hope to rename to `_from_mapping` or similar), so this PR removes `_from_columns`.

I hope to do the same for `_from_columns_like_self` in a follow-up.

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)

Approvers:
  - Michael Wang (https://github.com/isVoid)

URL: rapidsai#14755
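
To illustrate why a wrapper like `_from_columns` is redundant, here is a small sketch with a toy class. The `Frame` class below and both method signatures are assumptions made for illustration and are not copied from cuDF; the only point is that a wrapper which builds a name-to-column mapping and forwards it adds nothing that callers cannot do in one line.

```python
class Frame:
    """Toy stand-in for a column container; not cuDF's Frame."""

    def __init__(self, data):
        self._data = dict(data)

    @classmethod
    def _from_data(cls, data):
        # The constructor that actually does the work: takes a name -> column mapping.
        return cls(data)

    @classmethod
    def _from_columns(cls, columns, column_names):
        # Redundant wrapper (assumed shape): zip names with columns, then delegate.
        return cls._from_data(dict(zip(column_names, columns)))


columns = [[1, 2], [3, 4]]
names = ["a", "b"]

via_wrapper = Frame._from_columns(columns, names)
# Callers can build the mapping themselves and call _from_data directly.
direct = Frame._from_data(dict(zip(names, columns)))
```

Both calls produce the same container, so dropping the wrapper leaves one construction path to maintain.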
Resolves issue [rapidsai#14716](rapidsai#14716)

- Eliminated unnecessary recursive self-calls in the `superimpose_nulls_no_sanitize` function, addressing performance issues in `make_structs_column`.
- Introduced `STRUCT_CREATION_NVBENCH` to assess the performance of the `create_structs_data` function.

Authors:
  - Suraj Aralihalli (https://github.com/SurajAralihalli)
  - Nghia Truong (https://github.com/ttnghia)

Approvers:
  - Nghia Truong (https://github.com/ttnghia)
  - David Wendt (https://github.com/davidwendt)

URL: rapidsai#14761
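
As background on what superimposing nulls means for nested struct columns: a child row should be null whenever the corresponding row of any ancestor is null. The sketch below is a plain-Python illustration of that idea using one call per column. The `Column` class and `superimpose_nulls` function are hypothetical; this does not reproduce libcudf's `superimpose_nulls_no_sanitize`, and the commit message above does not say which self-calls were redundant.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Column:
    """Hypothetical column: a per-row validity list plus optional struct children."""

    valid: list[bool]
    children: list[Column] = field(default_factory=list)


def superimpose_nulls(column: Column, parent_valid: list[bool] | None = None) -> Column:
    """Return a copy where a row is valid only if it and all its ancestors are valid."""
    if parent_valid is None:
        valid = list(column.valid)
    else:
        # Row-wise AND of this column's validity with the already-merged parent validity.
        valid = [own and parent for own, parent in zip(column.valid, parent_valid)]
    # Each child is processed exactly once, receiving the merged validity of its parent.
    children = [superimpose_nulls(child, valid) for child in column.children]
    return Column(valid, children)


leaf = Column([True, True, False, True])
inner = Column([True, False, True, True], [leaf])
outer = Column([False, True, True, True], [inner])

result = superimpose_nulls(outer)
print(result.children[0].valid)              # [False, False, True, True]
print(result.children[0].children[0].valid)  # [False, False, False, True]
```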
In my initial pass through enabling Breathe I tried to leave a minimal footprint of external modification to the generated files. In this particular case it looks like the problematic attributes can appear in more places than I originally observed. I never observed this behavior in that PR, but I don't know if these now appear due to something that merged after rapidsai#13846 or something else changing in the environment (e.g. a change in Sphinx or doxygen behavior). Nonetheless, blanket wiping these is the simpler and safer option. The docs build successfully locally with this change.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: rapidsai#14780
Forward-merge branch-24.02 to branch-24.04
Reverts changes from rapidsai#14756.

* updates `cudf`'s tests to be compatible with the latest `pytest-cases` ([version 3.8.2](https://pypi.org/project/pytest-cases/#history))
* puts a floor of `pytest-cases>=3.8.2` on that project to be sure older versions aren't used

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Matthew Roeschke (https://github.com/mroeschke)
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Ray Douglass (https://github.com/raydouglass)

URL: rapidsai#14764
Forward-merge branch-24.02 to branch-24.04
Additionally adds some typing, removes validation done by `cudf.dtype`, and adds a unit test to ensure NumPy dtype objects are accepted in the constructor.

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: rapidsai#14774
Forward-merge branch-24.02 to branch-24.04
Similar to rapidsai#14638, this uses `isinstance` when we know we are checking a dtype instance.

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: rapidsai#14641
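
A rough sketch of the `isinstance` point above: a generic dtype checker has to handle string aliases, classes, and instances, whereas code that already holds a dtype instance can check it directly. The classes and the `is_categorical_dtype` helper below are hypothetical stand-ins defined for this example, not cuDF's or pandas' own.

```python
class CategoricalDtype:
    """Hypothetical stand-in for a dtype class; not cuDF's CategoricalDtype."""

    name = "category"


class Int64Dtype:
    """Second hypothetical dtype, used only as a negative case."""

    name = "int64"


def is_categorical_dtype(obj) -> bool:
    """Generic check: accepts a string alias, a dtype class, or a dtype instance."""
    if isinstance(obj, str):
        return obj == CategoricalDtype.name
    if isinstance(obj, type):
        return issubclass(obj, CategoricalDtype)
    return isinstance(obj, CategoricalDtype)


dtype = CategoricalDtype()

# The generic helper walks through branches that cannot apply to an instance.
generic = is_categorical_dtype(dtype)
# When the value is known to be a dtype instance, a direct check says the same thing.
direct = isinstance(dtype, CategoricalDtype)
```

The direct form skips the string and class branches and makes the expected input type explicit at the call site.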
Forward-merge branch-24.02 to branch-24.04
Fixes deprecation warnings introduced when rapidsai#14202 merged.
Most of these are for calls to `cudf::make_strings_column`, whose chars-column function overload was deprecated.

Authors:
  - David Wendt (https://github.com/davidwendt)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: rapidsai#14771
Forward-merge branch-24.02 to branch-24.04
A new release of `pydata_sphinx_theme` [from last night](https://github.com/pydata/pydata-sphinx-theme/releases/tag/v0.15.2) includes pydata/pydata-sphinx-theme#1642, which marks the theme as unsafe for parallel writing.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Ray Douglass (https://github.com/raydouglass)
  - Bradley Dice (https://github.com/bdice)

URL: rapidsai#14796
Forward-merge branch-24.02 to branch-24.04
This PR adds `pynvjitlink` as a hard dependency for cuDF. This should allow for minor version compatibility (MVC) when launching numba kernels across minor versions of CUDA 12 up to the version of `nvjitlink` statically shipped with `pynvjitlink`.

cc @bdice

Authors:
  - https://github.com/brandon-b-miller
  - https://github.com/jakirkham
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Ray Douglass (https://github.com/raydouglass)

URL: rapidsai#14763
Forward-merge branch-24.02 to branch-24.04
This PR adds pip installation instructions to the README.

Authors:
  - Ashwin Srinath (https://github.com/shwina)
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: rapidsai#13677
Forward-merge branch-24.02 to branch-24.04
Aligns the constructor closer to `DatetimeIndex.__init__`: rapidsai#14774

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: rapidsai#14775
Forward-merge branch-24.02 to branch-24.04
Contributes to rapidsai#925. Introduces a `cuda_stream` parameter that downstream users can provide to `labeling_bins`.

Authors:
  - Danial Javady (https://github.com/ZelboK)
  - Bradley Dice (https://github.com/bdice)
  - Nghia Truong (https://github.com/ttnghia)
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Nghia Truong (https://github.com/ttnghia)
  - Bradley Dice (https://github.com/bdice)

URL: rapidsai#14401
…i#14779)

Moves the `cudf::char_utf8` definition from `cudf/strings/detail/utf8.hpp` to `cudf/types.hpp` since it is declared in the public namespace and used in public functions.

Reference: https://github.com/rapidsai/cudf/blob/9acddc08cc209e8d6b94891be6131edd63ff5b43/docs/cudf/source/conf.py#L372-L375

Authors:
  - David Wendt (https://github.com/davidwendt)
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Yunsong Wang (https://github.com/PointKernel)
  - Nghia Truong (https://github.com/ttnghia)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: rapidsai#14779
@vyasr vyasr added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Jan 26, 2024
@vyasr vyasr self-assigned this Jan 26, 2024
@vyasr vyasr requested review from a team as code owners January 26, 2024 01:29
@vyasr vyasr requested review from wence-, isVoid, mythrocks and nvdbaranec and removed request for a team January 26, 2024 01:29

copy-pr-bot bot commented Jan 26, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions github-actions bot added libcudf Affects libcudf (C++/CUDA) code. Python Affects Python cuDF API. CMake CMake build issue conda Java Affects Java cuDF API. labels Jan 26, 2024
@vyasr
Contributor Author

vyasr commented Jan 26, 2024

/ok to test

@vyasr vyasr marked this pull request as draft January 26, 2024 01:35
@vyasr
Contributor Author

vyasr commented Jan 26, 2024

Sorry to all the reviewers for the noise. This PR can be ignored by everyone except probably @mroeschke @galipremsagar and @shwina

@vyasr
Contributor Author

vyasr commented Jan 26, 2024

OK, it looks like the dask failures are something else. For some reason the new constraints are making the packages incompatible. I'll have to see what's going on there, but we can probably work on unblocking that independently of other efforts.

@vyasr
Contributor Author

vyasr commented Jan 26, 2024

The dask failures are rapidsai/dask-cuda#1306. I'll work on resolving that separately.

@galipremsagar galipremsagar merged commit 6368c47 into rapidsai:pandas_2.0_feature_branch Jan 26, 2024
27 of 32 checks passed
Labels
CMake CMake build issue improvement Improvement / enhancement to an existing function Java Affects Java cuDF API. libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change Python Affects Python cuDF API.