
[RELEASE] cudf v24.02 #14901

Merged
merged 213 commits into main from branch-24.02 on Feb 12, 2024

Conversation

raydouglass
Member

❄️ Code freeze for branch-24.02 and v24.02 release

What does this mean?

Only critical/hotfix level issues should be merged into branch-24.02 until release (merging of this PR).

What is the purpose of this PR?

  • Update documentation
  • Allow testing for the new release
  • Enable a means to merge branch-24.02 into main for the release

raydouglass and others added 30 commits November 9, 2023 16:27
Forward-merge branch-23.12 to branch-24.02
Forward-merge branch-23.12 to branch-24.02
Forward-merge branch-23.12 to branch-24.02
Forward-merge branch-23.12 to branch-24.02
Forward-merge branch-23.12 to branch-24.02
Forward-merge branch-23.12 to branch-24.02

If we pass sort=True to merges we are on the hook to sort the result in order with respect to the key columns. If those key columns have repeated values there is still some space for ambiguity. Currently we get a result back whose order (for the repeated key values) is determined by the gather map that libcudf returns for the join. This does not come with any ordering guarantees.

When sort=False, pandas has join-type dependent ordering guarantees which we also do not match. To fix this, in pandas-compatible mode only, reorder the gather maps according to the order of the input keys. When sort=False this means that our result matches pandas ordering. When sort=True, it ensures that (if we use a stable sort) the tie-break for equal sort keys is the input dataframe order.

While we're here, switch from argsort + gather to sort_by_key when sorting results.
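A minimal sketch of the resulting guarantee in pandas-compatible mode (illustrative, not from this PR's diff):

```python
import cudf

cudf.set_option("mode.pandas_compatible", True)

left = cudf.DataFrame({"key": [1, 2, 1], "a": [0, 1, 2]})
right = cudf.DataFrame({"key": [1, 1], "b": [10, 11]})

# With sort=True the result is ordered by key; ties between equal key
# values follow the input dataframe order (the stable-sort tie-break).
print(left.merge(right, on="key", sort=True))
```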

- Closes #14001

Authors:
  - Lawrence Mitchell (https://github.com/wence-)

Approvers:
  - Ashwin Srinath (https://github.com/shwina)
  - Bradley Dice (https://github.com/bdice)

URL: #14428
Forward-merge branch-23.12 to branch-24.02
Forward-merge branch-23.12 to branch-24.02
…#14444)

Added the true/false string scalars to `column_to_strings_fn` so they are created once, instead of creating new scalars (on the default stream) for each boolean column.

Authors:
  - Vukasin Milovanovic (https://github.com/vuule)

Approvers:
  - Nghia Truong (https://github.com/ttnghia)
  - https://github.com/shrshi

URL: #14444
Forward-merge branch-23.12 to branch-24.02
`pandas.core` is technically private and its methods could be moved at any time. This change avoids using them wherever they can be avoided in the codebase.
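For example, the preferred pattern uses the public `pandas.api` surface (illustrative, not a diff from this PR):

```python
import pandas as pd
from pandas.api.types import is_integer_dtype  # public API

# rather than: from pandas.core.dtypes.common import is_integer_dtype  # private

print(is_integer_dtype(pd.Series([1, 2, 3]).dtype))  # True
```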

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Lawrence Mitchell (https://github.com/wence-)

URL: #14421
* add devcontainers

* fix tag for CUDA 12.0

* use CUDA 11.8 for now

* default to CUDA 12.0

* install cuda-cupti-dev in conda environment

* remove MODIFY_PREFIX_PATH so the driver is found

* install cuda-nvtx-dev in conda environment

* update conda env

* add MODIFY_PREFIX_PATH back

* temporarily default to my branch with the fix for MODIFY_PREFIX_PATH in conda envs

* remove temporary rapids-cmake pin

* build all RAPIDS archs to take maximum advantage of sccache

* add clangd and nsight vscode customizations

* copy in default clangd config

* remove options for pip vs. conda unless using the launch script

* fix unified mounts

* ensure dirs exist before mounting

* add compile_commands to .gitignore

* allow defining cudf and cudf_kafka include dirs via envvars

* add kvikio

* use volumes for isolated devcontainer source dirs

* update README.md

* update to rapidsai/devcontainers 23.10

* update rapids-build-utils version to 23.10

* add .clangd config file

* update RAPIDS versions in devcontainer files

* ensure the directory for the generated jitify kernels exists after configuring

* add clang and clang-tools 16

* remove isolated and unified devcontainers, make single the default

* separate CUDA 11.8 and 12.0 devcontainers

* fix version string for requirements.txt

* update conda envs

* clean up envvars, mounts, and build args, add codespaces post-attach command workaround

* consolidate common vscode customizations

* enumerate CUDA 11 packages, include up to CUDA 12.2

* include protoc-wheel when generating requirements.txt

* default to cuda-python for cu11

* separate devcontainer mounts by CUDA version

* add devcontainer build jobs to PR workflow

* use pypi.nvidia.com instead of pypi.ngc.nvidia.com

* fix venvs mount path

* fix lint

* ensure rmm-cuXX is included in pip requirements

* disable libcudf_kafka build for now

* build dask-cudf

* be more explicit in update-versions.sh, make devcontainer build required in pr jobs

* revert rename devcontainer job

* install librdkafka-dev in pip containers so we can build libcudf_kafka and cudf_kafka

* separate cupy, cudf, and cudf_kafka matrices for CUDA 11 and 12

* add fallback include path for RMM

* fallback to CUDA_PATH if CUDA_HOME is not set

* define envvars in dockerfile

* define envvars for cudf_kafka

* build verbose

* include wheel and setuptools in requirements.txt

* switch workflow to branch-23.10

* update clang-tools version to 16.0.6

* fix update-version.sh

* Use 24.02 branches.

* fix version numbers

* Fix dependencies.yaml.

* Update .devcontainer/Dockerfile

---------

Co-authored-by: Bradley Dice <[email protected]>
Forward-merge branch-23.12 to branch-24.02
Forward-merge branch-23.12 to branch-24.02
`volatile` should not be required in our code, unless there are compiler or synchronization issues.
This PR removes the use in Parquet reader and writer.

Authors:
  - Vukasin Milovanovic (https://github.com/vuule)

Approvers:
  - David Wendt (https://github.com/davidwendt)
  - Nghia Truong (https://github.com/ttnghia)

URL: #14448
Move `from_delayed` and `concat` to appropriate subsections.

- Closes #14299

Authors:
  - Lawrence Mitchell (https://github.com/wence-)
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #14454
…tion (#14381)

`.columns` used to always return `pd.Index([], dtype=object)` even if an empty, typed column index was passed into the DataFrame constructor, e.g. `DatetimeIndex([])`. We need to preserve some information about what column dtype was passed in so we can return a correctly typed Index.
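A sketch of the behavior this fixes (assuming the empty `DatetimeIndex` columns case from the description):

```python
import cudf
import pandas as pd

# Construct an empty DataFrame whose columns are a typed, empty Index:
df = cudf.DataFrame(columns=pd.DatetimeIndex([]))

# The column dtype is now preserved instead of collapsing to object:
print(df.columns.dtype)  # datetime64[ns]
```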

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)

Approvers:
  - Lawrence Mitchell (https://github.com/wence-)

URL: #14381
Add stream parameter to public APIs:

- `nvtext::is_letter()`
- `nvtext::porter_stemmer_measure()`
- `nvtext::edit_distance()` 
- `nvtext::edit_distance_matrix()`

Also cleaned up some of the doxygen comments and added stream gtests.

Reference #13744

Authors:
  - David Wendt (https://github.com/davidwendt)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Mark Harris (https://github.com/harrism)

URL: #14456
A count aggregation should always return an int64 column, even if the grouped dataframe is empty. Previously we did not do this because the short-circuiting for empty inputs was hit before handling the count case. Fix this by reordering the conditions.
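A minimal sketch of the expected behavior (illustrative):

```python
import cudf

df = cudf.DataFrame({
    "key": cudf.Series([], dtype="int64"),
    "val": cudf.Series([], dtype="float64"),
})

# count must produce int64 even when the grouped frame is empty:
print(df.groupby("key").count()["val"].dtype)  # int64
```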

- Closes #14200

Authors:
  - Lawrence Mitchell (https://github.com/wence-)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - GALI PREM SAGAR (https://github.com/galipremsagar)

URL: #14473
…14475)

Removes a non-empty null entry from a test strings column utility in `rank_tests.cpp`.
Behavior with unsanitized nulls is at best UB and should not be included in unit tests.

Authors:
  - David Wendt (https://github.com/davidwendt)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #14475
A couple links in the README redirect to other pages. This PR replaces those links with the destination to which they redirect.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Ashwin Srinath (https://github.com/shwina)

URL: #14378

This PR addresses the issue at #14409. I would like to propose the addition of unit tests that involve scenarios like having 100 or 1000 elements in a tree, reaching 100 levels of depth, with different data types, and similar stress tests. The purpose of these tests is to conduct comprehensive testing and stress the Abstract Syntax Tree (AST), ultimately aiding in the identification and resolution of any potential issues.

By introducing these pathological tests, we aim to ensure the robustness and reliability of our codebase. These tests can help us uncover edge cases and performance bottlenecks that might otherwise go unnoticed.
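The new tests are C++ gtests; as a rough illustration, the same expression machinery can be stressed from Python via `DataFrame.eval` (assuming it routes through libcudf's AST evaluation):

```python
import cudf

df = cudf.DataFrame({"a": range(100), "b": range(100)})

# A deeply nested expression builds a large expression tree:
expr = "a" + " + b" * 100
print(df.eval(expr).head())
```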

Authors:
  - Alexander Ocsa (https://github.com/aocsa)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Bradley Dice (https://github.com/bdice)

URL: #14459
Previously, a number of algorithms on Columns first converted to a single-column frame and called a frame-based algorithm (which calls directly into libcudf using the column we first thought of). This is unnecessary since we already have the column to hand when calling the same algorithm at the column level. Moreover, in many cases where the algorithm is a user-facing API, the frame-based approach does more work (for example conversions and dtype matching).

By removing this round trip we reduce some (unnecessary) overhead, and also make the memory footprint and behaviour of column-based methods more transparent.

- Closes #13565

Authors:
  - Lawrence Mitchell (https://github.com/wence-)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)
  - Bradley Dice (https://github.com/bdice)

URL: #14491
…ctive (#14474)

Closes #14471.

This PR makes `Timestamp` objects picklable by registering a custom reducer for `pd.Timestamp` objects when `cudf.pandas` is active.
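A minimal sketch of the technique (illustrative only; the actual reducer registered by `cudf.pandas` may preserve more state):

```python
import copyreg
import pickle
import pandas as pd

def _reduce_timestamp(ts):
    # Rebuild from the ISO string; tz offsets round-trip through it.
    return (pd.Timestamp, (ts.isoformat(),))

copyreg.pickle(pd.Timestamp, _reduce_timestamp)

ts = pd.Timestamp("2024-02-12", tz="UTC")
assert pickle.loads(pickle.dumps(ts)) == ts
```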

Authors:
  - Ashwin Srinath (https://github.com/shwina)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #14474
This PR fixes some errors in the doxygen docs and adds groups for some items that were previously missing altogether.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - David Wendt (https://github.com/davidwendt)

URL: #14469
mroeschke and others added 7 commits January 23, 2024 16:19
* Adding validation to the `closed` and `dtype` arguments in `IntervalIndex.__init__` (see the sketch after this list)
* Ensure `closed` attribute always maps to `IntervalDtype.closed`
* `build_interval_column` is no longer necessary since `IntervalColumn` is now used directly
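A sketch of the `closed` validation (assuming it surfaces as a `ValueError`):

```python
import cudf

# closed must be one of "left", "right", "both", "neither":
idx = cudf.IntervalIndex.from_breaks([0, 1, 2], closed="left")

# An invalid value now raises instead of being silently accepted:
try:
    cudf.IntervalIndex.from_breaks([0, 1, 2], closed="sideways")
except ValueError as err:
    print(err)
```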

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #14778
Adds CI checks so that libcudf doesn't reintroduce weak/external CUDA kernels.

Authors:
  - Robert Maynard (https://github.com/robertmaynard)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Ray Douglass (https://github.com/raydouglass)

URL: #14768
For methods that essentially do

```python
def select_by_foo(self, keys):  # keys: illustrative parameter
    ...  # selection logic elided
    # return a new accessor over the chosen subset of columns
    return self.__class__(data={k: self._data[k] for k in keys})
```

The `return` would perform validation on the returned subset of columns, but I think that's unnecessary since that was already done during initialization.

Additionally
* Removed `_create_unsafe` in favor of a `verify=True|False` keyword in the constructor
* `_column_length` == `nrows` so removed `_column_length`
* Renamed `_compare_keys` to `_keys_equal`
* Remove seldom used/unnecessary methods

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #14758
The goal of this PR is to address [10004](#10004) by supporting parsing of JSON files containing single quotes for field/value strings. This is follow-up work to the POC [PR 14545](#14545).
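A toy model of the normalization (purely illustrative; the real implementation is a libcudf FST, and edge cases such as `\'` escapes are handled there):

```python
def normalize_single_quotes(buf: str) -> str:
    """Toy normalizer: rewrite single-quoted JSON strings as double-quoted."""
    out = []
    in_single = in_double = escaped = False
    for ch in buf:
        if escaped:                       # pass escaped characters through
            out.append(ch)
            escaped = False
        elif ch == "\\":
            out.append(ch)
            escaped = True
        elif ch == "'" and not in_double:
            out.append('"')               # re-quote the string boundary
            in_single = not in_single
        elif ch == '"' and in_single:
            out.append('\\"')             # escape embedded double quotes
        elif ch == '"':
            out.append(ch)
            in_double = not in_double
        else:
            out.append(ch)
    return "".join(out)

print(normalize_single_quotes("{'a': 'say \"hi\"', 'b': 1}"))
# {"a": "say \"hi\"", "b": 1}
```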

Authors:
  - Shruti Shivakumar (https://github.com/shrshi)

Approvers:
  - Andy Grove (https://github.com/andygrove)
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Vukasin Milovanovic (https://github.com/vuule)
  - Elias Stehle (https://github.com/elstehle)
  - Robert (Bobby) Evans (https://github.com/revans2)

URL: #14729
closes #14270

Implementation of sub-rowgroup reading of Parquet files.  This PR implements an additional layer on top of the existing chunking system.  Currently, the reader takes two parameters:  `input_pass_read_limit` which specifies a limit on temporary memory usage when reading and decompressing file data;  and `output_pass_read_limit` which specifies a limit on how large an output chunk (a table) can be.

Currently when the user specifies a limit via `input_pass_read_limit`, the reader will perform multiple `passes` over the file at row-group granularity.  That is, it will control how many row groups it will read at once to conform to the specified limit.

However, there are cases where this is not sufficient.  So this PR changes things so that we now have `subpasses` below the top level `passes`.  It works as follows:

- We read a set of input chunks based on the `input_pass_read_limit` but we do not decompress them immediately. This constitutes a `pass`.
- Within each pass of compressed data, we progressively decompress batches of pages as `subpasses`.
- Within each `subpass` we apply the output limit to produce `chunks`.

So the overall structure of the reader is:  (read) `pass` -> (decompress) `subpass` -> (decode) `chunk`

Major sections of code changes:

- Previously the incoming page data in the file was unsorted. To handle this we later on produced a `page_index` that could be applied to the array to get them in schema-sorted order.  This was getting very unwieldy so I just sort the pages up front now and the `page_index` array has gone away.

- There are now two sets of pages to be aware of in the code.  Within each `pass_intermediate_data` there is the set of all pages within the current set of loaded row groups.  And then within the `subpass_intermediate_data` struct there is a separate array of pages representing the current batch of decompressed data we are processing.  To keep the confusion down I changed a good amount of code to always reference its array through its associated struct, i.e. `pass.pages` or `subpass.pages`. In addition, I removed the `page_info` from `ColumnChunkDesc` to help prevent the kernels from getting confused. `ColumnChunkDesc` now only has a `dict_page` field which is constant across all subpasses.

- The primary entry point for the chunking mechanism is in `handle_chunking`. Here we iterate through passes, subpasses and output chunks.  Successive subpasses are computed and preprocessed through here. 

- The volume of diffs you'll see in `reader_impl_chunking.cu` is a little deceptive.  A lot of this is just functions (or pieces of functions) that have been moved over from either `reader_impl_preprocess.cu` or `reader_impl_helpers.cpp`.   The most relevant actual changes are in: `handle_chunking`, `compute_input_passes`, `compute_next_subpass`, and `compute_chunks_for_subpass`.

Note on tests:   I renamed `parquet_chunked_reader_tests.cpp` to `parquet_chunked_reader_test.cu` as I needed to use thrust. The only actual changes in the file are the addition of the `ParquetChunkedReaderInputLimitConstrainedTest` and `ParquetChunkedReaderInputLimitTest` test suites at the bottom.
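A toy model of the pass → subpass → chunk structure described above (illustrative Python, not the libcudf implementation; "pages" here are just `(compressed, decompressed)` size pairs):

```python
def chunked_read(pages, input_pass_read_limit, output_pass_read_limit):
    def batches(items, size_of, limit):
        # Greedily group items so each batch stays under the size limit.
        batch, total = [], 0
        for item in items:
            if batch and total + size_of(item) > limit:
                yield batch
                batch, total = [], 0
            batch.append(item)
            total += size_of(item)
        if batch:
            yield batch

    # A "pass" loads compressed pages under the input limit...
    for pass_pages in batches(pages, lambda p: p[0], input_pass_read_limit):
        # ...each "subpass" decompresses a batch of those pages...
        for subpass in batches(pass_pages, lambda p: p[1], input_pass_read_limit):
            # ...and output "chunks" (tables) are capped by the output limit.
            yield from batches(subpass, lambda p: p[1], output_pass_read_limit)

pages = [(10, 40)] * 8
for chunk in chunked_read(pages, input_pass_read_limit=60, output_pass_read_limit=80):
    print(chunk)
```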

Authors:
  - https://github.com/nvdbaranec
  - Nghia Truong (https://github.com/ttnghia)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Nghia Truong (https://github.com/ttnghia)
  - Vukasin Milovanovic (https://github.com/vuule)

URL: #14360
This PR fixes an error in `Index.difference` where the function keeps duplicate elements while pandas removes the duplicates. The tests had no inputs with duplicates, so I added new tests too (I added the test from the original issue). 
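A minimal illustration of the fixed behavior:

```python
import cudf

idx = cudf.Index([1, 1, 2, 3, 3])
other = cudf.Index([3])

# Duplicates are now dropped, matching pandas:
print(idx.difference(other))  # Index([1, 2], dtype='int64')
```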

- closes #14489

Authors:
  - AmirAli Mirian (https://github.com/amiralimi)
  - Ashwin Srinath (https://github.com/shwina)

Approvers:
  - Lawrence Mitchell (https://github.com/wence-)

URL: #14789
…hunking. (#14889)

Fixes  #14883

The core issue was that the output chunking code was expecting all columns to have terminating pages that end in the same row count.  Previously this was the case because we always processed entire row groups.  But now with the subrowgroup reader, we can split on page boundaries that cause a jagged max row index for different columns.  Example:

```
             0       100             200
Col A     [-----------][--------------]      300
Col B     [-----------][----------------------]
```

The input chunking would have computed a max row index of 200 for the subpass.  But when computing the _output_ chunks, there was code that would have tried finding where row 300 was in column A, resulting in an out-of-bounds read.

The fix is simply to cap the max row seen for column B to be the max expected row for the subpass.

Authors:
  - https://github.com/nvdbaranec

Approvers:
  - Nghia Truong (https://github.com/ttnghia)
  - Vukasin Milovanovic (https://github.com/vuule)
  - Mike Wilson (https://github.com/hyperbolic2346)

URL: #14889

copy-pr-bot bot commented Jan 26, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.



@github-actions github-actions bot added the libcudf (Affects libcudf (C++/CUDA) code), Python (Affects Python cuDF API), CMake (CMake build issue), conda, and Java (Affects Java cuDF API) labels on Jan 26, 2024
SurajAralihalli and others added 5 commits January 29, 2024 13:34
This pull request reverses the modifications made to the sum/product aggregation target type, ensuring it always produces int64. The changes implemented by PR [14679](#14679), which led to degraded performance when the aggregation column had an unsigned type, are reverted. Additional details can be found in issue [14886](#14886).
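A sketch of the restored behavior (assuming the int64 target described above):

```python
import cudf

df = cudf.DataFrame({
    "k": [0, 0, 1],
    "v": cudf.Series([1, 2, 3], dtype="uint32"),
})

# sum on an integral column targets int64 again:
print(df.groupby("k")["v"].sum().dtype)  # int64
```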

Authors:
   - Suraj Aralihalli (https://github.com/SurajAralihalli)

Approvers:
   - David Wendt (https://github.com/davidwendt)
   - Nghia Truong (https://github.com/ttnghia)
   - Karthikeyan (https://github.com/karthikeyann)
Closes #14932

ORC writer uses uncompressed stream sizes when allocating the bounce buffer. This can lead to issues when all uncompressed streams are larger than the GDS threshold, but the compressed sizes are not. In this scenario, the bounce buffer is not allocated, and writing the compressed stream through the bounce buffer causes a crash.

This PR moves the computation of the bounce buffer size to after compression, so it uses the correct stream sizes.

Authors:
   - Vukasin Milovanovic (https://github.com/vuule)

Approvers:
   - Nghia Truong (https://github.com/ttnghia)
   - Bradley Dice (https://github.com/bdice)
@raydouglass raydouglass merged commit 379584c into main Feb 12, 2024
4 checks passed