TEST Introduce Comprehensive Pathological Unit Tests for Issue #14409 #14467

aocsa · 2023-11-21T21:24:38Z

Description

This PR addresses issue #14409. I would like to propose adding unit tests that cover scenarios such as trees with 100 or 1000 elements, trees that reach 100 levels of depth, trees that mix different data types, and similar stress cases. The purpose of these tests is to exercise the Abstract Syntax Tree (AST) under stress, helping to identify and resolve potential issues.

By introducing these pathological tests, we aim to ensure the robustness and reliability of our codebase. These tests can help us uncover edge cases and performance bottlenecks that might otherwise go unnoticed.
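As a rough illustration of the kind of input these tests would generate, here is a minimal, self-contained sketch in plain Python. It does not use the cuDF or libcudf AST API: `Literal`, `Add`, `build_deep_tree`, `build_wide_tree`, and `evaluate` are hypothetical names introduced only for this example, and the real tests would build the library's own expression nodes instead.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


# Hypothetical node types for illustration only; not the libcudf AST classes.
@dataclass(frozen=True)
class Literal:
    value: Union[int, float]


@dataclass(frozen=True)
class Add:
    left: "Node"
    right: "Node"


Node = Union[Literal, Add]


def build_deep_tree(depth: int) -> Node:
    """Build a left-leaning chain (((1 + 1) + 1) + ...) with `depth` Add nodes."""
    node: Node = Literal(1)
    for _ in range(depth):
        node = Add(node, Literal(1))
    return node


def build_wide_tree(num_leaves: int) -> Node:
    """Build a roughly balanced tree over `num_leaves` (>= 1) leaves.

    Leaves alternate between int and float to mix data types.
    """
    level: List[Node] = [
        Literal(1 if i % 2 == 0 else 1.0) for i in range(num_leaves)
    ]
    while len(level) > 1:
        paired: List[Node] = [
            Add(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)
        ]
        if len(level) % 2:  # carry an unpaired trailing node up one level
            paired.append(level[-1])
        level = paired
    return level[0]


def evaluate(root: Node) -> Union[int, float]:
    """Evaluate iteratively (explicit stack) so deep trees avoid recursion limits."""
    stack: List[Tuple[Node, bool]] = [(root, False)]
    values: List[Union[int, float]] = []
    while stack:
        node, children_done = stack.pop()
        if isinstance(node, Literal):
            values.append(node.value)
        elif children_done:
            right = values.pop()
            left = values.pop()
            values.append(left + right)
        else:
            # Revisit this node after both children have been evaluated.
            stack.append((node, True))
            stack.append((node.right, False))
            stack.append((node.left, False))
    return values[0]
```

A test could then build a 100-level or 1000-level chain, or a 1000-leaf tree, and compare the evaluated result against a value computed independently.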

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.

Forward-merge branch-23.12 to branch-24.02

…idsai#14428) If we pass sort=True to merges we are on the hook to sort the result in order with respect to the key columns. If those key columns have repeated values there is still some space for ambiguity. Currently we get a result back whose order (for the repeated key values) is determined by the gather map that libcudf returns for the join. This does not come with any ordering guarantees. When sort=False, pandas has join-type dependent ordering guarantees which we also do not match. To fix this, in pandas-compatible mode only, reorder the gather maps according to the order of the input keys. When sort=False this means that our result matches pandas ordering. When sort=True, it ensures that (if we use a stable sort) the tie-break for equal sort keys is the input dataframe order. While we're here, switch from argsort + gather to sort_by_key when sorting results. - Closes rapidsai#14001 Authors: - Lawrence Mitchell (https://github.com/wence-) Approvers: - Ashwin Srinath (https://github.com/shwina) - Bradley Dice (https://github.com/bdice) URL: rapidsai#14428
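The ordering argument in this commit message can be sketched in plain Python. This is only an illustration under stated assumptions, not the cuDF implementation: `reorder_gather_map` and `stable_sort_by_key` are hypothetical helpers, the gather maps are modeled as two lists of row indices, and the reordering rule (left row index first, then right row index) is an assumption chosen to mimic "order of the input keys" for an inner join.

```python
from typing import List, Sequence, Tuple


def reorder_gather_map(
    left_map: Sequence[int], right_map: Sequence[int]
) -> Tuple[List[int], List[int]]:
    """Reorder matched row pairs by input position (left index, then right index).

    Hypothetical helper: models the join output as paired row indices that
    arrive in an arbitrary order.
    """
    order = sorted(
        range(len(left_map)), key=lambda i: (left_map[i], right_map[i])
    )
    return [left_map[i] for i in order], [right_map[i] for i in order]


def stable_sort_by_key(
    left_map: Sequence[int], right_map: Sequence[int], left_keys: Sequence[str]
) -> List[Tuple[int, int]]:
    """Sort matched pairs by key value only.

    Python's sorted() is stable, so pairs with equal keys keep the order
    they had on entry, which is the input order after reordering.
    """
    pairs = list(zip(left_map, right_map))
    return sorted(pairs, key=lambda pair: left_keys[pair[0]])


# Inner join of left keys ["b", "a", "b"] with right keys ["b", "b", "a"].
left_keys = ["b", "a", "b"]
# The same five matches, delivered in an arbitrary order.
unordered_left = [2, 0, 1, 2, 0]
unordered_right = [1, 1, 2, 0, 0]

ordered_left, ordered_right = reorder_gather_map(unordered_left, unordered_right)
sorted_pairs = stable_sort_by_key(ordered_left, ordered_right, left_keys)
```

With the maps reordered first, the sort=False case already follows the input order, and the sort=True case only has to apply a stable sort on the key for ties to fall back to that order.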

Forward-merge branch-23.12 to branch-24.02

…rapidsai#14444) Added the true/false string scalars to `column_to_strings_fn` so they are created once, instead of creating new scalars for each boolean column (using default stream). Authors: - Vukasin Milovanovic (https://github.com/vuule) Approvers: - Nghia Truong (https://github.com/ttnghia) - https://github.com/shrshi URL: rapidsai#14444

Forward-merge branch-23.12 to branch-24.02

`pandas.core` is technically private and methods could be moved at any time. This change avoids using it in places in the codebase where it can be avoided. Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Bradley Dice (https://github.com/bdice) - Lawrence Mitchell (https://github.com/wence-) URL: rapidsai#14421

* add devcontainers
* fix tag for CUDA 12.0
* use CUDA 11.8 for now
* default to CUDA 12.0
* install cuda-cupti-dev in conda environment
* remove MODIFY_PREFIX_PATH so the driver is found
* install cuda-nvtx-dev in conda environment
* update conda env
* add MODIFY_PREFIX_PATH back
* temporarily default to my branch with the fix for MODIFY_PREFIX_PATH in conda envs
* remove temporary rapids-cmake pin
* build all RAPIDS archs to take maximum advantage of sccache
* add clangd and nsight vscode customizations
* copy in default clangd config
* remove options for pip vs. conda unless using the launch script
* fix unified mounts
* ensure dirs exist before mounting
* add compile_commands to .gitignore
* allow defining cudf and cudf_kafka include dirs via envvars
* add kvikio
* use volumes for isolated devcontainer source dirs
* update README.md
* update to rapidsai/devcontainers 23.10
* update rapids-build-utils version to 23.10
* add .clangd config file
* update RAPIDS versions in devcontainer files
* ensure the directory for the generated jitify kernels exists after configuring
* add clang and clang-tools 16
* remove isolated and unified devcontainers, make single the default
* separate CUDA 11.8 and 12.0 devcontainers
* fix version string for requirements.txt
* update conda envs
* clean up envvars, mounts, and build args, add codespaces post-attach command workaround
* consolidate common vscode customizations
* enumerate CUDA 11 packages, include up to CUDA 12.2
* include protoc-wheel when generating requirements.txt
* default to cuda-python for cu11
* separate devcontainer mounts by CUDA version
* add devcontainer build jobs to PR workflow
* use pypi.nvidia.com instead of pypi.ngc.nvidia.com
* fix venvs mount path
* fix lint
* ensure rmm-cuXX is included in pip requirements
* disable libcudf_kafka build for now
* build dask-cudf
* be more explicit in update-versions.sh, make devcontainer build required in pr jobs
* revert rename devcontainer job
* install librdkafka-dev in pip containers so we can build libcudf_kafka and cudf_kafka
* separate cupy, cudf, and cudf_kafka matrices for CUDA 11 and 12
* add fallback include path for RMM
* fallback to CUDA_PATH if CUDA_HOME is not set
* define envvars in dockerfile
* define envvars for cudf_kafka
* build verbose
* include wheel and setuptools in requirements.txt
* switch workflow to branch-23.10
* update clang-tools version to 16.0.6
* fix update-version.sh
* Use 24.02 branches.
* fix version numbers
* Fix dependencies.yaml.
* Update .devcontainer/Dockerfile

---------

Co-authored-by: Bradley Dice <[email protected]>

Forward-merge branch-23.12 to branch-24.02

`volatile` should not be required in our code, unless there are compiler or synchronization issues. This PR removes its use in the Parquet reader and writer. Authors: - Vukasin Milovanovic (https://github.com/vuule) Approvers: - David Wendt (https://github.com/davidwendt) - Nghia Truong (https://github.com/ttnghia) URL: rapidsai#14448

Move `from_delayed` and `concat` to appropriate subsections. - Closes rapidsai#14299 Authors: - Lawrence Mitchell (https://github.com/wence-) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Bradley Dice (https://github.com/bdice) - Vyas Ramasubramani (https://github.com/vyasr) URL: rapidsai#14454

…tion (rapidsai#14381) `.column` used to always return `pd.Index([], dtype=object)` even if an empty column index with a specific dtype was passed into the DataFrame constructor, e.g. `DatetimeIndex([])`. We needed to preserve some information about which column dtype was passed in so we can return a correctly typed Index. Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Lawrence Mitchell (https://github.com/wence-) URL: rapidsai#14381

copy-pr-bot · 2023-11-21T21:24:42Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

raydouglass and others added 23 commits November 9, 2023 16:27

v24.02 Updates [skip ci]

deeabc9

Merge branch-23.12 into branch-24.02

d56a70f

Update cudf_kafka_version.

e4e6975

Merge pull request rapidsai#14422 from bdice/branch-24.02-merge-23.12

6b4deeb

Forward-merge branch-23.12 to branch-24.02

Merge pull request rapidsai#14406 from rapidsai/branch-23.12

427390f

Forward-merge branch-23.12 to branch-24.02

Merge branch-23.12 into branch-24.02

f593ae4

Merge pull request rapidsai#14426 from bdice/branch-24.02-merge-23.12

abca0f4

Forward-merge branch-23.12 to branch-24.02

Merge pull request rapidsai#14425 from rapidsai/branch-23.12

de4a51c

Forward-merge branch-23.12 to branch-24.02

Merge pull request rapidsai#14433 from rapidsai/branch-23.12

58d82b0

Forward-merge branch-23.12 to branch-24.02

Merge pull request rapidsai#14436 from rapidsai/branch-23.12

a51ab18

Forward-merge branch-23.12 to branch-24.02

Merge pull request rapidsai#14442 from rapidsai/branch-23.12

eb964fb

Forward-merge branch-23.12 to branch-24.02

Merge pull request rapidsai#14447 from rapidsai/branch-23.12

8917870

Forward-merge branch-23.12 to branch-24.02

Merge pull request rapidsai#14449 from rapidsai/branch-23.12

9971790

Forward-merge branch-23.12 to branch-24.02

Merge pull request rapidsai#14455 from rapidsai/branch-23.12

50a0413

Forward-merge branch-23.12 to branch-24.02

Merge pull request rapidsai#14457 from rapidsai/branch-23.12

c8603c2

Forward-merge branch-23.12 to branch-24.02

add unit test

66b5074

aocsa requested review from a team as code owners November 21, 2023 21:24

aocsa requested a review from wence- November 21, 2023 21:24

aocsa requested review from bdice and mythrocks November 21, 2023 21:24

github-actions bot added libcudf Affects libcudf (C++/CUDA) code. Python Affects Python cuDF API. CMake CMake build issue conda Java Affects Java cuDF API. labels Nov 21, 2023

aocsa marked this pull request as draft November 21, 2023 21:25

aocsa closed this Nov 21, 2023

aocsa changed the title ~~Introduce Comprehensive Pathological Unit Tests for Issue #14409~~ TEST Introduce Comprehensive Pathological Unit Tests for Issue #14409 Nov 22, 2023
