Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle nans in groupby-aggregations in polars executor #16233

Merged
merged 9 commits into from
Jul 12, 2024

Conversation

wence-
Copy link
Contributor

@wence- wence- commented Jul 9, 2024

Description

Polars min and max by default ignore nans (treating them as nulls), to mimic this behaviour we must mask out nans before performing a min/max aggregation.

Do this by exposing nans_to_nulls in pylibcudf and implementing a with_mask method on pylibcudf Columns.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@wence- wence- added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Jul 9, 2024
@wence- wence- requested a review from a team as a code owner July 9, 2024 16:02
@wence- wence- requested review from Matt711 and lithomas1 July 9, 2024 16:02
@github-actions github-actions bot added Python Affects Python cuDF API. CMake CMake build issue cudf.polars Issues specific to cudf.polars pylibcudf Issues specific to the pylibcudf package labels Jul 9, 2024
@wence- wence- force-pushed the wence/fea/polars-nan-aggs branch from 587e373 to cff4fdc Compare July 9, 2024 16:56
Copy link
Contributor

@lithomas1 lithomas1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test in pylibcudf_tests as well?

Otherwise LGTM.

python/cudf/cudf/_lib/pylibcudf/transform.pyx Outdated Show resolved Hide resolved
@wence-
Copy link
Contributor Author

wence- commented Jul 12, 2024

Can you add a test in pylibcudf_tests as well?

Otherwise LGTM.

Done, thanks.

@wence-
Copy link
Contributor Author

wence- commented Jul 12, 2024

/merge

@rapids-bot rapids-bot bot merged commit 4fc8e79 into rapidsai:branch-24.08 Jul 12, 2024
81 checks passed
@wence- wence- deleted the wence/fea/polars-nan-aggs branch July 12, 2024 16:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CMake CMake build issue cudf.polars Issues specific to cudf.polars improvement Improvement / enhancement to an existing function non-breaking Non-breaking change pylibcudf Issues specific to the pylibcudf package Python Affects Python cuDF API.
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

2 participants