
Implement scan-based whole-frame aggregations for cudf-polars #16509

Merged
5 commits merged into rapidsai:feature/cudf-polars from the scan-aggs branch
Aug 20, 2024

Conversation

lithomas1
Contributor

@lithomas1 lithomas1 commented Aug 7, 2024

Description

contributes to #16478

This implements the scan-based whole-frame aggregations "cum_min", "cum_max", "cum_prod", and "cum_sum".
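For readers unfamiliar with the term, each of these four aggregations is an inclusive scan (running reduction) over a column. The snippet below is only a plain-Python illustration of those semantics on null-free data; it does not use cudf-polars or libcudf, and the function names are hypothetical stand-ins that simply mirror the Polars expression names.

```python
from itertools import accumulate
import operator


# Hypothetical pure-Python stand-ins for the four scans (not the cudf-polars code).
def cum_sum(values):
    return list(accumulate(values, operator.add))


def cum_prod(values):
    return list(accumulate(values, operator.mul))


def cum_min(values):
    return list(accumulate(values, min))


def cum_max(values):
    return list(accumulate(values, max))


data = [3, 1, 4, 1, 5]
print(cum_sum(data))   # running total
print(cum_prod(data))  # running product
print(cum_min(data))   # smallest value seen so far
print(cum_max(data))   # largest value seen so far
```

Null handling is deliberately left out of this sketch.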

"cum_count" is not implemented for now, since there's no exact libcudf match (I imagine the non-grouped case is also not used that much but haven't checked).
I suppose we could implement it by creating a column of 1s, copying the null mask over from the input column, and doing a cum_sum on that.
Let me know if you want to try that.
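A rough plain-Python sketch of that idea, for illustration only: nulls are modelled as None, the helper name cum_count_via_cum_sum is hypothetical, and the scan is assumed to skip nulls while leaving null positions null in the output. Whether that null behaviour matches what Polars expects from cum_count has not been checked here.

```python
def cum_count_via_cum_sum(values):
    """Sketch of cum_count built from a masked column of 1s and a cumulative sum."""
    # Step 1: a column of 1s with the same length as the input.
    ones = [1] * len(values)
    # Step 2: copy the null mask over, so null inputs become null "ones".
    masked = [one if v is not None else None for one, v in zip(ones, values)]
    # Step 3: cumulative sum that skips nulls (assumed scan behaviour).
    out = []
    running = 0
    for m in masked:
        if m is None:
            out.append(None)  # null stays null under this assumption
        else:
            running += m
            out.append(running)
    return out


print(cum_count_via_cum_sum([10, None, 30, 40]))
```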

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@lithomas1 lithomas1 added the "feature request" (New feature or request) and "non-breaking" (Non-breaking change) labels Aug 7, 2024
@github-actions github-actions bot added the "Python" (Affects Python cuDF API.) and "cudf.polars" (Issues specific to cudf.polars) labels Aug 7, 2024
@lithomas1 lithomas1 marked this pull request as ready for review August 7, 2024 23:15
@lithomas1 lithomas1 requested a review from a team as a code owner August 7, 2024 23:15
@lithomas1 lithomas1 requested review from vyasr and isVoid and removed request for a team August 7, 2024 23:15
@@ -26,7 +26,7 @@ def test_translation_assert_raises():
     class E(Exception):
         pass

-    unsupported = df.group_by("a").agg(pl.col("a").cum_max().alias("b"))
+    unsupported = df.group_by("a").agg(pl.col("a").upper_bound().alias("b"))
Contributor

This is fine for now, since the test was already doing this with a different function, but I'll think about how we can test the IR translation utility without having to chase a feature that is actually unsupported.


copy-pr-bot bot commented Aug 20, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@lithomas1
Contributor Author

/ok to test

@lithomas1
Contributor Author

/merge

@rapids-bot rapids-bot bot merged commit 152111b into rapidsai:feature/cudf-polars Aug 20, 2024
76 of 77 checks passed
@lithomas1 lithomas1 deleted the scan-aggs branch August 20, 2024 15:34
Labels
cudf.polars (Issues specific to cudf.polars), feature request (New feature or request), non-breaking (Non-breaking change), Python (Affects Python cuDF API.)
Projects
Status: Done

2 participants