Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement cudf-polars datetime extraction methods #16500

Merged
merged 15 commits into from
Sep 5, 2024

Conversation

lithomas1
Copy link
Contributor

Description

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@lithomas1 lithomas1 added feature request New feature or request non-breaking Non-breaking change cudf.polars Issues specific to cudf.polars labels Aug 6, 2024
@lithomas1 lithomas1 requested a review from a team as a code owner August 6, 2024 21:14
@lithomas1 lithomas1 requested review from bdice and galipremsagar and removed request for a team August 6, 2024 21:14
@github-actions github-actions bot added Python Affects Python cuDF API. pylibcudf Issues specific to the pylibcudf package labels Aug 6, 2024
Copy link

copy-pr-bot bot commented Aug 30, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@brandon-b-miller
Copy link
Contributor

/ok to test

@brandon-b-miller
Copy link
Contributor

/ok to test

@brandon-b-miller
Copy link
Contributor

/ok to test

@brandon-b-miller
Copy link
Contributor

/ok to test

Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally this seems on target! We will have some follow-up work to implement single-kernel solutions for micros / nanos but otherwise all seems fine.

python/cudf_polars/cudf_polars/dsl/expr.py Outdated Show resolved Hide resolved
"nanosecond",
]

unsupported_extract_fields = ["millennium"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we could extract this from years rather than not support it…

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a rough solution for maintaining 100% coverage of the code, specifically getting us to hit the error line that complains about a rust node that is exposed but that we don't support. Because of this, adding millennium would in some sense just kick the can down the road until we come to a day where all the rust nodes are supported in python as well and we can get rid of the error.

Instead of doing that I went with an approach that we liked in a similar situation where we monkeypatch things to trigger an error at the desired location, regardless of what we support. See the latest commits for details and let me know what you think.

Copy link
Contributor

@wence- wence- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few additional comments in the same vein as @bdice's

python/cudf_polars/cudf_polars/dsl/expr.py Outdated Show resolved Hide resolved
python/cudf_polars/cudf_polars/dsl/expr.py Outdated Show resolved Hide resolved
python/cudf_polars/cudf_polars/dsl/expr.py Outdated Show resolved Hide resolved
python/cudf_polars/cudf_polars/dsl/expr.py Outdated Show resolved Hide resolved
Co-authored-by: Bradley Dice <[email protected]>
Co-authored-by: Lawrence Mitchell <[email protected]>
python/cudf_polars/cudf_polars/dsl/expr.py Outdated Show resolved Hide resolved
python/cudf_polars/cudf_polars/dsl/expr.py Outdated Show resolved Hide resolved
python/cudf_polars/cudf_polars/dsl/expr.py Outdated Show resolved Hide resolved
python/cudf_polars/cudf_polars/dsl/expr.py Outdated Show resolved Hide resolved
python/cudf_polars/cudf_polars/dsl/expr.py Outdated Show resolved Hide resolved
python/cudf_polars/cudf_polars/dsl/expr.py Outdated Show resolved Hide resolved
python/cudf_polars/cudf_polars/dsl/expr.py Outdated Show resolved Hide resolved
python/cudf_polars/cudf_polars/dsl/translate.py Outdated Show resolved Hide resolved
python/cudf_polars/cudf_polars/dsl/translate.py Outdated Show resolved Hide resolved
@brandon-b-miller
Copy link
Contributor

/ok to test

@wence-
Copy link
Contributor

wence- commented Sep 4, 2024

Can fix the test that asserts that dtypes don't match for year extraction? https://github.com/rapidsai/cudf/actions/runs/10699704960/job/29662654826?pr=16500#step:8:432

@brandon-b-miller
Copy link
Contributor

/ok to test

# libcudf can lose data here.
# https://github.com/rapidsai/cudf/issues/16196
assert_gpu_result_equal(q)
q = ldf.select(field(pl.col("dates").dt))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now the dtypes should match, so we need to remove the check_dtypes=False part below

@brandon-b-miller
Copy link
Contributor

/ok to test

Copy link
Contributor

@wence- wence- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wence- wence- merged commit ccb8061 into rapidsai:feature/cudf-polars Sep 5, 2024
80 of 81 checks passed
@lithomas1 lithomas1 deleted the polars-dt-methods branch September 5, 2024 18:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cudf.polars Issues specific to cudf.polars feature request New feature or request non-breaking Non-breaking change pylibcudf Issues specific to the pylibcudf package Python Affects Python cuDF API.
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

4 participants