Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Parquet Reader options classes to pylibcudf #17464

Merged
merged 14 commits into from
Dec 6, 2024

Conversation

Matt711
Copy link
Contributor

@Matt711 Matt711 commented Nov 27, 2024

Description

Follow up of #17263, this PR adds the parquet reader options classes to pylibcudf and plumbs the changes through cudf python.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@Matt711 Matt711 added feature request New feature or request non-breaking Non-breaking change labels Nov 27, 2024
Copy link

copy-pr-bot bot commented Nov 27, 2024

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@github-actions github-actions bot added libcudf Affects libcudf (C++/CUDA) code. Python Affects Python cuDF API. pylibcudf Issues specific to the pylibcudf package labels Nov 27, 2024
@@ -410,6 +410,7 @@ class parquet_reader_options_builder {
*
* @param val Boolean value whether to read matching projected and filter columns from mismatched
* Parquet sources.
*
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the reviewer: I couldn't find this function in the docs, so I thought it may not be rendering correctly without this new line. I confirmed that you can see the function in the docs with it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, thanks for catching this! We usually don’t include a blank line between @param and @return, but it might be time to change that.

@Matt711 Matt711 marked this pull request as ready for review December 3, 2024 18:41
@Matt711 Matt711 requested review from a team as code owners December 3, 2024 18:41
Copy link

copy-pr-bot bot commented Dec 3, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@Matt711
Copy link
Contributor Author

Matt711 commented Dec 3, 2024

/ok to test

@github-actions github-actions bot added the cudf.polars Issues specific to cudf.polars label Dec 4, 2024
@Matt711 Matt711 added the 3 - Ready for Review Ready for review by team label Dec 4, 2024
@Matt711
Copy link
Contributor Author

Matt711 commented Dec 4, 2024

/ok to test

@Matt711
Copy link
Contributor Author

Matt711 commented Dec 4, 2024

/ok to test

@Matt711
Copy link
Contributor Author

Matt711 commented Dec 4, 2024

/ok to test

@Matt711
Copy link
Contributor Author

Matt711 commented Dec 5, 2024

Note: The telemetry job failure is non-blocking

Copy link
Member

@PointKernel PointKernel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the doc fix.

@@ -410,6 +410,7 @@ class parquet_reader_options_builder {
*
* @param val Boolean value whether to read matching projected and filter columns from mismatched
* Parquet sources.
*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, thanks for catching this! We usually don’t include a blank line between @param and @return, but it might be time to change that.

Copy link
Contributor

@mythrocks mythrocks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not well versed with the Python side of the project, but this looks mostly good to my eyes. Thank you for this change.

I have a couple of requests for clarification, and a minor suggestion to fix a copy/paste error.

python/cudf/cudf/_lib/parquet.pyx Show resolved Hide resolved
python/cudf_polars/cudf_polars/dsl/ir.py Outdated Show resolved Hide resolved
python/pylibcudf/pylibcudf/io/parquet.pyx Outdated Show resolved Hide resolved
Matt711 and others added 2 commits December 5, 2024 13:51
@Matt711
Copy link
Contributor Author

Matt711 commented Dec 6, 2024

/ok to test

@Matt711 Matt711 requested a review from mythrocks December 6, 2024 00:14
@Matt711
Copy link
Contributor Author

Matt711 commented Dec 6, 2024

/merge

Copy link
Contributor

@mythrocks mythrocks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for accommodating the review suggestions. 👍

@mythrocks
Copy link
Contributor

Might need a rebase.

@Matt711
Copy link
Contributor Author

Matt711 commented Dec 6, 2024

/ok to test

@rapids-bot rapids-bot bot merged commit cbeefd8 into rapidsai:branch-25.02 Dec 6, 2024
107 of 108 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team cudf.polars Issues specific to cudf.polars feature request New feature or request libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change pylibcudf Issues specific to the pylibcudf package Python Affects Python cuDF API.
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

5 participants