Implement chunked parquet reader in cudf-python #15728
/okay to test
@GregoryKimball This PR is ready for review, I'll add the chunked concat and then enable using chunked parquet reader in |
Thank you @galipremsagar! This looks like a great addition: the debut of chunked parquet reading in cudf python ❤️
Thanks for the PR!
Just a heads up:
Eventually, we'll probably want the binding for this to live in pylibcudf (so we'd need to rewrite the stuff added in this PR again at a later date).
Unfortunately, bindings for I/O haven't landed in the dev branch yet (I just started porting over a bunch of the classes we'd need for I/O, like TableWithMetadata, in #15899).
I think I'll be able to get around to this in a week or two, after my PR lands, but I think it's still OK to put this in before then, even if we need to rewrite it a bit for pylibcudf later.
@lithomas1 This is now ready for a re-review.
This looks good to me now (just one final comment).
I think someone else should probably take a look too, since I'm still pretty new to the codebase.
Thanks @galipremsagar
/merge
Description
Partially addresses: #14966
This PR implements chunked parquet reader bindings in cudf-python.
Checklist