add V2 page header support to parquet reader #11778
Codecov Report
@@ Coverage Diff @@
## branch-22.12 #11778 +/- ##
===============================================
Coverage ? 88.14%
===============================================
Files ? 133
Lines ? 21982
Branches ? 0
===============================================
Hits ? 19376
Misses ? 2606
Partials ? 0
rerun tests
Co-authored-by: Vukasin Milovanovic <[email protected]>
Looks good overall, just worried about the synchronization.
Looks great!
Python changes LGTM. Minor nit that could be addressed
@gpucibot merge
Description
Adds support for reading parquet files with V2 page headers. Fixes #11686
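For readers unfamiliar with the format difference, here is a minimal sketch in Python of what a V2 data page looks like to a reader. It is not the code in this PR (the libcudf reader is C++/CUDA); it only illustrates the layout described in the Parquet format specification, where a V2 page stores repetition levels, then definition levels, then values, with the two level sections left uncompressed and their byte lengths recorded in the page header. The `DataPageHeaderV2` class below is a hand-written stand-in for the Thrift struct of the same name, and `split_v2_page` is a hypothetical helper name.

```python
from dataclasses import dataclass

# Page type values from the Parquet format's PageType enum.
DATA_PAGE = 0
DATA_PAGE_V2 = 3


@dataclass
class DataPageHeaderV2:
    """Stand-in for a subset of the Thrift DataPageHeaderV2 fields."""
    num_values: int
    num_nulls: int
    num_rows: int
    definition_levels_byte_length: int
    repetition_levels_byte_length: int
    is_compressed: bool = True


def split_v2_page(header, page_bytes):
    """Split a V2 page body into (repetition levels, definition levels, values).

    Hypothetical helper: the level sections come first and are never
    compressed, so they can be sliced off using the header lengths alone.
    """
    rep_len = header.repetition_levels_byte_length
    def_len = header.definition_levels_byte_length
    if rep_len + def_len > len(page_bytes):
        raise ValueError("level byte lengths exceed page size")
    rep_levels = page_bytes[:rep_len]
    def_levels = page_bytes[rep_len:rep_len + def_len]
    # Only this trailing section is compressed (when is_compressed is set).
    values = page_bytes[rep_len + def_len:]
    return rep_levels, def_levels, values
```

By contrast, a V1 data page compresses the whole page body, levels included, which is why a reader that only understands V1 headers cannot decode V2 pages.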
Submitting as draft for now because I'm not sure how to do unit tests for this. libcudf cannot produce files with V2 headers, so I would need to either add files to a data directory somewhere, or add raw binary of some parquet files to parquet_test.cpp. Given the comment on the DecimalRead test, neither seems attractive. Suggestions are welcome. Perhaps use Python to test?
Checklist