[FEA] Support for missing ORC column statistics #7087
hasNull
ORC column statistics
This issue has been marked stale due to no recent activity in the past 30d. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be marked rotten if there is no activity in the next 60d.
Adds the ability for ORC statistics reader to read the value `ColumnStatistics::hasNull`. Contributes to #7087. Does not close it because the issue also requires the ability to write the field in the ORC writer.

Authors:
- Devavret Makkar (https://github.com/devavret)

Approvers:
- Nghia Truong (https://github.com/ttnghia)
- Vukasin Milovanovic (https://github.com/vuule)

URL: #11747
Closes #7087, closes #13793, closes #13899

This PR adds support for several cases and statistics types:
- sum statistics are included even when all elements are null (no minmax);
- sum statistics are included in double stats;
- minimum/maximum and minimumNanos/maximumNanos are included in timestamp stats;
- `hasNull` field is written for all columns;
- decimal statistics.

Added tests for all supported stats.

Authors:
- Vukasin Milovanovic (https://github.com/vuule)
- Karthikeyan (https://github.com/karthikeyann)

Approvers:
- Lawrence Mitchell (https://github.com/wence-)
- Robert (Bobby) Evans (https://github.com/revans2)
- Vyas Ramasubramani (https://github.com/vyasr)
- Karthikeyan (https://github.com/karthikeyann)

URL: #13848
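To illustrate the first and fourth cases listed in the PR above, here is a minimal host-side sketch in Python of the expected shape of integer column statistics. It is not the cuDF implementation. The names `IntStats` and `int_column_stats` are hypothetical, and reporting the sum over zero valid elements as 0 is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class IntStats:
    """Hypothetical container mirroring the fields discussed in this issue."""
    number_of_values: int    # count of non-null elements
    has_null: bool           # True if at least one element is null
    minimum: Optional[int]   # absent when there are no valid elements
    maximum: Optional[int]   # absent when there are no valid elements
    total: int               # sum of valid elements; assumed 0 when all are null


def int_column_stats(values: Sequence[Optional[int]]) -> IntStats:
    # None stands in for a null element.
    valid = [v for v in values if v is not None]
    return IntStats(
        number_of_values=len(valid),
        has_null=len(valid) != len(values),
        minimum=min(valid) if valid else None,
        maximum=max(valid) if valid else None,
        total=sum(valid),  # still reported when min/max are absent
    )
```

For an all-null column such as `[None, None]`, this sketch yields no minimum or maximum but still carries `has_null` and a sum.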
The column statistics encoding in the writer is missing support for a few fields:
Also, the ProtobufReader does not support bool fields (needed to read the hasNull field without a Python API).
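For context on what reading a bool field involves: in the protobuf wire format, a bool is a varint (wire type 0) whose value is 0 or 1. The sketch below is a standalone Python illustration, not the cuDF `ProtobufReader`. The helper names are hypothetical, and the field number 10 used for `hasNull` in the usage note is an assumption based on the ORC protobuf schema.

```python
def read_varint(buf: bytes, pos: int):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):  # high bit clear marks the last byte
            return result, pos
        shift += 7


def read_bool_field(buf: bytes, field_number: int):
    """Scan one serialized message; return the bool field's value or None if absent."""
    pos = 0
    value = None
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x7
        if wire_type == 0:    # varint: bools are encoded here
            raw, pos = read_varint(buf, pos)
            if number == field_number:
                value = raw != 0
        elif wire_type == 1:  # fixed 64-bit, skipped
            pos += 8
        elif wire_type == 2:  # length-delimited, skipped
            length, pos = read_varint(buf, pos)
            pos += length
        elif wire_type == 5:  # fixed 32-bit, skipped
            pos += 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
    return value
```

With that assumed field number, `read_bool_field(bytes([0x08, 0x05, 0x50, 0x01]), 10)` decodes a message holding a varint in field 1 followed by `hasNull` set to true.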