-
Notifications
You must be signed in to change notification settings - Fork 915
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix logic while parsing the sum statistic for numerical orc columns #9183
Fix logic while parsing the sum statistic for numerical orc columns #9183
Conversation
Codecov Report
@@ Coverage Diff @@
## branch-21.10 #9183 +/- ##
================================================
- Coverage 10.85% 10.80% -0.06%
================================================
Files 115 116 +1
Lines 19158 19318 +160
================================================
+ Hits 2080 2087 +7
- Misses 17078 17231 +153
Continue to review full report at Codecov.
|
rerun tests |
Seeing unrelated build errors:
rerun tests |
rerun tests |
rerun tests |
1 similar comment
rerun tests |
assert stripe_stats[0]["c"].get("sum") == minint64 + 1 | ||
|
||
|
||
def test_empty_statistics(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there be one more test to check all the statistics value for a proper table of all types of column with values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, right now the other read_orc_statistics test only checks for int and bool types. I'll add one/modify existing to include the different column types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the interest of time, I'll add one more test in a followup PR.
Rest looks good. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good on my end!
@gpucibot merge |
Fixes #9182.
In cases where the
sum
statistic was not present in the orc file for int and float columns, the values would be incorrectly interpreted as 0 because of protobuf's default values when fields are missing.This PR adds a check for field presence before assignment.