Fix Parquet decimal64 stats #15281

Merged
merged 3 commits into from
Mar 18, 2024

Contributor

@etseidl etseidl commented Mar 12, 2024

Description

In the Parquet writer, decimal64 stats were being treated like decimal128 (i.e. written in network byte order), when they should be treated like an int64_t. This PR fixes that and adds tests of decimal32 and decimal64 statistics.
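The byte-order difference described above can be sketched as follows. This is only an illustration, not the libcudf writer code: the helper names `encode_decimal64_stat` and `encode_decimal128_stat` are hypothetical, and the sketch assumes a compiler that provides `__int128_t` (GCC or Clang). It assumes the decimal64 statistic is serialized like a plain INT64 (little endian) and the decimal128 statistic as 16 big-endian two's-complement bytes.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: serialize the unscaled decimal64 value the same way
// an int64_t statistic would be written, least significant byte first.
std::array<std::uint8_t, 8> encode_decimal64_stat(std::int64_t unscaled)
{
  std::array<std::uint8_t, 8> out{};
  auto const bits = static_cast<std::uint64_t>(unscaled);
  for (std::size_t i = 0; i < out.size(); ++i) {
    out[i] = static_cast<std::uint8_t>(bits >> (8 * i));
  }
  return out;
}

// Hypothetical helper: serialize the unscaled decimal128 value in network
// byte order (big endian), most significant byte first.
std::array<std::uint8_t, 16> encode_decimal128_stat(__int128_t unscaled)
{
  std::array<std::uint8_t, 16> out{};
  auto const bits = static_cast<__uint128_t>(unscaled);
  for (std::size_t i = 0; i < out.size(); ++i) {
    out[out.size() - 1 - i] = static_cast<std::uint8_t>(bits >> (8 * i));
  }
  return out;
}
```

For the unscaled value 12345 (0x3039), the first helper puts 0x39 in byte 0 and 0x30 in byte 1, while the second puts 0x30 in byte 14 and 0x39 in byte 15.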

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@etseidl etseidl requested a review from a team as a code owner March 12, 2024 21:55
@etseidl etseidl requested review from mythrocks and shrshi March 12, 2024 21:55
copy-pr-bot bot commented Mar 12, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Mar 12, 2024
@davidwendt davidwendt added bug Something isn't working 3 - Ready for Review Ready for review by team non-breaking Non-breaking change labels Mar 13, 2024
@davidwendt
Contributor

/ok to test

@etseidl
Contributor Author

etseidl commented Mar 15, 2024

@vuule I just looked at this comment. Could you take a look at this fix in that light? The problem is that decimal128 stats are to be written big endian, while decimal64 should be little endian. The current code will only write out the high 8 bytes of the __int128_t, so all decimal64 stats wind up being 0.
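One way to picture the symptom in the comment above is the sketch below. It is an assumption-laden illustration rather than the actual writer code: `high_half` and `low_half` are hypothetical names, and the sketch assumes the decimal64 value is first widened to an `__int128_t` (GCC or Clang) and that only one 8-byte half is then kept.

```cpp
#include <cstdint>

// Hypothetical illustration: widen a decimal64 value to 128 bits and keep
// only the high 8 bytes, which is what truncating a big-endian buffer does.
std::int64_t high_half(std::int64_t unscaled)
{
  auto const wide = static_cast<__uint128_t>(static_cast<__int128_t>(unscaled));
  return static_cast<std::int64_t>(static_cast<std::uint64_t>(wide >> 64));
}

// Hypothetical illustration: keep the low 8 bytes, which hold the value.
std::int64_t low_half(std::int64_t unscaled)
{
  auto const wide = static_cast<__uint128_t>(static_cast<__int128_t>(unscaled));
  return static_cast<std::int64_t>(static_cast<std::uint64_t>(wide));
}
```

Under these assumptions, `high_half(12345)` is 0 while `low_half(12345)` is 12345: for any non-negative value that fits in 64 bits, the high half contains only zero bits, so the statistic carries no information about the data.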

Contributor

@vuule vuule left a comment

Thank you for fixing my mess :)

@vuule
Contributor

vuule commented Mar 18, 2024

/ok to test

@vuule
Contributor

vuule commented Mar 18, 2024

/merge

@rapids-bot rapids-bot bot merged commit e435953 into rapidsai:branch-24.04 Mar 18, 2024
75 checks passed
@etseidl etseidl deleted the decimal_stats branch March 18, 2024 23:31
Labels
3 - Ready for Review Ready for review by team bug Something isn't working libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change

4 participants