Fix decimal repr in parquet schema printer #721
Codecov Report
@@ Coverage Diff @@
## master #721 +/- ##
==========================================
+ Coverage 82.50% 82.51% +0.01%
==========================================
Files 168 168
Lines 47589 47597 +8
==========================================
+ Hits 39261 39273 +12
+ Misses 8328 8324 -4
good catch, this is where named argument passing would be really helpful >_<
.with_repetition(Repetition::OPTIONAL)
.build()
.unwrap(),
"OPTIONAL FIXED_LEN_BYTE_ARRAY (9) decimal (DECIMAL(19,4));",
In case anyone is interested, I double checked that test fails without the code change in this PR (unsurprisingly):
---- schema::printer::tests::test_print_flba_logical_types stdout ----
thread 'schema::printer::tests::test_print_flba_logical_types' panicked at 'called `Result::unwrap()` on an `Err` value: General("Cannot represent FIXED_LEN_BYTE_ARRAY as DECIMAL with length 8 and precision 19. The max precision can only be 18")', parquet/src/schema/printer.rs:644:18
Fixes #713 Co-authored-by: Sergii Mikhtoniuk <[email protected]>
Which issue does this PR close?
Closes #713.
Rationale for this change
Formatting of `DECIMAL` in schemas where only `ConvertedType` is present and not `LogicalType` was broken. This is the case for parquet files produced by Spark.
What changes are included in this PR?
I also had to update the `decimal_length_from_precision` function, as the math there seemed off. When used as `.with_length(decimal_length_from_precision(19))` it resulted in a panic.
Are there any user-facing changes?
`print_schema` will no longer produce malformed results for decimals.
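
For reference, the length/precision relationship behind the `decimal_length_from_precision` fix can be sketched with exact integer arithmetic. This is only an illustration, not the code in this PR: the helper name `min_flba_len_for_precision` is hypothetical, and it assumes the decimal is stored as a signed two's-complement integer, so `n` bytes can hold a given precision only if `2^(8n - 1) >= 10^precision`.

```rust
/// Hypothetical helper (not the parquet crate's function): returns the
/// smallest byte length `n` such that an `n`-byte signed two's-complement
/// integer can hold every decimal with `precision` digits, i.e. the
/// smallest `n` with 2^(8n - 1) >= 10^precision.
///
/// Returns `None` for precision 0 or for precisions above 38, where
/// 10^precision no longer fits in a `u128`.
fn min_flba_len_for_precision(precision: u32) -> Option<u32> {
    if precision == 0 || precision > 38 {
        return None;
    }
    // Largest magnitude bound the byte array has to cover.
    let limit = 10u128.pow(precision);
    // Try 1..=16 bytes; the shift is at most 127, so it cannot overflow.
    (1..=16u32).find(|n| (1u128 << (8 * n - 1)) >= limit)
}
```

Under these assumptions, precision 18 fits in 8 bytes while precision 19 needs 9, which agrees with both the panic message quoted above and the `FIXED_LEN_BYTE_ARRAY (9)` expectation in the new test.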