allocate enough bytes when writing booleans in parquet writer #658
Codecov Report
@@            Coverage Diff             @@
##           master     #658      +/-   ##
==========================================
- Coverage   82.50%   82.43%   -0.08%
==========================================
  Files         168      168
  Lines       47249    47265      +16
==========================================
- Hits        38984    38961      -23
- Misses       8265     8304      +39
Thank you @bjchambers 👍. This looks good to me.
I also double checked the test case covers the bug. Without the code changes, it fails like this:
---- arrow::arrow_writer::tests::bool_large_single_column stdout ----
thread 'arrow::arrow_writer::tests::bool_large_single_column' panicked at 'called `Result::unwrap()` on an `Err` value: EOF("unable to put boolean value")', parquet/src/arrow/arrow_writer.rs:1224:39
I would like at least one other person to review this change too (perhaps @sunchao or @nevi-me) before we merge it.
LGTM
* allocate enough bytes when writing booleans
* round up to nearest multiple of 256

Co-authored-by: Ben Chambers <[email protected]>
Which issue does this PR close?
Closes #657.
Rationale for this change
Without this change, writing a batch with more than 2048 boolean values may fail because the bit vector is not extended by enough bytes to hold them.
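
For readers following along, here is a minimal sketch of the sizing arithmetic described above. It is not the code from this PR. The function names `round_up_to_multiple_of_256` and `extra_bytes_needed` and their parameters are hypothetical. The sketch assumes one bit per boolean and a buffer that starts at 256 bytes, which would be consistent with the 2048-value threshold (256 × 8 = 2048 bits).

```rust
/// Round `n` up to the nearest multiple of 256 (hypothetical helper).
fn round_up_to_multiple_of_256(n: usize) -> usize {
    (n + 255) / 256 * 256
}

/// Number of extra bytes a bit buffer would need in order to hold
/// `num_values` more booleans at one bit each (hypothetical helper).
///
/// `capacity_bytes` is the current buffer size and `bytes_written`
/// is how much of it is already in use.
fn extra_bytes_needed(capacity_bytes: usize, bytes_written: usize, num_values: usize) -> usize {
    // Bits still free in the existing buffer.
    let bits_available = capacity_bytes.saturating_sub(bytes_written) * 8;
    if num_values <= bits_available {
        // Everything fits; no growth required.
        return 0;
    }
    // Bits that do not fit, converted to whole bytes (rounding up).
    let bits_needed = num_values - bits_available;
    let bytes_needed = (bits_needed + 7) / 8;
    // Grow in 256-byte steps, as the commit message describes.
    round_up_to_multiple_of_256(bytes_needed)
}
```

Under these assumptions, `extra_bytes_needed(256, 0, 2048)` is 0 because the values fit exactly, while `extra_bytes_needed(256, 0, 2049)` asks for one more 256-byte step. A batch of 5000 values leaves 2952 bits over, which is 369 bytes and rounds up to 512.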
What changes are included in this PR?
Are there any user-facing changes?
No