feat(ipc): Support writing dictionaries nested in structs and unions #870
Dictionaries nested inside other types are lost when serializing a RecordBatch for IPC, producing invalid Arrow data. This PR changes encoded_batch to recursively find all dictionary fields within the schema (currently only in structs and unions) so nested dictionaries are properly serialized.
Codecov Report
@@            Coverage Diff             @@
##           master     #870      +/-   ##
==========================================
- Coverage   82.45%   82.43%   -0.02%
==========================================
  Files         168      168
  Lines       48150    48206      +56
==========================================
+ Hits        39700    39741      +41
- Misses       8450     8465      +15
if let DataType::Dictionary(_key_type, _value_type) = column.data_type() {
) -> Result<()> {
// TODO: Handle other nested types (map, list, etc)
might be worth filing another tracking ticket if you have time
…870) (#915) * feat(ipc): Support for writing dictionaries nested in structs and unions Dictionaries are lost when serializing a RecordBatch for IPC, producing invalid arrow data. This PR changes encoded_batch to recursively find all dictionary fields within the schema (currently only in structs and unions) so nested dictionaries are properly serialized. * address lint and clippy Co-authored-by: Helgi Kristvin Sigurbjarnarson <[email protected]>
Which issue does this PR close?
Part of #846
Rationale for this change
Dictionaries nested inside other types are lost when serializing a `RecordBatch` for IPC, producing invalid Arrow data. This PR changes `encoded_batch` to recursively find all dictionary fields within the schema (currently only in structs and unions) so nested dictionaries are properly serialized.
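
To illustrate the recursive lookup described above, here is a minimal, self-contained sketch. It does not use the arrow crate: `DataType`, `Field`, `collect_dictionary_paths`, `dictionary_paths`, and `example_schema` are simplified, hypothetical stand-ins invented for this example, and the real `encoded_batch` change may be structured differently. The sketch only shows the traversal idea: descend into struct and union children, record every dictionary field, and skip other nested types, which the TODO in the diff leaves for later.

```rust
// Simplified stand-ins for schema types. These are assumptions made for
// illustration and are NOT the arrow crate's actual definitions.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Utf8,
    // Dictionary-encoded column; only the value type is tracked here.
    Dictionary(Box<DataType>),
    Struct(Vec<Field>),
    Union(Vec<Field>),
    // Present in the model but deliberately not traversed below.
    List(Box<Field>),
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
}

impl Field {
    fn new(name: &str, data_type: DataType) -> Self {
        Field { name: name.to_string(), data_type }
    }
}

/// Appends the dotted path of every dictionary field reachable from `field`
/// through struct and union children.
fn collect_dictionary_paths(field: &Field, prefix: &str, out: &mut Vec<String>) {
    let path = if prefix.is_empty() {
        field.name.clone()
    } else {
        format!("{}.{}", prefix, field.name)
    };
    match &field.data_type {
        DataType::Dictionary(_) => out.push(path),
        DataType::Struct(children) | DataType::Union(children) => {
            for child in children {
                collect_dictionary_paths(child, &path, out);
            }
        }
        // Lists, maps, and other nested types are skipped in this sketch.
        _ => {}
    }
}

/// Walks every top-level field of a schema and returns all dictionary paths.
fn dictionary_paths(schema: &[Field]) -> Vec<String> {
    let mut out = Vec::new();
    for field in schema {
        collect_dictionary_paths(field, "", &mut out);
    }
    out
}

/// Hypothetical schema with top-level, struct-nested, union-nested, and
/// list-nested dictionary fields.
fn example_schema() -> Vec<Field> {
    let dict = || DataType::Dictionary(Box::new(DataType::Utf8));
    vec![
        Field::new("id", DataType::Int32),
        Field::new("tag", dict()),
        Field::new(
            "point",
            DataType::Struct(vec![
                Field::new("x", DataType::Int32),
                Field::new("label", dict()),
            ]),
        ),
        Field::new("choice", DataType::Union(vec![Field::new("code", dict())])),
        Field::new("items", DataType::List(Box::new(Field::new("item", dict())))),
    ]
}
```

Calling `dictionary_paths(&example_schema())` in this sketch yields `tag`, `point.label`, and `choice.code`. The dictionary under `items` is not reported because list children are not visited, which matches the "currently only in structs and unions" limitation stated above.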