arrow-ipc: Default to not preserving dict IDs #6788
The integration tests seem to be legitimately failing on this.
Is this the appropriate docs page to look at for trying to reproduce this locally? https://github.com/apache/arrow-rs/tree/main/arrow-integration-testing
So, diving in, it looks like ipc and c-data work fine; it's just flight that fails. Surprisingly, even rust-to-rust seems broken, which is what I'm going to start with, by adding more tests to arrow-flight.
Force-pushed: 63a1952 to 75ceb39
Previously, the integration tests forced preserving dict IDs in some places and used the default in others. This worked because preserving dict IDs used to be the default, but it no longer is.
Force-pushed: 75ceb39 to 206f7f4
Makes sense to me. I've labelled it an API change so that it is rendered as a breaking change in the changelog.
This makes sense to me too.
Thank you @brancz 🙏
Thank you!!
Which issue does this PR close?
Related to #5981
Rationale for this change
This is the first step towards removing the `dict_id` field as discussed in #5981. With this patch, the default behavior changes to what it will be once the field is fully removed.
The previous behavior can still be restored by passing `with_preserve_dict_id(true)`; however, doing so is now deprecated, and the option will be removed together with `dict_id` in the next (March) DataFusion release.
What changes are included in this PR?
Default to not preserving the dict ID from the schema field `dict_id`.
Are there any user-facing changes?
Not a breaking change to an API, but the default behavior changes.
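To make the behavior change concrete, here is a minimal, std-only sketch of the two modes. It is a simplified model, not the arrow-ipc implementation: `DictField`, `IdTracker`, and `assign_ids` are hypothetical names invented for illustration, and the sketch assumes that when IDs are not preserved, the writer hands out sequential IDs in the order fields are encountered.

```rust
/// Hypothetical stand-in for a dictionary-encoded schema field.
#[derive(Debug, Clone)]
pub struct DictField {
    pub name: String,
    /// The ID stored on the schema field (what `dict_id` refers to in this PR).
    pub schema_dict_id: i64,
}

/// Hypothetical tracker that decides which ID a dictionary is written with.
#[derive(Debug)]
pub struct IdTracker {
    preserve_dict_id: bool,
    assigned: Vec<i64>,
}

impl IdTracker {
    pub fn new(preserve_dict_id: bool) -> Self {
        Self { preserve_dict_id, assigned: Vec::new() }
    }

    /// Returns the ID to use for `field` and records it.
    pub fn next_id(&mut self, field: &DictField) -> i64 {
        let id = if self.preserve_dict_id {
            // Old default: reuse whatever ID the schema field carries.
            field.schema_dict_id
        } else {
            // New default (assumed model): ignore the field's ID and
            // assign the next sequential one, starting at 0.
            self.assigned.last().map_or(0, |last| last + 1)
        };
        self.assigned.push(id);
        id
    }
}

/// Assigns IDs to all fields in order, using a fresh tracker.
pub fn assign_ids(fields: &[DictField], preserve_dict_id: bool) -> Vec<i64> {
    let mut tracker = IdTracker::new(preserve_dict_id);
    fields.iter().map(|f| tracker.next_id(f)).collect()
}
```

In this model, two fields carrying schema IDs 42 and 7 are written with IDs 0 and 1 under the new default, and with 42 and 7 when preservation is requested. That also illustrates why mixing the two modes between writer and reader, as the integration tests did, can lead to mismatched dictionary IDs.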
@tustvold @alamb