-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
allow to override Base(Large)RepeatedValueViewVector #488
base: main
Are you sure you want to change the base?
Conversation
This is done as different arrow implementation have different default list item field name For example, Rust implementation of Arrow (`arrow-rs`) has default value of `item` (https://github.com/apache/arrow-rs/blob/4f1f6e57c568fae8233ab9da7d7c7acdaea4112a/arrow-schema/src/field.rs#L125). And according to the specification, the name of the list is not required to be a specific string. so to make it easier to use Java Arrow implementation with other implementation of Arrow, I'm making the `DATA_VECTOR_NAME` configurable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it'd make more sense to allow changing defaultDataVectorName
for an instance? I'm not a big fan of mutable globals
Changing |
The child vector is created lazily, though |
But it can still be too late, is there a way to change a field (or field name) of existing list? |
I don't see how a mutable global lets you change the name of an existing list either, though. Once the child is created I think you'd have to recreate the child instead of being able to mutate the name. |
Do you have a sample to show the scenario you are concerned about? |
This is done as different arrow implementation have different default list item field name
For example, Rust implementation of Arrow (
arrow-rs
) has default value ofitem
(https://github.com/apache/arrow-rs/blob/4f1f6e57c568fae8233ab9da7d7c7acdaea4112a/arrow-schema/src/field.rs#L125).And according to the specification, the name of the list is not required to be a specific string. so to make it easier to use Java Arrow implementation with other implementation of Arrow, I'm making the
DATA_VECTOR_NAME
configurableI will add tests after I know this change is desired and will be approved
Another solution that is good for me, is to allow changing the field name of existing list (as sometimes I'm not the one that creates the list but all the lists are coming through me, currently it is not possible as far as I can tell.
the only place I see in the codebase that do something like that is in:
arrow-java/vector/src/test/java/org/apache/arrow/vector/pojo/TestConvert.java
Line 97 in be627bf
which convert the schema to json, change the field name and create the schema again (making sure the new field name is used)