ProjectionExec Loses Field Metadata #1361

Closed
tustvold opened this issue Nov 25, 2021 · 2 comments · Fixed by #1378
Labels
bug Something isn't working datafusion Changes in the datafusion crate

Comments

@tustvold
Contributor

Describe the bug

ProjectionExec does not preserve field metadata when generating the projected schema. I believe this will also cause it to lose dictionary information for the field.

I think fixing this will require extending PhysicalExpr with a method such as fn field(&self, input_schema: &Schema) -> Result<Field>; to replace the current logic, which just constructs a new Field.

To Reproduce

Create a RecordBatch with a schema with field metadata and pass it through ProjectionExec.
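The reproduction can be illustrated without pulling in Arrow or DataFusion. The sketch below is an assumption-laden model, not the actual ProjectionExec code: `Field`, `Schema`, `project_lossy`, `tagged_schema`, and the `column_type` metadata key are all hypothetical stand-ins. It only shows the shape of the problem, where each projected field is rebuilt from its name, type, and nullability, so any metadata on the input field is dropped.

```rust
use std::collections::BTreeMap;

// Simplified stand-ins for Arrow's Field and Schema (hypothetical, not the real types).
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub data_type: String,
    pub nullable: bool,
    pub metadata: Option<BTreeMap<String, String>>,
}

impl Field {
    /// Builds a field with no metadata, like constructing a brand-new field.
    pub fn new(name: &str, data_type: &str, nullable: bool) -> Self {
        Field {
            name: name.to_string(),
            data_type: data_type.to_string(),
            nullable,
            metadata: None,
        }
    }
}

#[derive(Debug, Clone, PartialEq)]
pub struct Schema {
    pub fields: Vec<Field>,
}

/// Models the lossy behaviour described in the issue: every output field is
/// rebuilt from scratch, so metadata on the input field never carries over.
pub fn project_lossy(input: &Schema, indices: &[usize]) -> Schema {
    let fields = indices
        .iter()
        .map(|&i| {
            let f = &input.fields[i];
            Field::new(&f.name, &f.data_type, f.nullable)
        })
        .collect();
    Schema { fields }
}

/// Builds a one-column schema whose field carries metadata.
/// The key and value are made up for illustration.
pub fn tagged_schema() -> Schema {
    let mut metadata = BTreeMap::new();
    metadata.insert("column_type".to_string(), "tag".to_string());
    let mut host = Field::new("host", "Utf8", true);
    host.metadata = Some(metadata);
    Schema { fields: vec![host] }
}
```

Calling `project_lossy(&tagged_schema(), &[0])` yields a `host` field with the same name and type but with `metadata` set to `None`, which is the behaviour being reported.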

Expected behavior

There is potentially a discussion to be had w.r.t. schema-level metadata, but I would expect field-level metadata to be preserved for the fields that are projected.

@tustvold tustvold added the bug Something isn't working label Nov 25, 2021
@hntd187
Contributor

hntd187 commented Nov 28, 2021

So I was looking into this. How do you expect things like BinaryExpr to work here? Do you expect them to merge the column metadata?

@tustvold
Contributor Author

tustvold commented Nov 28, 2021

I guess I was expecting that the Column expression would preserve field metadata, and all other expressions would continue to do what they currently do. Part of the challenge is that ProjectionExec can be used for more than just basic column projection, because it is formulated in terms of expressions. This isn't necessarily a problem, but I'd just like to be able to project a schema without losing field metadata, since within IOx we use it for additional field type information.
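One way to picture that split is a `field` method with a default body that keeps today's behaviour, overridden only by the column expression. This is a minimal sketch under stated assumptions, not DataFusion's actual trait or the fix that was eventually merged: the `Field` and `Schema` structs are simplified stand-ins, errors are plain `String`s, and `Add`, `project`, and `example_schema` are hypothetical names standing in for a binary expression and the projection step.

```rust
use std::collections::BTreeMap;

// Simplified stand-ins for Arrow's Field and Schema (hypothetical).
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub data_type: String,
    pub nullable: bool,
    pub metadata: Option<BTreeMap<String, String>>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Schema {
    pub fields: Vec<Field>,
}

/// Cut-down model of a physical expression; errors are plain strings here.
pub trait PhysicalExpr {
    fn name(&self) -> String;
    fn data_type(&self, input_schema: &Schema) -> Result<String, String>;
    fn nullable(&self, input_schema: &Schema) -> Result<bool, String>;

    /// Default: construct a fresh field without metadata, i.e. the existing behaviour.
    fn field(&self, input_schema: &Schema) -> Result<Field, String> {
        Ok(Field {
            name: self.name(),
            data_type: self.data_type(input_schema)?,
            nullable: self.nullable(input_schema)?,
            metadata: None,
        })
    }
}

/// A plain column reference by position.
pub struct Column {
    pub name: String,
    pub index: usize,
}

impl Column {
    fn lookup<'a>(&self, schema: &'a Schema) -> Result<&'a Field, String> {
        schema
            .fields
            .get(self.index)
            .ok_or_else(|| format!("no column at index {}", self.index))
    }
}

impl PhysicalExpr for Column {
    fn name(&self) -> String {
        self.name.clone()
    }
    fn data_type(&self, input_schema: &Schema) -> Result<String, String> {
        Ok(self.lookup(input_schema)?.data_type.clone())
    }
    fn nullable(&self, input_schema: &Schema) -> Result<bool, String> {
        Ok(self.lookup(input_schema)?.nullable)
    }
    /// Override: hand back the input field unchanged, metadata included.
    fn field(&self, input_schema: &Schema) -> Result<Field, String> {
        Ok(self.lookup(input_schema)?.clone())
    }
}

/// Hypothetical binary expression; it relies on the default `field`,
/// so its output carries no metadata and nothing is merged.
pub struct Add {
    pub left: Box<dyn PhysicalExpr>,
    pub right: Box<dyn PhysicalExpr>,
}

impl PhysicalExpr for Add {
    fn name(&self) -> String {
        format!("{} + {}", self.left.name(), self.right.name())
    }
    fn data_type(&self, input_schema: &Schema) -> Result<String, String> {
        self.left.data_type(input_schema)
    }
    fn nullable(&self, input_schema: &Schema) -> Result<bool, String> {
        Ok(self.left.nullable(input_schema)? || self.right.nullable(input_schema)?)
    }
}

/// Builds the projected schema by asking each expression for its output field.
pub fn project(exprs: &[Box<dyn PhysicalExpr>], input: &Schema) -> Result<Schema, String> {
    let fields = exprs
        .iter()
        .map(|e| e.field(input))
        .collect::<Result<Vec<_>, _>>()?;
    Ok(Schema { fields })
}

/// Two Int64 columns; only "a" carries (made-up) metadata.
pub fn example_schema() -> Schema {
    let mut metadata = BTreeMap::new();
    metadata.insert("column_type".to_string(), "field".to_string());
    let a = Field {
        name: "a".to_string(),
        data_type: "Int64".to_string(),
        nullable: false,
        metadata: Some(metadata),
    };
    let b = Field {
        name: "b".to_string(),
        data_type: "Int64".to_string(),
        nullable: true,
        metadata: None,
    };
    Schema { fields: vec![a, b] }
}
```

In this model, projecting `Column` "a" returns the input field with its metadata intact, while projecting `Add` of "a" and "b" produces a new field named "a + b" with no metadata. That sidesteps the merge question for compound expressions by leaving their behaviour as it is.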

3 participants