Commit
feat(rust,python): cast each parquet file to delta schema (#2615)
# Description

By casting each record batch read from Parquet to the Delta schema, DataFusion can read tables whose underlying Parquet files can be cast to the desired schema.

Fixes:

- Errors querying data where some of the Parquet files lack columns that were added later through schema migration. This includes nested columns for structs that are in maps, lists, or children of other structs.
- Maps and lists written with different element names.
- Timestamps of different units.
- Any other cast supported by arrow-cast.

This is possible now because DataFusion exposes a SchemaAdapter that can be overridden. Note that this makes all timestamps read by delta-rs have microsecond precision, to match the Delta protocol.

# Related Issue(s)

- This makes solving #2478 and #2341 just a matter of adding code to the delta-rs cast.

---------

Co-authored-by: Alex Wilcoxson <[email protected]>
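
To make the idea of "casting each file's data to the table schema" concrete, here is a small plain-Python sketch. It is a conceptual model only, not the delta-rs, DataFusion, or arrow-cast implementation: the schema encoding (`dict` for a struct, one-element `list` for a list, strings for primitives), the `(amount, unit)` timestamp tuples, and the names `cast_value` and `TABLE_SCHEMA` are all assumptions made up for illustration. It shows missing fields, including nested ones, being filled with nulls and timestamps being normalized to microseconds.

```python
# Conceptual sketch only. Every name here is hypothetical; nothing below is
# an API of delta-rs, DataFusion, or arrow-cast.

# Multipliers that convert a timestamp in the given unit to microseconds.
_TO_MICROS = {"s": 1_000_000, "ms": 1_000, "us": 1}


def cast_value(value, target_type):
    """Cast one value to a target type described in the toy schema format."""
    if value is None:
        return None  # a missing column or field stays null
    if isinstance(target_type, dict):
        # Struct: walk the *target* fields, so fields absent from an older
        # file come back as None instead of raising an error.
        return {name: cast_value(value.get(name), t)
                for name, t in target_type.items()}
    if isinstance(target_type, list):
        # List: cast every element to the single element type.
        return [cast_value(item, target_type[0]) for item in value]
    if target_type == "timestamp_us":
        amount, unit = value
        if unit == "ns":
            return amount // 1_000  # truncate nanoseconds to microseconds
        return amount * _TO_MICROS[unit]
    if target_type == "int":
        return int(value)
    if target_type == "str":
        return str(value)
    raise ValueError(f"unsupported target type: {target_type!r}")


# Toy table schema: "info.tags" stands in for a field added by a later migration.
TABLE_SCHEMA = {
    "id": "int",
    "ts": "timestamp_us",
    "info": {"name": "str", "tags": ["str"]},
}

# A row from an older file: string id, millisecond timestamp, no "tags" field.
old_row = {"id": "1", "ts": (2, "ms"), "info": {"name": "a"}}
new_row = cast_value(old_row, TABLE_SCHEMA)
```

Driving the cast from the target schema rather than from the file's schema is what lets older files with fewer fields still produce rows of the expected shape.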