Support to skip schema inference #45
looks good, thanks!
@samansmink - Hi, I downloaded the DuckDB nightly and didn't find this feature (skip_schema_inference).
@harel-e are you sure? For me it works:
force install iceberg from 'http://nightly-extensions.duckdb.org';
load iceberg;
FROM iceberg_metadata("my_iceberg_table", skip_schema_inference = true);
@samansmink - I wasn't aware of force install, but it still failed using the nightly build binary:
./duckdb
D force install iceberg from 'http://nightly-extensions.duckdb.org';
Are you using a development build? In this case, extensions might not (yet) be uploaded.
@harel-e yeah, we don't have good update semantics for extensions (yet). Force installing overrides your current installation with whatever you provide; otherwise DuckDB will not update, thinking that iceberg is already installed.
That's a bit quirky at the moment: we distribute nightly binaries of extensions that target the latest stable release of DuckDB, and we distribute nightly binaries of DuckDB with stable versions of extensions. But we do not automatically distribute nightly extensions for nightly binaries of DuckDB, so these can sometimes lag behind. I will bump the iceberg extension in DuckDB main, which should resolve this.
@samansmink - Thank you for making this change available in the extensions.
Support to skip schema inference
The current version does not support parsing complex data types when inferring the schema from the snapshot.
Until support for complex data types arrives, I am introducing a flag, skip_schema_inference, that skips this flow and offloads schema parsing to the underlying parquet extension. Here is how to use it:
scan data:
scan metadata:
scan snapshots:
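For reference, a minimal sketch of how the flag might be passed in each of the three cases above. It assumes the iceberg extension is installed and that iceberg_scan and iceberg_snapshots accept skip_schema_inference the same way iceberg_metadata does in the thread; 'my_iceberg_table' is a placeholder path, not a real table.

```sql
-- Sketch only: 'my_iceberg_table' is a hypothetical table location.
INSTALL iceberg;
LOAD iceberg;

-- scan data: schema parsing is left to the parquet reader
SELECT count(*)
FROM iceberg_scan('my_iceberg_table', skip_schema_inference = true);

-- scan metadata: same flag as in the conversation above
SELECT *
FROM iceberg_metadata('my_iceberg_table', skip_schema_inference = true);

-- scan snapshots: assumes the flag is accepted here as well
SELECT *
FROM iceberg_snapshots('my_iceberg_table', skip_schema_inference = true);
```

If a build does not recognize the named parameter, the installed extension is probably older than this change; see the force install discussion above.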
Note - I am closing an earlier PR (#43) that requested these changes but was a bit complex to understand.