* #2061 support "PARTITIONED BY" in CreateExternalTable DDL for datafusion
* support table_partition_cols in ballista and add ParquetReadOptions
* fix a few usage of read_parquet
* fix CsvReadOption clone due to removing the copy trait
* fix "missing documentation for a struct field"
* fix a few usage of register_parquet
* Allow ParquetReadOption to receive parquet_pruning from execution::Context::SessionConfig
https://github.com/apache/arrow-datafusion/blob/73ea6e16f5c8f34526c01490a5ec277a68f33791/datafusion/tests/parquet_pruning.rs#L143
* fix benches import
* Apply suggestions from code review (lint)
Is your feature request related to a problem or challenge? Please describe what you are trying to do
Assume we have a data lake stored as files under table/, partitioned by year, month, and day.
Currently, CreateExternalTable supports defining columns and a location (e.g. table/):
https://github.com/apache/arrow-datafusion/blob/5936edc2a94d5fb20702a41eab2b80695961b9dc/datafusion/src/sql/parser.rs#L70-L81
A SQL query such as
select * from table where year = '2022' and month = '03' and day = '20'
seems to scan all files under table/.
Describe the solution you'd like
Support PARTITIONED BY in CreateExternalTable. As with the existing ListingOptions, PARTITIONED BY only supports String columns:
https://github.com/apache/arrow-datafusion/blob/5936edc2a94d5fb20702a41eab2b80695961b9dc/datafusion/src/datasource/listing/table.rs#L178
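To illustrate why declared partition columns matter for the query above, here is a minimal standalone sketch of partition pruning. It assumes a Hive-style key=value directory layout, which the issue does not confirm; the file names and the helpers parse_partitions and prune are hypothetical and are not part of DataFusion. Partition values are compared as strings, matching the String-only limitation.

```rust
use std::collections::HashMap;

/// Extract `key=value` pairs from a path, keeping only the declared
/// partition columns. Values stay as strings (assumed Hive-style layout).
fn parse_partitions<'a>(path: &'a str, partition_cols: &[&str]) -> HashMap<&'a str, &'a str> {
    path.split('/')
        .filter_map(|segment| segment.split_once('='))
        .filter(|(key, _)| partition_cols.contains(key))
        .collect()
}

/// Keep only the files whose partition values satisfy every equality filter.
fn prune<'a>(
    files: &[&'a str],
    partition_cols: &[&str],
    filters: &[(&str, &str)],
) -> Vec<&'a str> {
    files
        .iter()
        .copied()
        .filter(|path| {
            let parts = parse_partitions(path, partition_cols);
            filters
                .iter()
                .all(|&(col, want)| parts.get(col) == Some(&want))
        })
        .collect()
}

fn main() {
    // Hypothetical file listing under table/.
    let files = [
        "table/year=2022/month=03/day=20/part-0.parquet",
        "table/year=2022/month=03/day=21/part-0.parquet",
        "table/year=2021/month=12/day=31/part-0.parquet",
    ];
    let cols = ["year", "month", "day"];
    // Equivalent of: where year = '2022' and month = '03' and day = '20'
    let filters = [("year", "2022"), ("month", "03"), ("day", "20")];

    for path in prune(&files, &cols, &filters) {
        println!("scan {}", path);
    }
}
```

With the partition columns known, only the matching directory needs to be scanned; without them, the filter cannot be evaluated against paths and every file is a candidate.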
Describe alternatives you've considered
Additional context
PARTITIONED BY is also used in Trino and AWS Athena:
https://trino.io/episodes/5.html
https://docs.aws.amazon.com/athena/latest/ug/create-table.html
I notice that ListingOptions supports table_partition_cols and also partition pruning, but CreateExternalTable does not accept such input and pass it through:
https://github.com/apache/arrow-datafusion/blob/5936edc2a94d5fb20702a41eab2b80695961b9dc/datafusion/src/datasource/listing/table.rs#L165-L186
https://github.com/apache/arrow-datafusion/blob/5936edc2a94d5fb20702a41eab2b80695961b9dc/datafusion/src/datasource/listing/table.rs#L358-L365
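The missing pass-through could look roughly like the sketch below. The structs CreateExternalTableSketch and ListingOptionsSketch and the function listing_options_from are hypothetical stand-ins written for this illustration; they are not DataFusion's actual definitions, and only the field name table_partition_cols is taken from the issue.

```rust
/// Hypothetical stand-in for the parsed DDL statement (not DataFusion's type).
#[derive(Debug, Clone)]
struct CreateExternalTableSketch {
    name: String,
    location: String,
    /// Columns listed in PARTITIONED BY; String-typed only.
    table_partition_cols: Vec<String>,
}

/// Hypothetical stand-in for the listing options (not DataFusion's type).
#[derive(Debug, Clone, PartialEq)]
struct ListingOptionsSketch {
    file_extension: String,
    table_partition_cols: Vec<String>,
}

/// Carry the partition columns from the DDL into the listing options,
/// so that pruning can later be applied to the file listing.
fn listing_options_from(
    ddl: &CreateExternalTableSketch,
    file_extension: &str,
) -> ListingOptionsSketch {
    ListingOptionsSketch {
        file_extension: file_extension.to_string(),
        table_partition_cols: ddl.table_partition_cols.clone(),
    }
}

fn main() {
    let ddl = CreateExternalTableSketch {
        name: "table".to_string(),
        location: "table/".to_string(),
        table_partition_cols: vec!["year".into(), "month".into(), "day".into()],
    };
    let options = listing_options_from(&ddl, ".parquet");
    println!("{} at {} -> {:?}", ddl.name, ddl.location, options);
}
```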