@mnorfolk03 added planning benchmarks for more sophisticated queries in #13085 ❤️
The benchmarks are in https://github.com/apache/datafusion/blob/main/datafusion/core/benches/sql_planner.rs
However, the planning benchmarks we have now don't reflect querying an actual data source such as Parquet (they query an empty in-memory table).
One improvement would be to add benchmarks whose plans include a ParquetExec, as well as queries over tables with a known sort order, to reflect more real-world cases.
I would like planning benchmarks equivalent to planning against tables like these (docs here): https://datafusion.apache.org/user-guide/sql/ddl.html#create-external-table
CREATE EXTERNAL TABLE foo STORED AS PARQUET LOCATION '..'
CREATE EXTERNAL TABLE test ( c1 VARCHAR NOT NULL, c2 INT NOT NULL, c3 SMALLINT NOT NULL, c4 SMALLINT NOT NULL, c5 INT NOT NULL, c6 BIGINT NOT NULL, c7 SMALLINT NOT NULL, c8 INT NOT NULL, c9 BIGINT NOT NULL, c10 VARCHAR NOT NULL, c11 FLOAT NOT NULL, c12 DOUBLE NOT NULL, c13 VARCHAR NOT NULL ) STORED AS CSV WITH ORDER (c2 ASC, c5 + c8 DESC NULL FIRST) LOCATION '/path/to/aggregate_test_100.csv' OPTIONS ('has_header' 'true');
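As a rough sketch of how a benchmark's setup code might produce such statements, the helper below builds a `CREATE EXTERNAL TABLE` string with an optional `WITH ORDER` clause, following the clause order of the example above. The function name `external_table_ddl` and its parameters are hypothetical and are not part of DataFusion or the existing `sql_planner.rs` benchmark. It only assembles the string; whether DataFusion accepts a particular combination (for example `WITH ORDER` without an explicit column list) is not checked here.

```rust
/// Hypothetical helper: build a `CREATE EXTERNAL TABLE` statement for
/// benchmark setup. `sort_order` is the body of an optional
/// `WITH ORDER (...)` clause, e.g. `"c2 ASC"`.
fn external_table_ddl(
    name: &str,
    format: &str,
    location: &str,
    sort_order: Option<&str>,
) -> String {
    let mut ddl = format!("CREATE EXTERNAL TABLE {} STORED AS {}", name, format);
    if let Some(order) = sort_order {
        // Declares the sort order of the underlying files.
        ddl.push_str(&format!(" WITH ORDER ({})", order));
    }
    // Double any single quotes so the path stays a valid SQL string literal.
    let escaped = location.replace('\'', "''");
    ddl.push_str(&format!(" LOCATION '{}'", escaped));
    ddl
}
```

The resulting string would be executed once during setup (for instance through `SessionContext::sql`), so that the timed part of the benchmark covers only planning queries against the registered table. Calling it once with `None` and once with `Some("c2 ASC")` would give an unsorted and a sorted variant of the same table.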
One possibility could be to add a benchmark for planning the clickbench queries: https://github.com/apache/datafusion/tree/main/benchmarks/queries/clickbench
We could use the smaller hits.parquet file here: https://github.com/apache/datafusion/blob/main/datafusion/core/tests/data/clickbench_hits_10.parquet
take