Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: LogicalPlan::TableScan now refers to table provider by name #2266

Closed
wants to merge 17 commits into from

Conversation

andygrove
Copy link
Member

@andygrove andygrove commented Apr 18, 2022

Which issue does this PR close?

Closes #2247.

Note that this is an early WIP and not functional yet.

I am putting up this PR in an early state to get feedback on the general approach.

Rationale for this change

The logical plan currently depends on the physical plan, which prevents us from moving the logical plan into its own crate alongside the logical expressions (which is needed in order to support subqueries, and also to better support the use case where DataFusion is being used just for SQL query planning and then using a different execution engine).

The reason that the logical plan depends on the physical plan is that LogicalPlan::TableScan has a reference to a TableProvider which has a scan method that returns ExecutionPlan.

What changes are included in this PR?

  • TableScan struct no longer has a direct reference to a TableProvider instance. The TableProvider is looked up when needed, based on table_name.
  • CatalogList is now passed to optimizer rules, allowing tables to be resolved during optimization
  • TableScan now contains full_filters, partial_filters and unsupported_filters so that we can still show them in fmt::Display without having access to the TableProvider
  • Fix Ballista serde
  • Update many tests to provide a valid CatalogList

Are there any user-facing changes?

Yes, this is an API change.

@andygrove andygrove added the api change Changes the API exposed to users of the crate label Apr 18, 2022
@github-actions github-actions bot added ballista datafusion Changes in the datafusion crate labels Apr 18, 2022
@matthewmturner
Copy link
Contributor

@andygrove would you be able to provide color on the use case for passing CatalogList to optimizer rules / resolving tables during optimization?

@andygrove
Copy link
Member Author

This was too ambitious. Replace with simpler solution in #2284

@andygrove andygrove closed this Apr 20, 2022
@andygrove andygrove deleted the table-scan-provider-name branch January 27, 2023 18:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api change Changes the API exposed to users of the crate datafusion Changes in the datafusion crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

LogicalPlan::TableScan should not depend on the physical plan
2 participants