Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve unparsing after optimize_projections optimization #13599

Conversation

sgrebnov
Copy link
Member

Which issue does this PR close?

Follow-up item for Support unparsing plans after applying the optimize_projections rule PR, focusing on the unparsing enhancement to support the described scenario.

With the optimize_projections optimization, columns are pushed down to the TableScan, resulting in both Projection and TableScan nodes containing projection information. As a result, the generated SQL contains a duplicate subquery, e.g., SELECT a, b FROM (SELECT a, b FROM my_table).

What changes are included in this PR?

PR updates unparse_table_scan_pushdown to prevent the creation of an additional projection if the unparser already includes a projection (SELECT) part.

Are these changes tested?

Yes, TPC-H, TPC-DS benchmark tests, unit tests.

Are there any user-facing changes?

Improved Plan-to-SQL conversion when a TableScan with projection information is wrapped in a Projection.

@github-actions github-actions bot added the sql SQL Planner label Nov 29, 2024
@sgrebnov sgrebnov force-pushed the sgrebnov/improve-unparsing-optimize-projections branch from c5cbff5 to 5ef7fb3 Compare November 29, 2024 07:01
@alamb
Copy link
Contributor

alamb commented Dec 2, 2024

@phillipleblanc do you potentially have time to review this PR?

Copy link
Contributor

@goldmedal goldmedal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sgrebnov it's a nice improvement. Looks good to me 👍

Copy link
Contributor

@phillipleblanc phillipleblanc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes make sense to me.

It basically leverages the fact that since we know we already have a projection, the column projection information in the table scan is redundant - it would be an error for the table scan column projection to have fewer columns than the outer projection expects, and having more columns doesn't matter, since they get filtered by the outer projection anyway.

@goldmedal goldmedal merged commit a62033e into apache:main Dec 4, 2024
25 checks passed
@goldmedal
Copy link
Contributor

Thanks @sgrebnov and @alamb @phillipleblanc for reviewing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sql SQL Planner
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants