Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support for non-correlated subqueries #3287

Merged
merged 4 commits into from
Aug 30, 2022

Conversation

kmitchener
Copy link
Contributor

Which issue does this PR close?

Closes #3266.

Rationale for this change

What changes are included in this PR?

Handle case where subquery doesn't have a filter.

Are there any user-facing changes?

@github-actions github-actions bot added the optimizer Optimizer rules label Aug 29, 2022
@avantgardnerio
Copy link
Contributor

Thanks for looking at this! It's good to have another person touching this code. I'm happy to help in any way if I can!

@kmitchener kmitchener marked this pull request as ready for review August 29, 2022 18:39
@kmitchener
Copy link
Contributor Author

@andygrove PTAL

@kmitchener
Copy link
Contributor Author

I tested the SQL in #3266 and that works locally. As well as the query in TPCH 15.

@avantgardnerio
Copy link
Contributor

@@ -702,4 +727,31 @@ mod tests {
assert_optimized_plan_eq(&DecorrelateScalarSubquery::new(), &plan, expected);
Ok(())
}

/// Test for non-correlated scalar subquery with no filters
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we rename this if it also does uncorrelated subqueries now? Maybe scalar_subquery_to_join?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean rename the whole rule? Probably so .. good idea. You want to do that as part of this issue?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to, but we probably should. Maybe we can just doc it for now with a TODO and address it in a future PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It touches a bunch of code, but yes, your suggested name is much more clear. I'll do a separate PR for it this week if this gets merged in.


/// Test for non-correlated scalar subquery with no filters
#[test]
fn scalar_subquery_non_correlated_no_filters() -> Result<()> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like 1 branch was added, and 1 test. LGTM!

Copy link
Contributor

@avantgardnerio avantgardnerio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unwrap is gone, a test was added for the additional code path... LGTM!

@codecov-commenter
Copy link

Codecov Report

Merging #3287 (f395f77) into master (7aed4d6) will increase coverage by 0.00%.
The diff coverage is 95.23%.

@@           Coverage Diff           @@
##           master    #3287   +/-   ##
=======================================
  Coverage   85.92%   85.93%           
=======================================
  Files         294      294           
  Lines       53469    53483   +14     
=======================================
+ Hits        45945    45960   +15     
+ Misses       7524     7523    -1     
Impacted Files Coverage Δ
datafusion/optimizer/src/utils.rs 92.17% <ø> (ø)
...usion/optimizer/src/decorrelate_scalar_subquery.rs 93.07% <95.23%> (+0.35%) ⬆️
datafusion/expr/src/logical_plan/plan.rs 78.73% <0.00%> (ø)
datafusion/expr/src/window_frame.rs 93.27% <0.00%> (+0.84%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Copy link
Member

@andygrove andygrove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kmitchener.

Changes look reasonable to me, but I am mostly approving based on the review from @avantgardnerio, who wrote the original code. Thanks for the review @avantgardnerio!

@andygrove andygrove merged commit fb2f0db into apache:master Aug 30, 2022
@ursabot
Copy link

ursabot commented Aug 30, 2022

Benchmark runs are scheduled for baseline = 256ea91 and contender = fb2f0db. fb2f0db is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
optimizer Optimizer rules
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for non-correlated subqueries
5 participants