Add line that prevents display_name from being called on Wildcard #4682

andre-cc-natzka · 2022-12-20T18:07:41Z

Which issue does this PR close?

Closes #4681.

Rationale for this change

As detailed in the issue, there was a problem with create_name() being called on Expr::Wildcard in the type_coercion and unwrap_cast_in_comparison optimization rules, which eventually led to these optimization rules not being applied.

What changes are included in this PR?

In the issue description, two different solutions were suggested. Here, I chose the first one: add one line to the name_for_alias function so that the Expr::Wildcard case is handled separately and the expression gets a reasonable name (e.g. "*") instead of going through display_name().
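As a rough illustration of this first option, here is a minimal, self-contained sketch. It is a toy model, not the actual DataFusion code: the reduced `Expr` enum, the `String` error type, and the error message are all assumptions made for the example. Only the shape of `name_for_alias` mirrors the diff in this PR.

```rust
// Toy stand-in for DataFusion's Expr; the real enum has many more variants.
#[derive(Debug)]
enum Expr {
    Column(String),
    Sort { expr: Box<Expr>, asc: bool },
    Wildcard,
}

impl Expr {
    // Simplified model of display_name before the fix: Wildcard has no name.
    // The error text is hypothetical.
    fn display_name(&self) -> Result<String, String> {
        match self {
            Expr::Column(name) => Ok(name.clone()),
            Expr::Sort { expr, asc } => {
                let dir = if *asc { "ASC" } else { "DESC" };
                Ok(format!("{} {}", expr.display_name()?, dir))
            }
            Expr::Wildcard => Err("wildcard has no display name".to_string()),
        }
    }
}

// Option 1: special-case Wildcard here so display_name is never reached for it.
fn name_for_alias(expr: &Expr) -> Result<String, String> {
    match expr {
        Expr::Sort { expr, .. } => name_for_alias(expr),
        Expr::Wildcard => Ok("*".to_string()),
        expr => expr.display_name(),
    }
}

fn main() {
    let sort = Expr::Sort {
        expr: Box::new(Expr::Column("a".to_string())),
        asc: true,
    };
    println!("{:?}", name_for_alias(&Expr::Wildcard));
    println!("{:?}", name_for_alias(&sort));
    println!("{:?}", Expr::Wildcard.display_name());
}
```

In this sketch, calling `display_name()` directly on a wildcard still returns an error; only callers that go through `name_for_alias` are protected.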

Are these changes tested?

No tests were added in this PR, but we tested the suggested change ourselves and it fixes the bug: the warnings are gone and the optimization rules are applied as intended.

If tests are not included in your PR, please explain why (for example, are they covered by existing tests)?

We still want to hear what you think about the best way to solve the issue.

If there are user-facing changes then we may require documentation to be updated before approving the PR.

No.

If there are any breaking changes to public APIs, please add the `api change` label.

No.

alamb

Thanks @andre-cc-natzka

I think this change is fine and we could merge it as is. However, I suggest we update Expr::display_name instead.

alamb · 2022-12-20T19:01:00Z

datafusion/optimizer/src/utils.rs

@@ -583,6 +583,7 @@ where
 fn name_for_alias(expr: &Expr) -> Result<String> {
    match expr {
        Expr::Sort { expr, .. } => name_for_alias(expr),
+        Expr::Wildcard => Ok("*".to_string()),
        expr => expr.display_name(),


As you note in your PR description, another way to fix this would be to support Expr::Wildcard in Expr::display_name.

Unless there is some reason not to support Wildcard in Expr::display_name, I would personally prefer adding the support there. It seems strange to put code for Wildcard in a function that seems to be trying to unwrap Expr::Sort 🤔

https://github.com/apache/arrow-datafusion/blob/fddb3d3651041f41d66a801f10e27387e84374f7/datafusion/expr/src/expr.rs#L1358-L1360

Thank you very much @alamb! I agree with you, so I've just moved that line to the create_name function called by Expr::display_name. If this is fine with you, I guess we can merge it.
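For reference, the revised approach can be sketched as follows. This is again a toy model under stated assumptions, not the real DataFusion source: the reduced `Expr` enum and the `String` error type are made up for the example, and the real `create_name` handles many more variants. The point is only that the wildcard arm now lives in `create_name`, so `name_for_alias` keeps its original shape.

```rust
// Toy stand-in for DataFusion's Expr; the real enum has many more variants.
#[derive(Debug)]
enum Expr {
    Column(String),
    Sort { expr: Box<Expr>, asc: bool },
    Wildcard,
}

// Simplified model of create_name with the wildcard arm added.
fn create_name(e: &Expr) -> Result<String, String> {
    match e {
        Expr::Column(name) => Ok(name.clone()),
        Expr::Sort { expr, asc } => {
            let dir = if *asc { "ASC" } else { "DESC" };
            Ok(format!("{} {}", create_name(expr)?, dir))
        }
        // The line moved here from name_for_alias.
        Expr::Wildcard => Ok("*".to_string()),
    }
}

impl Expr {
    // display_name simply delegates to create_name in this sketch.
    fn display_name(&self) -> Result<String, String> {
        create_name(self)
    }
}

// Unchanged: no wildcard-specific arm is needed any more.
fn name_for_alias(expr: &Expr) -> Result<String, String> {
    match expr {
        Expr::Sort { expr, .. } => name_for_alias(expr),
        expr => expr.display_name(),
    }
}

fn main() {
    let sort = Expr::Sort {
        expr: Box::new(Expr::Wildcard),
        asc: false,
    };
    println!("{:?}", Expr::Wildcard.display_name());
    println!("{:?}", name_for_alias(&sort));
    println!("{:?}", sort.display_name());
}
```

With the arm in `create_name`, every caller of `display_name` gets a name for a wildcard, not only callers that go through `name_for_alias`.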

ursabot · 2022-12-21T20:21:53Z

Benchmark runs are scheduled for baseline = ac2e5d1 and contender = bfef105. bfef105 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

andre-cc-natzka · 2022-12-22T10:17:10Z

Thanks for taking care of this! Have a Merry Christmas :)

Add line that prevents display_name from being called on Wildcard

92370c4

github-actions bot added the optimizer Optimizer rules label Dec 20, 2022

alamb approved these changes Dec 20, 2022

Move wildcard line to create_name()

315c451

github-actions bot added logical-expr Logical plan and expressions and removed optimizer Optimizer rules labels Dec 21, 2022

alamb approved these changes Dec 21, 2022

alamb merged commit bfef105 into apache:master Dec 21, 2022

bseifert-natzka mentioned this pull request Feb 2, 2023

Add line that prevents display_name from being called on Wildcard NatzkaLabs/arrow-datafusion-v14-patched#2

Merged
