Add line that prevents display_name from being called on Wildcard #4682

Merged

@andre-cc-natzka andre-cc-natzka commented Dec 20, 2022

Which issue does this PR close?

Closes #4681.

Rationale for this change

As detailed in the issue, create_name() was being called on Expr::Wildcard in the type_coercion and unwrap_cast_in_comparison optimization rules. Because create_name() does not support wildcards, these optimization rules ended up being skipped instead of applied.
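To make the failure mode concrete, here is a minimal sketch using toy stand-ins rather than DataFusion's real types. The `Expr` enum, the `rewrite_rule` function, the `optimize` driver, and the error text are all assumptions made for illustration. The only point is that a naming error on a wildcard causes the whole rule to be skipped, leaving the plan unchanged.

```rust
// Toy stand-ins for illustration only; these are NOT DataFusion's real types.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Wildcard,
}

// Mimics the pre-fix behaviour: asking for the name of a wildcard is an error.
// The error text is illustrative.
fn create_name(expr: &Expr) -> Result<String, String> {
    match expr {
        Expr::Column(name) => Ok(name.clone()),
        Expr::Wildcard => Err("create_name does not support wildcard".to_string()),
    }
}

// Hypothetical rule: it needs the name of every expression it rewrites,
// so a single unnamed expression makes the whole rule fail.
fn rewrite_rule(exprs: &[Expr]) -> Result<Vec<Expr>, String> {
    let mut out = Vec::with_capacity(exprs.len());
    for e in exprs {
        let name = create_name(e)?;
        out.push(Expr::Column(name.to_uppercase()));
    }
    Ok(out)
}

// Hypothetical driver: when a rule fails, keep the original expressions
// and report that the rule was skipped (the `false` flag).
fn optimize(exprs: Vec<Expr>) -> (Vec<Expr>, bool) {
    match rewrite_rule(&exprs) {
        Ok(rewritten) => (rewritten, true),
        Err(_) => (exprs, false),
    }
}
```

With only columns the rule is applied; as soon as a wildcard appears in the list, the rule is skipped and the input comes back untouched.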

What changes are included in this PR?

In the issue description, two different solutions were suggested. Here, I chose the first one: add one line to the name_for_alias function that handles the Expr::Wildcard case separately and gives the expression a reasonable name (e.g. "*") instead of calling display_name().

Are these changes tested?

No tests were added in this PR, but we tested the suggested change locally and it fixes the bug: the warnings are gone and the optimization rules are applied as intended.

If tests are not included in your PR, please explain why (for example, are they covered by existing tests)?

We still want to hear what you think about the best way to solve the issue.

If there are user-facing changes then we may require documentation to be updated before approving the PR.

No.

If there are any breaking changes to public APIs, please add the api change label.

No.

@github-actions github-actions bot added the optimizer Optimizer rules label Dec 20, 2022

@alamb alamb left a comment

Thanks @andre-cc-natzka

I think this change is fine and we could merge it as is -- however, I suggest we update Expr::display_name instead

@@ -583,6 +583,7 @@ where
 fn name_for_alias(expr: &Expr) -> Result<String> {
     match expr {
         Expr::Sort { expr, .. } => name_for_alias(expr),
+        Expr::Wildcard => Ok("*".to_string()),
         expr => expr.display_name(),

As you note in your PR description, another way to fix this would be to support Expr::Wildcard in Expr::display_name

Unless there is some reason not to support Wildcard in Expr::display_name, I would personally prefer adding the support there. It seems strange to put code for Wildcard in a function that seems to be trying to unwrap Expr::Sorts 🤔

https://github.com/apache/arrow-datafusion/blob/fddb3d3651041f41d66a801f10e27387e84374f7/datafusion/expr/src/expr.rs#L1358-L1360

Thank you very much @alamb! I agree with you, so I've just moved that line to the create_name function called by Expr::display_name. If this is fine with you, I guess we can merge it.
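For readers following along, the shape of the revised fix can be sketched roughly as below. This is a simplified model, not the actual DataFusion source: the `Expr` enum is a toy with three variants, errors are plain `String`s, and the assumption that `create_name` rejects `Sort` expressions is made only to show why `name_for_alias` unwraps them first.

```rust
// Toy stand-ins for illustration only; not DataFusion's actual definitions.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Sort { expr: Box<Expr>, asc: bool },
    Wildcard,
}

// The wildcard case now lives here, so every caller of display_name benefits.
fn create_name(e: &Expr) -> Result<String, String> {
    match e {
        Expr::Column(name) => Ok(name.clone()),
        // Assumption for this sketch: sort expressions have no name of their own.
        Expr::Sort { .. } => Err("create_name does not support sort expression".to_string()),
        Expr::Wildcard => Ok("*".to_string()),
    }
}

impl Expr {
    // display_name simply delegates to create_name.
    fn display_name(&self) -> Result<String, String> {
        create_name(self)
    }
}

// name_for_alias goes back to doing one job: unwrapping Sort before naming.
fn name_for_alias(expr: &Expr) -> Result<String, String> {
    match expr {
        Expr::Sort { expr, .. } => name_for_alias(expr),
        expr => expr.display_name(),
    }
}
```

Placing the wildcard arm in `create_name` keeps `name_for_alias` focused on unwrapping sorts, which is the concern raised in the review comment above.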

@github-actions github-actions bot added logical-expr Logical plan and expressions and removed optimizer Optimizer rules labels Dec 21, 2022
@alamb alamb merged commit bfef105 into apache:master Dec 21, 2022
ursabot commented Dec 21, 2022

Benchmark runs are scheduled for baseline = ac2e5d1 and contender = bfef105. bfef105 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@andre-cc-natzka

Thanks for taking care of this! Have a Merry Christmas :)

Successfully merging this pull request may close these issues.

Skipping optimizer rule due to create_name not supporting wildcard