Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minor: Add test case for reduce cross join #4577

Merged
merged 1 commit into from
Dec 10, 2022
Merged

Conversation

ygf11
Copy link
Contributor

@ygf11 ygf11 commented Dec 10, 2022

Which issue does this PR close?

Closes #.

This pr is to fix #4443 (review)
cc @liukun4515

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added the core Core DataFusion crate label Dec 10, 2022
for repartition_joins in test_repartition_joins {
let ctx = create_join_context("t1_id", "t2_id", repartition_joins)?;

let sql = "select *,t1.t1_id+11 from t1,t2 where t1.t1_id+11=t2.t2_id";
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

select *,t1.t1_id+10 from t1,t2 where t1.t1_id+10=t2.t2_id;
will generate empty result set, I change to t1_id+11.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you

@alamb alamb merged commit 65e248e into apache:master Dec 10, 2022
@ygf11 ygf11 deleted the test-case branch December 10, 2022 11:35
"Explain [plan_type:Utf8, plan:Utf8]",
" Projection: t1.t1_id, t1.t1_name, t1.t1_int, t2.t2_id, t2.t2_name, t2.t2_int, CAST(t1.t1_id AS Int64) + Int64(11) [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N, t2_id:UInt32;N, t2_name:Utf8;N, t2_int:UInt32;N, t1.t1_id + Int64(11):Int64;N]",
" Projection: t1.t1_id, t1.t1_name, t1.t1_int, t2.t2_id, t2.t2_name, t2.t2_int [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N, t2_id:UInt32;N, t2_name:Utf8;N, t2_int:UInt32;N]",
" Inner Join: t1.t1_id + Int64(11) = CAST(t2.t2_id AS Int64) [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N, t1.t1_id + Int64(11):Int64;N, t2_id:UInt32;N, t2_name:Utf8;N, t2_int:UInt32;N, CAST(t2.t2_id AS Int64):Int64;N]",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left project is ...CAST(t1.t1_id AS Int64) + Int64(11), and the right project is ...CAST(t2.t2_id AS Int64) AS CAST(t2.t2_id AS Int64)
But the condition of join is t1.t1_id + Int64(11) = CAST(t2.t2_id AS Int64).

Do we need to do type coercion for the exprs in the Join Plan after this pr #4353?

cc @ygf11

Copy link
Contributor Author

@ygf11 ygf11 Dec 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the condition of join is t1.t1_id + Int64(11) = CAST(t2.t2_id AS Int64).

The reason is projection will:

  • using expr.display_name() --- t1.t1_id + Int64(11) as field name to generate its schema, which we also use to find the field.
  • using full expr name --- CAST(t1.t1_id AS Int64) + Int64(11) to display the projection expression.

To address this gap, we can add alias or remove these projections #4389. I am working on #4389, and plan to submit a pr today or tomorrow.

Do we need to do type coercion for the exprs in the Join Plan after this pr #4353?

No, I think the type coercion already has done in this test case, because the join keys are from join filter which has been optimized.

https://github.com/apache/arrow-datafusion/blob/b822b0e5e582676eb58faf7fb89adc312dc95174/datafusion/expr/src/utils.rs#L481-L485

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants