-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: move coercion of union from builder to TypeCoercion
#11961
Conversation
@@ -1881,23 +1799,6 @@ mod tests { | |||
Ok(()) | |||
} | |||
|
|||
#[test] | |||
fn plan_builder_union_different_num_columns_error() -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to slt.
@@ -127,7 +130,14 @@ mod tests { | |||
} | |||
|
|||
fn assert_optimized_plan_equal(plan: LogicalPlan, expected: &str) -> Result<()> { | |||
assert_optimized_plan_eq(Arc::new(EliminateNestedUnion::new()), plan, expected) | |||
let options = ConfigOptions::default(); | |||
let analyzed_plan = Analyzer::with_rules(vec![Arc::new(TypeCoercion::new())]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add TypeCoercion to avoid breaking the tests.
05)--------Projection: Int64(1) AS a | ||
06)----------EmptyRelation | ||
07)--Projection: Float64(2.1) + x.a AS Float64(0) + x.a | ||
08)----Aggregate: groupBy=[[Float64(2.1) + CAST(x.a AS Float64)]], aggr=[[]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the only difference I observed from the previous test results because type coercion was performed, and x.a
had an additional cast.
@@ -2176,148 +2176,6 @@ fn union_all() { | |||
quick_test(sql, expected); | |||
} | |||
|
|||
#[test] | |||
fn union_with_different_column_names() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Converting these union tests to SLT is easier than moving them to TypeCoercion
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @jonahgao -- I agree it makes much more sense to have the type coercion for union happening as part of TypeCoercsion
rather than the SQL planner / builder.
return plan_err!("Empty UNION"); | ||
} | ||
|
||
// Temporarily use the schema from the left input and later rely on the analyzer to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
05)----EmptyRelation | ||
|
||
# union_with_incompatible_data_type() | ||
query error Error during planning: UNION Column 'Int64\(1\)' \(type: Int64\) is not compatible with other type: Interval\(MonthDayNano\) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that the error refers to the types in reverse order than they appear in the query
The error message might be better if it were something more like
Incompatible inputs for Union. Previous inputs were of type Interval\(MonthDayNano\), got incomaptible type 'Int64\(1\)' \(type: Int64\)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks @alamb
@@ -809,7 +809,7 @@ fn coerce_case_expression(case: Case, schema: &DFSchema) -> Result<Case> { | |||
} | |||
|
|||
/// Get a common schema that is compatible with all inputs of UNION. | |||
fn coerce_union_schema(inputs: Vec<Arc<LogicalPlan>>) -> Result<DFSchema> { | |||
fn coerce_union_schema(inputs: &Vec<Arc<LogicalPlan>>) -> Result<DFSchema> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
Looks great @jonahgao -- thank you |
Thanks for the review @alamb |
…pache#11961)" This reverts commit afa23ab.
…pache#11961)" This reverts commit afa23ab.
Which issue does this PR close?
Closes #11742.
Rationale for this change
Currently, in the LogicalPlan builder, we coerce the inputs of a union plan to a common schema.
But this common schema is inaccurate because it lacks type coercion for expressions.
For example, in #11742, the return type of the expression
nvl(v1, 0.5)
is INT before type coercion and will become Float64 afterward.The fix is to place coercion of the union after the type coercion of expressions.
What changes are included in this PR?
TypeCoercion
.Are these changes tested?
Yes
Are there any user-facing changes?
Yes
project_with_column_index
has become private. It was introduced for UNION in #2108, and I think it should only be used internally.