Support struct coercion in type_union_resolution #12839

Merged
merged 10 commits into from
Oct 11, 2024

Conversation

jayzhan211
Contributor

@jayzhan211 jayzhan211 commented Oct 10, 2024

Which issue does this PR close?

Closes #12843.

Previously, I switched the coercion rule for array and values from comparison_coercion to type_union_resolution, but some types are not covered by type_union_resolution.

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

Signed-off-by: jayzhan211 <[email protected]>
@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Oct 10, 2024
@jayzhan211 jayzhan211 marked this pull request as draft October 10, 2024 01:04
Signed-off-by: jayzhan211 <[email protected]>
@github-actions github-actions bot added the logical-expr Logical plan and expressions label Oct 10, 2024
@alamb
Contributor

alamb commented Oct 10, 2024

The wasmtest CI failure has been fixed in #12844

@jayzhan211 jayzhan211 marked this pull request as ready for review October 10, 2024 14:22
Contributor

@alamb alamb left a comment

Thank you @jayzhan211 -- this looks great

This PR seems like an improvement to me, but I am a bit worried we still have a testing gap.

Since it is not clear what the actual semantics are supposed to be I can't fully tell if this fixes the problem @kazuyukitanimura was seeing

#12843 (comment)

return None;
}

let types = std::iter::zip(lhs.iter(), rhs.iter())
Contributor

I am not familiar with the semantics, but as I read this code it would not coerce structs with fields named the same but in a different order.

For example

{ 
  a: 1,
  b: 2
}

Would not be coerceable / comparable to

{
  b: 20, // note the fields are in a different order
  a: 10
}

Is that intended? 🤔
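
To make the concern concrete, here is a minimal sketch of positional struct coercion. It is not the DataFusion implementation: `Ty`, `unify_scalar`, `unify_struct_positional`, and `field` are hypothetical stand-ins for Arrow's `DataType` and fields, and the only widening rule modeled is Int64 to Float64. Because fields are paired by index, two structs with the same field names in a different order fail to unify.

```rust
// Hypothetical, simplified stand-in for Arrow's DataType (assumption, not the real type).
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Int64,
    Float64,
    Utf8,
    Struct(Vec<(String, Ty)>),
}

// Toy scalar rule: identical types unify; Int64 and Float64 widen to Float64.
fn unify_scalar(a: &Ty, b: &Ty) -> Option<Ty> {
    match (a, b) {
        (x, y) if x == y => Some(x.clone()),
        (Ty::Int64, Ty::Float64) | (Ty::Float64, Ty::Int64) => Some(Ty::Float64),
        _ => None,
    }
}

// Positional coercion: pair fields by index and require equal names at each index.
fn unify_struct_positional(lhs: &[(String, Ty)], rhs: &[(String, Ty)]) -> Option<Ty> {
    if lhs.len() != rhs.len() {
        return None;
    }
    let fields = std::iter::zip(lhs.iter(), rhs.iter())
        .map(|((ln, lt), (rn, rt))| {
            if ln != rn {
                // Same names in a different order end up here.
                return None;
            }
            unify_scalar(lt, rt).map(|t| (ln.clone(), t))
        })
        .collect::<Option<Vec<_>>>()?;
    Some(Ty::Struct(fields))
}

// Helper for building a (name, type) pair.
fn field(name: &str, ty: Ty) -> (String, Ty) {
    (name.to_string(), ty)
}
```

With this sketch, `{a, b}` against `{b, a}` yields `None`, while `{a: Int64, b: Int64}` against `{a: Float64, b: Int64}` unifies to a struct whose `a` field is Float64.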

Contributor Author

It seems like a bug 👍

Contributor Author

@jayzhan211 jayzhan211 Oct 11, 2024

We need to fix Values first for this case. Currently we don't have a schema (column types) for Values, so we can't tell which order is correct if we get two structs with different field orders. I think we should provide a schema when building Values, or assume the first struct is the correct one.

#5046
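
As a rough illustration of the "assume the first struct is the correct one" option, the sketch below reorders a later struct's fields to follow the first struct's order, matching by name. This is only a sketch under assumptions: `Field` is a hypothetical `(name, type name)` pair rather than Arrow's field type, `align_to_first` is an invented helper, and field names are assumed to be unique within a struct.

```rust
// Hypothetical stand-in for a struct field: (field name, type name).
type Field = (String, String);

// Reorder `other` to follow the field order of `first`, matching by name.
// Returns None when the two structs do not share the same field names.
// Assumes field names are unique within each struct.
fn align_to_first(first: &[Field], other: &[Field]) -> Option<Vec<Field>> {
    if first.len() != other.len() {
        return None;
    }
    first
        .iter()
        .map(|(name, _)| other.iter().find(|(n, _)| n == name).cloned())
        .collect()
}
```

Type coercion of the aligned fields would still have to run afterwards; this step only settles the ordering question.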


Ok(valid_types)
// Every signature failed, return the joined error
if res.is_empty() {
Contributor

👍 for better errors
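
The pattern in this hunk (try every candidate signature, and only if none succeeds report all failures together) can be sketched roughly as follows. The function `resolve_all`, its string-based types, and the error wording are hypothetical and chosen for illustration; they are not taken from the DataFusion code.

```rust
// Hypothetical sketch: try each candidate, keep the successes, and if every
// candidate failed, return one error that joins all individual messages.
fn resolve_all<F>(candidates: &[&str], try_one: F) -> Result<Vec<String>, String>
where
    F: Fn(&str) -> Result<String, String>,
{
    let mut res = Vec::new();
    let mut errors = Vec::new();
    for &candidate in candidates {
        match try_one(candidate) {
            Ok(value) => res.push(value),
            Err(e) => errors.push(e),
        }
    }
    // Every candidate failed: return the joined error instead of an empty list.
    if res.is_empty() {
        return Err(format!("no signature matched: {}", errors.join("; ")));
    }
    Ok(res)
}
```

Returning the joined message rather than only the last failure lets the user see why each candidate was rejected.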

@@ -106,6 +108,32 @@ impl ScalarUDFImpl for MakeArray {

fn coerce_types(&self, arg_types: &[DataType]) -> Result<Vec<DataType>> {
if let Some(new_type) = type_union_resolution(arg_types) {
// TODO: Move the logic to type_union_resolution if this applies to other functions as well
Contributor

Ah, this is the same question I had above -- this code seems to properly account for differences in field order

@@ -373,3 +373,30 @@ You reached the bottom!

statement ok
drop view complex_view;

# struct with different keys r1 and r2 is not valid
Contributor

Could you also please add a test that shows what happens with coercion when the fields are in a different order?

Like this

> create table t(a struct<r1 varchar, c int>) as values ({r1: 'foo', c:1}), ({c:2, r: 'bar'});

On main it seems to get an error:

Error during planning: Inconsistent data type across values list at row 1 column 0. Was Struct([Field { name: "r1", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "c", data_type: Int64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]) but found Struct([Field { name: "c", data_type: Int64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "r", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }])

Contributor

@kazuyukitanimura kazuyukitanimura left a comment

Thank you @jayzhan211 for fixing so quickly

@jayzhan211 jayzhan211 marked this pull request as draft October 11, 2024 02:33
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
@jayzhan211 jayzhan211 marked this pull request as ready for review October 11, 2024 05:06
@jayzhan211
Contributor Author

I will take a look at #5046 and at struct handling in coalesce

@jayzhan211 jayzhan211 merged commit 3b6aac2 into apache:main Oct 11, 2024
24 checks passed
@jayzhan211 jayzhan211 deleted the array-coercion branch October 11, 2024 05:13
@jayzhan211
Contributor Author

Thanks @alamb @kazuyukitanimura

(row('purple', 1), row('green', 2.3));

# out of order struct literal
# TODO: This query should not fail
Contributor

👍

Labels
logical-expr Logical plan and expressions sqllogictest SQL Logic Tests (.slt)
Development

Successfully merging this pull request may close these issues.

Regression on coercing Array of Structs
3 participants