Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

array[] / make_array() will error if the types can not be coerced to a common type #3170

Open
alamb opened this issue Aug 15, 2022 · 11 comments
Labels
bug Something isn't working

Comments

@alamb
Copy link
Contributor

alamb commented Aug 15, 2022

Describe the bug
make_array() will error if the types can not be coerced to a common type, as in postgres and spark

To Reproduce

Run this query (see query_array_scalar_coerce test added in #3122)

SELECT [1, 2, '3']

This results in an error:

"Error during planning: Coercion from [Int64, Int64, Utf8] to the signature VariadicEqual failed.",);

Expected behavior
I expect this to return like postgres where it has coerced all the arguments to int (tried to coerce all arguments to the same type as the first, which in this case is 1 and thus int

alamb=# select array[1, 2, '3'];
  array  
---------
 {1,2,3}
(1 row)

Additional context
Found in #3122

@alamb alamb added the bug Something isn't working label Aug 15, 2022
@alamb
Copy link
Contributor Author

alamb commented Aug 16, 2022

I actually think in general the coercion rules for uniform should attempt to coerce all arguments to the same type as the first argument

@comphead
Copy link
Contributor

@alamb this issue description confronts with #3123
Which one should be done?

@alamb
Copy link
Contributor Author

alamb commented Aug 31, 2022

@comphead I think #3123 and this ticket are different

This ticket is about the input types of the arguments to array[]

#3123 is about the output type produced by the array[] function

@comphead
Copy link
Contributor

comphead commented Sep 2, 2022

@alamb I checked 3 options

DataFusion CLI v11.0.0
❯ select make_array(1,2,'3');
+----------------------------------------+
| makearray(Int64(1),Int64(2),Utf8("3")) |
+----------------------------------------+
| [1, 2, 3]                              |
+----------------------------------------+
1 row in set. Query took 0.006 seconds.
❯ select [1, 2, '3'];
NotImplemented("Arrays with different types are not supported: {Int64, Utf8}")
❯ select array[1,2,'3'];
NotImplemented("Arrays with different types are not supported: {Utf8, Int64}") 

here I can see nonconsistent behaviuor. I suppose all 3 should try coerce to first datatype and fail if its not possible?

@alamb
Copy link
Contributor Author

alamb commented Sep 4, 2022

here I can see nonconsistent behaviuor. I suppose all 3 should try coerce to first datatype and fail if its not possible?

I think that would make sense -- thank you for looking into this @comphead

@comphead
Copy link
Contributor

@alamb it appears we have 2 code bases for supporting arrays:

  • built in scalar function that drives make_array(x, y) function
  • sql planner that drives array[x, y] or [x, y] constructions.

All those constructions do the same work, but they can be out of sync like now. do we need to support all 2 branches?

@alamb
Copy link
Contributor Author

alamb commented Sep 21, 2022

All those constructions do the same work, but they can be out of sync like now. do we need to support all 2 branches?

I am sorry for the delay in response. The short answer is I don't really know what the desired / correct behavior is here -- I think array[] syntax is from postgres and make_array() (formerly array()) is from Spark

@comphead
Copy link
Contributor

Yes, we can support all 3 of them. My concern there are 2 different codebases, and no easy way to find a common denominator for them. I can make all 3 functions works the same way, but in future those codebases can be out of sync again, leading make_array works differently comparing to array or [].

@alamb
Copy link
Contributor Author

alamb commented Sep 22, 2022

🤔

@comphead
Copy link
Contributor

@alamb one more thing I found
make_array works on coerce expressions, and coerce doesn' consider first element as driving cast element. Rather it finds common type among all array values. Are you ok to with current behaviuor or you would like to change it to make first datatype as target type?

@alamb
Copy link
Contributor Author

alamb commented Sep 30, 2022

Are you ok to with current behaviuor or you would like to change it to make first datatype as target type?

Ideally we would follow some well established model (like posgres) but if that is too challenging I think we can keep the existing behavior too

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
2 participants