-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: native map/array constructors #4232
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice one @agavra
There are some comments inline, plus:
- Syntax wise, all our other constructors use standard brackets. I think array should too. I know Postgres supports both
ARRAY[]
andARRAY()
. But IMHO I think it's cleaner to leave[]
for array access, not construction. - This PR would benefit from RQTT test cases for
INSERT VALUES
. I'm confident this will flag up missing functionality in the coercer code. - At the moment it looks like its not possible to have an empty arrays or maps, or arrays and maps with only null elements / values. I think this is fine for this initial PR, but this is a hole we'll want to fix quickly. We should raise, (and link), suitable issues to track this and, ideally, get this done soon. One possible solution is to support casting elements, e.g. Postgres supports
ARRAY(null)::int[]
,ARRAY()::int[]
, which casts all the elements in the array toINT
and the array type toARRAY<INT>
. - You've dropped
as_array
. Is there the option to keep it until just before we go version 1.0? We could mark it as deprecated in the UDFs description and remove it from the docs. We could even log a warning when a query starts that uses it. Thereby giving people time to move over.
ksql-execution/src/main/java/io/confluent/ksql/execution/util/ExpressionTypeManager.java
Outdated
Show resolved
Hide resolved
.collect(Collectors.toList()); | ||
|
||
if (sqlTypes.size() == 0) { | ||
throw new KsqlException("Cannot extract type from array of nulls!"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again with the exclamation marks ;)
If this is customer facing, which as a KsqlException
I'm assuming it is, then maybe something like:
Can not construct array with all NULL elements.
You could even link to the github issue that will add support for empty/all-null arrays so that people can upvote it.
Can the user work around this by casting one of the NULLs to a type? Maybe include this in the error message and add a QTT test case.
What about for INSERT VALUES
where the required array type is known.... can we support all nulls then? How about empty arrays?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the user work around this by casting one of the NULLs to a type? Maybe include this in the error message and add a QTT test case.
Yes and done.
What about for INSERT VALUES where the required array type is known.... can we support all nulls then? How about empty arrays?
This is not a trivial change. I've thought about this in the context of various things (i.e. struct type inference) and it's out of scope for this PR, though technically possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we link in a github issue to track this second part please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ksql-execution/src/main/java/io/confluent/ksql/execution/util/ExpressionTypeManager.java
Outdated
Show resolved
Hide resolved
ksql-execution/src/main/java/io/confluent/ksql/execution/util/ExpressionTypeManager.java
Outdated
Show resolved
Hide resolved
ksql-execution/src/main/java/io/confluent/ksql/execution/util/ExpressionTypeManager.java
Outdated
Show resolved
Hide resolved
|
||
if (sqlTypes.size() != 1) { | ||
throw new KsqlException( | ||
"Invalid map expression! All values must be of the same type. " + exp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As with ARRAYs, it might help the user if the error included the types of the values, so that they could more easily see where the issue is., e.g. a UDF that returns a type different to what they thought.
Maybe add details of how the user could use CAST
to fix the issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a better error message, but I don't think we need to tell the user to CAST
as that may not be the right solution for them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the issue with suggesting CAST
as a possible solution? We're not forcing them to use it...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough
} | ||
|
||
if (sqlTypes.get(0) == null) { | ||
throw new KsqlException("Maps do not accept null values!"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe something like:
Can not construct MAP with all NULL values.
You could even link to the github issue that will add support for empty/all-null arrays so that people can upvote it.
Can the user work around this by casting one of the NULLs to a type? Maybe include this in the error message and add a QTT test case.
What about for INSERT VALUES
where the required map type is known.... can we support all nulls then? How about empty maps?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Second part you've covered above. Lets just make sure its captured in an issue, maybe linking to it from the error.
Still think the error message could be improve to:
| Can not construct MAP with all NULL values.
And add the trick of using CAST
.
ksql-functional-tests/src/test/resources/query-validation-tests/asarray.json
Outdated
Show resolved
Hide resolved
ksql-functional-tests/src/test/resources/query-validation-tests/asmap.json
Outdated
Show resolved
Hide resolved
Thanks for the review @big-andy-coates! I agree with your comments about testing and casting and I'll add those in. I don't think we should keep For the array constructor syntax, I think I'd prefer to keep it as is (which is the sql99 standard: http://web.cecs.pdx.edu/~len/sql1999.pdf):
A |
I'm fully supportive of the square-bracket syntax for array creation - it
is, after all, a language construct and not a function (at least it is now!
) and this matches the systems i've used in the past.
Introducing new syntax for CASTing, a la postgres, seems iffy though
without further discussion.
…On Tue, Jan 7, 2020 at 9:24 AM Almog Gavra ***@***.***> wrote:
Thanks for the review @big-andy-coates
<https://github.com/big-andy-coates>! I agree with your comments about
testing and casting and I'll add those in.
I don't think we should keep AS_ARRAY because anyone that is upgrading to
0.7 will already have lots of breaking changes to make and this one isn't
hard to change.
For the array constructor syntax, I think I'd prefer to keep it as is
(which is the sql99 standard: http://web.cecs.pdx.edu/~len/sql1999.pdf):
<array value list constructor> ::= ARRAY <left bracket or trigraph> <array
element list> <right bracket or trigraph>
A trigraph is ??(, which very few sql implementations seem to support.
All examples I've found online of sql systems use the ARRAY[] syntax.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#4232?email_source=notifications&email_token=ABCXJIBWZWBX6NAFNHDPRSTQ4S3DJA5CNFSM4KDNZDI2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEIJT4LA#issuecomment-571686444>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABCXJIC6JAHK56CXQVTJOIDQ4S3DJANCNFSM4KDNZDIQ>
.
|
a22eff5
to
841a14d
Compare
ksql-execution/src/main/java/io/confluent/ksql/execution/util/ExpressionTypeManager.java
Outdated
Show resolved
Hide resolved
.collect(Collectors.toList()); | ||
|
||
if (sqlTypes.size() == 0) { | ||
throw new KsqlException("Cannot extract type from array of nulls!"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we link in a github issue to track this second part please?
throw new KsqlException("Only STRING keys are supported in maps but got: " + gkeyTypes); | ||
} | ||
|
||
final List<SqlType> sqlTypes = exp.getMap() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
super nit: can we call this valueTypes
and the other keyTypes
to make it clear?
} | ||
|
||
if (sqlTypes.get(0) == null) { | ||
throw new KsqlException("Maps do not accept null values!"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Second part you've covered above. Lets just make sure its captured in an issue, maybe linking to it from the error.
Still think the error message could be improve to:
| Can not construct MAP with all NULL values.
And add the trick of using CAST
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @agavra for addressing my suggestions. Agree with everything else. Couple of points / nits below.
Andy
@@ -7,7 +7,7 @@ | |||
"name": "construct a list from two elements", | |||
"statements": [ | |||
"CREATE STREAM TEST (a INT, b INT) WITH (kafka_topic='test_topic', value_format='JSON');", | |||
"CREATE STREAM OUTPUT AS SELECT as_array(a, b, 3) as l FROM TEST;" | |||
"CREATE STREAM OUTPUT AS SELECT ARRAY[a, b, 3] as l FROM TEST;" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might want to rename this test file... it's currently asarray.json
and AS_ARRAY
no longer exists ;) . How about arrays.json
:p
@@ -3,6 +3,19 @@ | |||
"Tests covering map creation" | |||
], | |||
"tests": [ | |||
{ | |||
"name": "create map from named tuples", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
likewise, consider renaming this test file.
{ | ||
"name": "should handle arbitrary nested expressions", | ||
"statements": [ | ||
"CREATE STREAM TEST (I INT, A ARRAY<ARRAY<INT>>) WITH (kafka_topic='test_topic', value_format='JSON');", | ||
"INSERT INTO TEST (I, A) VALUES (-1, ARRAY[ARRAY[1]]);" | ||
], | ||
"inputs": [ | ||
], | ||
"outputs": [ | ||
{"topic": "test_topic", "key": null, "value": {"I": -1, "A": [[1]]}} | ||
] | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we're fully testing the coercer with this test as the 1
from ARRAY[ARRAY[1]]
will default to a IntegerLiteral
and the type is ARRAY<ARRAY<INT>>
.
What happens if you make it ARRAY<ARRAY<BIGINT>>
?
Same goes for the map test below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
BREAKING CHANGE: the old syntax for creating arrays using the builtin AS_ARRAY function will no longer work
BREAKING CHANGE: the old syntax for creating arrays using the
builtin AS_ARRAY function will no longer work
Description
Follow-up on #4120 - this PR creates maps and arrays natively instead of going through the UDF path. This has the benefit of custom syntax and stricter type checking.
Testing done
Reviewer checklist