Support spark.sql.mapKeyDedupPolicy=LAST_WIN for TransformKeys #5505
Conversation
Signed-off-by: Andy Grove <[email protected]>
build
Moving back to draft. I need to make sure
Actually that is an existing bug in this. So you can fix it here if you want, but it does not have to be. This is a little more complex than #5546 because we can add an arbitrary new key. Before we could assume that there were no null keys, but here we cannot. So if you want to be exact, then it should look something like:
Because if a null is ever returned, then we have to throw an exception. A DataType in Spark itself has no indication of nullable vs. not nullable. Only when it is wrapped in another nested type does that start to show up. So with the information we have, there is no way for us to know that the side effect is not possible.
Actually I am wrong. The expression itself is nullable. I feel dumb. It should be something like:
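The suggestion above boils down to: only pay for a null-key check when the key expression's `nullable` flag says a null could actually be produced, and throw if one is. Here is a minimal plain-Scala sketch of that idea; `KeyExpr`, `transformKeys`, and the error message are hypothetical stand-ins, not the spark-rapids implementation.

```scala
object MapKeyNullCheck {
  // Hypothetical stand-in for a Spark key expression: `nullable`
  // plays the role of Expression.nullable, `eval` the transformation.
  final case class KeyExpr[K](nullable: Boolean, eval: K => K)

  // Transform all keys, adding the null check only when the
  // expression is nullable (as the review comment suggests).
  def transformKeys[K, V](entries: Seq[(K, V)], key: KeyExpr[K]): Seq[(K, V)] = {
    val transformed = entries.map { case (k, v) => (key.eval(k), v) }
    if (key.nullable && transformed.exists(_._1 == null)) {
      // Spark rejects null map keys, so a produced null must raise.
      throw new RuntimeException("Cannot use null as map key")
    }
    transformed
  }
}
```

When `nullable` is false the `exists` scan is skipped entirely, which is the point of checking the expression's nullability up front.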
sql-plugin/src/main/scala/com/nvidia/spark/rapids/higherOrderFunctions.scala
build
Closes #5325

Adds support for spark.sql.mapKeyDedupPolicy=LAST_WIN for TransformKeys.
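For readers unfamiliar with the policy: under LAST_WIN, when two entries end up with the same key after transformation, the later entry's value wins while the key keeps its first position. A small plain-Scala sketch of those semantics (not the spark-rapids GPU implementation), assuming entry order is the map's original order:

```scala
object LastWinDedup {
  // LAST_WIN dedup: on a key collision, keep the value of the later
  // entry, but leave the key at the position of its first occurrence.
  def dedupLastWin[K, V](entries: Seq[(K, V)]): Seq[(K, V)] = {
    // LinkedHashMap preserves first-insertion order while `update`
    // replaces the value, which matches the LAST_WIN behavior.
    val m = scala.collection.mutable.LinkedHashMap.empty[K, V]
    entries.foreach { case (k, v) => m(k) = v }
    m.toSeq
  }
}
```

So transforming keys of `map(1 -> "a", 2 -> "b")` with a function that collapses 1 and 2 to the same key keeps a single entry holding "b".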