[FEA] support ConcatWs sql function #63
Comments
Follow-up work on cudf, concatenating arrays of strings: rapidsai/cudf#7727
The Spark behavior of concat_ws:
Separator parameter API differences:
Null behavior:
Behavior:
Note that concat and concat_ws behave differently for nulls when all inputs in a row are null: concat returns null, while concat_ws returns an empty string.
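A minimal sketch of that difference, written as a plain-Python model rather than Spark or cudf code. The function names `concat_model` and `concat_ws_model` are hypothetical, and `None` stands in for a SQL null.

```python
from typing import Optional


def concat_model(*values: Optional[str]) -> Optional[str]:
    """Model of concat: any null input makes the whole result null."""
    if any(v is None for v in values):
        return None
    return "".join(values)


def concat_ws_model(sep: Optional[str], *values: Optional[str]) -> Optional[str]:
    """Model of concat_ws: a null separator gives null; null values are skipped."""
    if sep is None:
        return None
    return sep.join(v for v in values if v is not None)


if __name__ == "__main__":
    print(concat_model(None, None))                 # None
    print(repr(concat_ws_model("-", None, None)))   # ''
    print(concat_ws_model("-", "a", None, "b"))     # a-b
```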
Similarly, cudf's behavior for arrays with nulls doesn't match Spark: cudf produces null if any element in the array is null, whereas Spark skips the null elements. If all elements are null, Spark returns an empty string.
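The mismatch can be illustrated with two small pure-Python models. These only mirror the behavior described in this comment; `spark_style_join` and `cudf_style_join` are hypothetical names, not functions from either library.

```python
from typing import List, Optional


def spark_style_join(sep: str, items: List[Optional[str]]) -> str:
    """Spark-like: null elements are skipped, so all nulls gives ''."""
    return sep.join(x for x in items if x is not None)


def cudf_style_join(sep: str, items: List[Optional[str]]) -> Optional[str]:
    """cudf-like (as described above): any null element nulls the result."""
    if any(x is None for x in items):
        return None
    return sep.join(items)


if __name__ == "__main__":
    row = ["a", None, "c"]
    print(spark_style_join("-", row))               # a-c
    print(cudf_style_join("-", row))                # None
    print(repr(spark_style_join("-", [None, None])))  # ''
```

A plugin that wants Spark-compatible results would therefore need to drop null elements before (or while) joining, rather than relying on the null-propagating behavior.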
Array handling for concat is different.
I discovered another weird case on the CPU: if you concatenate an empty array and then another value, Spark leaves off the separator.
Another case: arrays with a null in the middle.
Example of Spark with an array of nulls: it doesn't matter how many nulls are in the array; as long as every element is null, Spark skips the array and leaves off the separator.
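The array cases above (empty array, null in the middle, array of all nulls) are all consistent with one rule: flatten the arguments, drop nulls, then join. The sketch below models that rule in plain Python. `concat_ws_flatten` is a hypothetical name, and this is an assumption about how to reproduce the observed CPU results, not the Spark implementation.

```python
from typing import List, Optional, Union

# An argument is a null, a string, or an array of nullable strings.
Arg = Union[None, str, List[Optional[str]]]


def concat_ws_flatten(sep: str, *args: Arg) -> str:
    """Flatten string and array arguments, skip nulls, then join with sep."""
    parts: List[str] = []
    for arg in args:
        if arg is None:
            continue  # a null argument contributes nothing
        if isinstance(arg, list):
            # Array elements are flattened; null elements are skipped.
            parts.extend(x for x in arg if x is not None)
        else:
            parts.append(arg)
    return sep.join(parts)


if __name__ == "__main__":
    print(concat_ws_flatten("-", [], "b"))                # b
    print(concat_ws_flatten("-", ["a", None, "c"], "d"))  # a-c-d
    print(concat_ws_flatten("-", [None, None], "b"))      # b
```

Because an empty array and an all-null array both contribute zero parts, no separator is emitted for them, which matches the cases described above.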
cudf Java layer PR: rapidsai/cudf#8289
Is your feature request related to a problem? Please describe.
It would be great to support the concat_ws SQL function.
rapidsai/cudf#3726 was filed to get support from cudf.