Optimize the concat_ws function #3869
Signed-off-by: remzi <[email protected]>
This is beautiful @HaoYang670
@@ -53,6 +53,12 @@ impl Literal for String {
    }
}

impl Literal for &String {
"TIL" lit() 👍
for arg in args {
    match arg {
        // filter out null args
        Expr::Literal(ScalarValue::Utf8(None) | ScalarValue::LargeUtf8(None)) => {}
        Expr::Literal(ScalarValue::Utf8(Some(v)) | ScalarValue::LargeUtf8(Some(v))) => {
            match contiguous_scalar {
                None => contiguous_scalar = Some(v.to_string()),
                Some(mut pre) => {
                    pre += delimiter;
                    pre += v;
                    contiguous_scalar = Some(pre)
                }
            }
        }
        Expr::Literal(s) => return Err(DataFusionError::Internal(format!("The scalar {} should be casted to string type during the type coercion.", s))),
        // If the arg is not a literal, we should first push the current `contiguous_scalar`
        // to the `new_args` and reset it to None.
        // Then pushing this arg to the `new_args`.
        arg => {
            if let Some(val) = contiguous_scalar {
                new_args.push(lit(val));
            }
            new_args.push(arg.clone());
            contiguous_scalar = None;
        }
    }
}
if let Some(val) = contiguous_scalar {
    new_args.push(lit(val));
}
This pattern of creating the contiguous scalar is so similar -- I wonder if it could be extracted out into a function -- perhaps as a follow on PR
Thank you for reviewing @alamb
The logic for concat and concat_ws is a little different, because in concat_ws we must consider the delimiter and we can't ignore the empty string literals. I will try to find a way to refactor them.
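To make the point about the delimiter and empty strings concrete, here is a minimal, self-contained sketch of the folding pass for a literal delimiter. The `Arg` enum and the `lit` and `fold_literals` functions are hypothetical stand-ins for DataFusion's `Expr`, `ScalarValue`, and the simplifier code, so this illustrates the idea and is not the code in this PR.

```rust
/// Hypothetical stand-in for a `concat_ws` argument (not DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq)]
enum Arg {
    /// A string literal; `None` models a null literal.
    Lit(Option<String>),
    /// Any non-literal expression, identified here by a column name.
    Col(String),
}

/// Convenience constructor for a non-null string literal.
fn lit(s: &str) -> Arg {
    Arg::Lit(Some(s.to_string()))
}

/// Merge each run of adjacent non-null literals into a single literal joined
/// by `delimiter`, dropping null literals along the way.
fn fold_literals(delimiter: &str, args: &[Arg]) -> Vec<Arg> {
    let mut new_args = Vec::with_capacity(args.len());
    // The literal accumulated so far for the current run, if any.
    let mut run: Option<String> = None;
    for arg in args {
        match arg {
            // Null literals are skipped; they do not end the current run.
            Arg::Lit(None) => {}
            // Empty strings are kept: they still contribute a delimiter.
            Arg::Lit(Some(v)) => {
                run = Some(match run.take() {
                    None => v.clone(),
                    Some(mut pre) => {
                        pre.push_str(delimiter);
                        pre.push_str(v);
                        pre
                    }
                });
            }
            // A non-literal ends the run: flush it, then keep the argument.
            other => {
                if let Some(val) = run.take() {
                    new_args.push(Arg::Lit(Some(val)));
                }
                new_args.push(other.clone());
            }
        }
    }
    // Flush a trailing run of literals.
    if let Some(val) = run {
        new_args.push(Arg::Lit(Some(val)));
    }
    new_args
}
```

With the delimiter "-", the arguments `lit("a"), lit(""), Arg::Lit(None), lit("b"), Arg::Col("c"), lit("d")` fold to `Lit("a--b"), Col("c"), Lit("d")`: the empty string still adds a delimiter, while the null literal disappears. A `concat` version of the same pass could drop empty strings entirely, which is why the two are not trivially shared.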
            args: new_args,
        }
    }
} => simpl_concat(args)?,
❤️
// the delimiter is not a literal
{
    let expr = concat_ws(col("c"), vec![lit("a"), null.clone(), lit("b")]);
    let expected = concat_ws(col("c"), vec![lit("a"), lit("b")]);
so cool!
Thanks @HaoYang670
Signed-off-by: remzi [email protected]
Which issue does this PR close?
Closes #3856.
Closes #3857.
Rationale for this change
Simplify the concat_ws expression:
- simplify to null if the delimiter is null
- filter out null arguments
- use concat to replace concat_ws if the delimiter is an empty string
What changes are included in this PR?
Are there any user-facing changes?
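As an illustration of the three rewrites listed under the rationale, the sketch below dispatches on the delimiter using a hypothetical, simplified `Expr` enum and a hypothetical `simplify_concat_ws` function. Neither is DataFusion's actual type or API, and the merging of adjacent literals is deliberately left out to keep the dispatch visible.

```rust
/// Hypothetical, simplified expression type (not DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    /// A null literal.
    Null,
    /// A non-null string literal.
    Str(String),
    /// A column reference, i.e. a non-literal expression.
    Col(String),
    /// `concat(args...)`.
    Concat(Vec<Expr>),
    /// `concat_ws(delimiter, args...)`.
    ConcatWs(Box<Expr>, Vec<Expr>),
}

/// Apply the three rewrites from the rationale to `concat_ws(delimiter, args)`.
fn simplify_concat_ws(delimiter: &Expr, args: &[Expr]) -> Expr {
    // Null arguments are dropped whenever the delimiter is not null.
    let non_null: Vec<Expr> = args
        .iter()
        .filter(|a| **a != Expr::Null)
        .cloned()
        .collect();
    match delimiter {
        // A null delimiter makes the whole expression null.
        Expr::Null => Expr::Null,
        // An empty-string delimiter turns `concat_ws` into `concat`.
        Expr::Str(d) if d.is_empty() => Expr::Concat(non_null),
        // Any other delimiter, literal or not, keeps `concat_ws`,
        // now without the null arguments.
        other => Expr::ConcatWs(Box::new(other.clone()), non_null),
    }
}
```

The last arm covers the non-literal delimiter case from the test shown in the review: with a column as the delimiter, null arguments can still be removed even though the remaining literals cannot be joined ahead of time.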