-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix compilation, change row_number() expr_fn to 0 args #12043
Conversation
@@ -31,8 +31,8 @@ use datafusion_expr::{Expr, PartitionEvaluator, Signature, Volatility, WindowUDF | |||
|
|||
/// Create a [`WindowFunction`](Expr::WindowFunction) expression for | |||
/// `row_number` user-defined window function. | |||
pub fn row_number(args: Vec<Expr>) -> Expr { | |||
Expr::WindowFunction(WindowFunction::new(row_number_udwf(), args)) | |||
pub fn row_number() -> Expr { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note row_number() does not take an argument -- it was erroniously changed here #12030 and it happens we didn't have other tests for it
> select row_number() OVER ();
+-----------------------------------------------------------------------+
| ROW_NUMBER() ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING |
+-----------------------------------------------------------------------+
| 1 |
+-----------------------------------------------------------------------+
1 row(s) fetched.
Elapsed 0.042 seconds.
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given there is signature Any(0) but the test in round_trip still works.
I guess we should add data_types_with_window_udf
somewhere
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we should add data_types_with_window_udf somewhere
I don't understand this comment 🤔
Thank you for the quick review @jayzhan211 -- I am going to merge this in to get the CI green again. I am happy to make additional changes as a follow on PR if other reviewers have suggestions |
@@ -904,7 +904,7 @@ async fn roundtrip_expr_api() -> Result<()> { | |||
vec![lit(1), lit(2), lit(3)], | |||
vec![lit(10), lit(20), lit(30)], | |||
), | |||
row_number(vec![col("a")]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we did the signature check, this test should fail. Because the signature is Any(0)
, which should only accept zero arguments
The reason why it is not is because we didn't do signature check for window udf.
We have similar functions for scalar and udaf.
datafusion/datafusion/expr/src/type_coercion/functions.rs
Lines 42 to 96 in cb1e3f0
pub fn data_types_with_scalar_udf( | |
current_types: &[DataType], | |
func: &ScalarUDF, | |
) -> Result<Vec<DataType>> { | |
let signature = func.signature(); | |
if current_types.is_empty() { | |
if signature.type_signature.supports_zero_argument() { | |
return Ok(vec![]); | |
} else { | |
return plan_err!("{} does not support zero arguments.", func.name()); | |
} | |
} | |
let valid_types = | |
get_valid_types_with_scalar_udf(&signature.type_signature, current_types, func)?; | |
if valid_types | |
.iter() | |
.any(|data_type| data_type == current_types) | |
{ | |
return Ok(current_types.to_vec()); | |
} | |
try_coerce_types(valid_types, current_types, &signature.type_signature) | |
} | |
pub fn data_types_with_aggregate_udf( | |
current_types: &[DataType], | |
func: &AggregateUDF, | |
) -> Result<Vec<DataType>> { | |
let signature = func.signature(); | |
if current_types.is_empty() { | |
if signature.type_signature.supports_zero_argument() { | |
return Ok(vec![]); | |
} else { | |
return plan_err!("{} does not support zero arguments.", func.name()); | |
} | |
} | |
let valid_types = get_valid_types_with_aggregate_udf( | |
&signature.type_signature, | |
current_types, | |
func, | |
)?; | |
if valid_types | |
.iter() | |
.any(|data_type| data_type == current_types) | |
{ | |
return Ok(current_types.to_vec()); | |
} | |
try_coerce_types(valid_types, current_types, &signature.type_signature) | |
} |
data_types_with_scalar_udf
and data_types_with_aggregate_udf
.
After adding a new data_types_with_window_udf
function, we should get the failed msg for incorrect non-empty args for row_number
Maybe add it somewhere here
datafusion/datafusion/expr/src/expr_schema.rs
Lines 169 to 196 in cb1e3f0
Expr::WindowFunction(WindowFunction { fun, args, .. }) => { | |
let data_types = args | |
.iter() | |
.map(|e| e.get_type(schema)) | |
.collect::<Result<Vec<_>>>()?; | |
let nullability = args | |
.iter() | |
.map(|e| e.nullable(schema)) | |
.collect::<Result<Vec<_>>>()?; | |
match fun { | |
WindowFunctionDefinition::AggregateUDF(udf) => { | |
let new_types = data_types_with_aggregate_udf(&data_types, udf) | |
.map_err(|err| { | |
plan_datafusion_err!( | |
"{} {}", | |
err, | |
utils::generate_signature_error_msg( | |
fun.name(), | |
fun.signature().clone(), | |
&data_types | |
) | |
) | |
})?; | |
Ok(fun.return_type(&new_types, &nullability)?) | |
} | |
_ => fun.return_type(&data_types, &nullability), | |
} | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this was done in #12045
Which issue does this PR close?
N/A
Rationale for this change
Fix logical conflict between #12030 and #12000
What changes are included in this PR?
Are these changes tested?
Yes
Are there any user-facing changes?