-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Minor: Add multi ordering test for array agg order #8439
Minor: Add multi ordering test for array agg order #8439
Conversation
7,30,6 | ||
8,30,7 | ||
9,30,8 | ||
10,10,9 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason why we need csv test is that only csv file trigger merge_batch
, the normal table go to update_batch
only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Normal table
logical_plan
Aggregate: groupBy=[[]], aggr=[[ARRAY_AGG(arrays.column1) ORDER BY [arrays.column1 ASC NULLS LAST]]]
--TableScan: arrays projection=[column1]
physical_plan
AggregateExec: mode=Single, gby=[], aggr=[ARRAY_AGG(arrays.column1)]
--SortExec: expr=[column1@0 ASC NULLS LAST]
----MemoryExec: partitions=1, partition_sizes=[1]
AggregateExec: mode is Single, so run update_batch
.
partition_ordering_values.push(v); | ||
// Get the inner struct array | ||
let ordering_array = agg_orderings.value(0); | ||
let orderings = ScalarValue::convert_array_to_scalar_vec(&ordering_array)? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we always have ListArray(StructArray) for agg_orderings
, so we can safely get orderings
with convert_array_to_scalar_vec
only.
Aggregate: groupBy=[[]], aggr=[[ARRAY_AGG(agg_order.c1) ORDER BY [agg_order.c2 DESC NULLS FIRST, agg_order.c3 ASC NULLS LAST]]] | ||
--TableScan: agg_order projection=[c1, c2, c3] | ||
physical_plan | ||
AggregateExec: mode=Final, gby=[], aggr=[ARRAY_AGG(agg_order.c1)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AggregateExec mode is final, so run merge_batch
@@ -329,6 +319,8 @@ impl OrderSensitiveArrayAggAccumulator { | |||
.collect(); | |||
let struct_type = DataType::Struct(Fields::from(fields)); | |||
|
|||
// TODO: Since ScalarValue::Struct is not yet includes StructArray, | |||
// so we need to convert it to ListArray for ScalarValue::List | |||
let arr = ScalarValue::new_list(&orderings, &struct_type); | |||
Ok(ScalarValue::List(arr)) |
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
Signed-off-by: jayzhan211 <[email protected]>
22f3288
to
984551e
Compare
for v in other_ordering_values.into_iter() { | ||
partition_ordering_values.push(v); | ||
|
||
for partition_ordering_rows in orderings.into_iter() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think convert_array_agg_to_orderings
is doing quite specific thing, so remove the function and do things here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code looks reasonable to me. Thank you @jayzhan211
I wonder if @mustafasrepo could give this a look as I think he contributed the original implementation of this aggregate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is LGTM!. Thanks @jayzhan211 for this PR.
Thanks again |
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Which issue does this PR close?
Rationale for this change
I found the reason why we need level 3 vec in
convert_array_agg_to_orderings
https://github.com/apache/arrow-datafusion/pull/7629/files#r1357637090. For multi ordering cases, we have len > 1.Add multi ordering test to improve the test coverage, also simply the code.
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?