Is your feature request related to a problem or challenge? Please describe what you are trying to do.
The function min_max_string does some unnecessary checks when null_count > 0. For example, we don't need to check has_value on every iteration, because it is always true once the first valid value has been seen.
    if null_count == 0 {
        n = array.value(0);
        for i in 1..data.len() {
            let item = array.value(i);
            if cmp(n, item) {
                n = item;
            }
        }
    } else {
        n = "";
        let mut has_value = false;
        for i in 0..data.len() {
            let item = array.value(i);
            if data.is_valid(i) && (!has_value || cmp(n, item)) {
                has_value = true;
                n = item;
            }
        }
    }
Apart from that, I want this function to be cleaned up because the "for loops" here are not pretty.
Describe the solution you'd like
Performance should be improved when null_count > 0
No performance penalty is introduced in other cases
Clean up the code, perhaps using a more functional style (e.g. iterators)
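One possible shape for this cleanup, as a minimal sketch rather than the actual arrow-rs code: the array is modelled here as a plain slice of Option<&str> (an assumption made to keep the example self-contained), and min_max_str is a hypothetical name. As in the snippet above, cmp(n, item) returning true means item should replace n. Skipping nulls up front removes the has_value flag entirely.

```rust
/// Hypothetical stand-in for `min_max_string`.
/// `data` models a nullable string array as `Option<&str>` values;
/// `cmp(n, item)` returns true when `item` should replace the current `n`.
fn min_max_str<'a, F>(data: &[Option<&'a str>], cmp: F) -> Option<&'a str>
where
    F: Fn(&str, &str) -> bool,
{
    data.iter()
        .flatten() // skip nulls (None), so no validity flag is needed in the loop
        .copied()
        // The first valid value seeds the accumulator; later ones are compared.
        .reduce(|n, item| if cmp(n, item) { item } else { n })
}
```

With this sketch, a minimum would be computed as min_max_str(&data, |a, b| a > b) and a maximum as min_max_str(&data, |a, b| a < b); an input with no valid values yields None. Whether this matches the hand-written loops in speed would need to be benchmarked.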
Describe alternatives you've considered
We could also replace array.value(i) with array.value_unchecked(i), but that would introduce some unsafe code, so I am not sure it is worth it.
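To illustrate the trade-off, here is a rough sketch using a standard-library slice as a stand-in for the array (an assumption; it does not call the arrow API). The functions max_checked and max_unchecked are hypothetical names. Indexing with values[i] plays the role of the checked accessor, and the standard get_unchecked plays the role of the unchecked one.

```rust
/// Safe version: `values[i]` is bounds-checked on every access.
fn max_checked<'a>(values: &[&'a str]) -> Option<&'a str> {
    let mut n = *values.first()?; // returns None for an empty slice
    for i in 1..values.len() {
        let item = values[i];
        if n < item {
            n = item;
        }
    }
    Some(n)
}

/// Unchecked version: skips the bounds check but requires an `unsafe` block.
fn max_unchecked<'a>(values: &[&'a str]) -> Option<&'a str> {
    let mut n = *values.first()?;
    for i in 1..values.len() {
        // SAFETY: `i` is always less than `values.len()` because of the loop bound.
        let item = unsafe { *values.get_unchecked(i) };
        if n < item {
            n = item;
        }
    }
    Some(n)
}
```

Both functions return the same result; the second one moves the responsibility for staying in bounds from the compiler to the author, which is the concern raised above. Any speed difference would have to be measured, since the compiler can often remove the bounds check in a loop like this.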
alamb changed the title from "Clean up and improve the performance of min_max_string" to "Improve performance of min and max aggregation kernels without nulls" on Mar 3, 2022
Code referenced in the issue: https://github.com/apache/arrow-rs/blob/master/arrow/src/compute/kernels/aggregate.rs#L55-L64