Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

alternative mixed field aggregation collection #2135

Merged
merged 2 commits into from
Jul 27, 2023
Merged

alternative mixed field aggregation collection #2135

merged 2 commits into from
Jul 27, 2023

Conversation

PSeitz
Copy link
Contributor

@PSeitz PSeitz commented Jul 25, 2023

instead of having multiple accessor in one AggregationWithAccessor split it into
multiple independent AggregationWithAccessor

Performance seems unchanged

Before
test aggregation::agg_bench::bench::bench_aggregation_terms_many_json_mixed_type_with_sub_agg                            ... bench: 173,939,311 ns/iter (+/- 35,629,524)
test aggregation::agg_bench::bench::bench_aggregation_terms_many_json_mixed_type_with_sub_agg_multi                      ... bench: 227,312,130 ns/iter (+/- 34,251,990)
test aggregation::agg_bench::bench::bench_aggregation_terms_many_json_mixed_type_with_sub_agg_opt                        ... bench: 216,645,922 ns/iter (+/- 72,184,100)
test aggregation::agg_bench::bench::bench_aggregation_terms_many_json_mixed_type_with_sub_agg_sparse                     ... bench:  43,224,161 ns/iter (+/- 11,300,519)

After
test aggregation::agg_bench::bench::bench_aggregation_terms_many_json_mixed_type_with_sub_agg                            ... bench: 171,519,343 ns/iter (+/- 76,513,404)
test aggregation::agg_bench::bench::bench_aggregation_terms_many_json_mixed_type_with_sub_agg_multi                      ... bench: 228,347,835 ns/iter (+/- 12,738,742)
test aggregation::agg_bench::bench::bench_aggregation_terms_many_json_mixed_type_with_sub_agg_opt                        ... bench: 218,119,685 ns/iter (+/- 72,979,994)
test aggregation::agg_bench::bench::bench_aggregation_terms_many_json_mixed_type_with_sub_agg_sparse                     ... bench:  41,439,166 ns/iter (+/- 10,048,723)


instead of having multiple accessor in one AggregationWithAccessor split it into
multiple independent AggregationWithAccessor
@PSeitz PSeitz requested a review from fulmicoton July 25, 2023 05:37
column_block_accessor: Default::default(),
})
})
.collect::<crate::Result<_>>()?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the return type fixes the types here, so that let aggs: ..., Ok(aggs) and the ? are unnecessary, i.e. it could just end in }).collect().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand, we need to collect to Result to forward the error

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Certainly, but the extra binding and type hints seem unnecessary because the type that is collected already matches the return type, does it not?

So in think the function could just end in

        acc_field_types
            .into_iter()
            .map(|(accessor, field_type)| {
                Ok(AggregationWithAccessor {
                    accessor,
                    field_type,
                    sub_aggregation: get_aggs_with_segment_accessor_and_validate(
                        sub_aggregation,
                        reader,
                        &limits,
                    )?,
                    agg: agg.clone(),
                    str_dict_column: str_dict_column.clone(),
                    limits: limits.new_guard(),
                    column_block_accessor: Default::default(),
                })
            })
            .collect()

Copy link
Contributor Author

@PSeitz PSeitz Jul 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we return a Result<Vec> not Vec<Result>

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, which is what you are collecting by giving the type hint ::<crate::Result<_>> but that type hint just matches the return type. I am not suggesting to change any types, just to drop the unnecessary temporary binding and error forwarding because the collection already produces exactly the required type.

(Turning Iterator<Item=Result<S>> into Result<T> where T: FromIterator<Item=S> is the usual FromIterator impl which you invoke either way.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the binding with type assignment increases readability

Copy link
Collaborator

@adamreichold adamreichold Jul 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can still spell out the full return type if desired in the turbo-fish if you want to, e.g.

.collect::<crate::Result<Vec<AggregationWithAccessor>>>()

but the extra binding, error forwarding and ok wrapping are unnecessary.

(There is even a Clippy lint for unnecessary let bindings but IIRC it does not firing when ok wrapping is involved because this is sometimes necessary to change the error type via the From invocation hidding in ?.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't agree, the binding and type serves the purpose of increased readability by giving it a name and a type, which is important if the transformation gets nested

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @PSeitz . It is more readable when we put the type on the variable side.

use AggregationVariants::*;
let (accessor, field_type) = match &agg.agg {
let acc_field_types: Vec<(Column, ColumnType)> = match &agg.agg {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting the type!

@PSeitz PSeitz merged commit c2be660 into main Jul 27, 2023
@PSeitz PSeitz deleted the refactor_agg branch July 27, 2023 10:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants