## Range Collection

The current state of per-document collection causes considerable overhead. We could optimize aggregations by collecting ranges of documents, for use cases where the aggregation is done over all documents.

## Enabled Optimizations

- If we know we aggregate over all values, we can preallocate or reserve the correct capacity in the top-level bucket aggregations.
- We can bypass the multi-value/optional index and scan the fast field values directly.
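A minimal sketch of both optimizations combined, assuming a dense fast field column stored as a plain slice (the `aggregate_range` name and the column layout are illustrative assumptions, not tantivy's actual API): knowing the doc range up front lets us reserve capacity once and scan the column slice directly.

```rust
use std::collections::HashMap;
use std::ops::Range;

type DocId = u32;

/// Toy term-bucket aggregation over a dense (one value per doc) fast field.
/// `fast_field` stands in for the column; `docs` is a consecutive doc range.
fn aggregate_range(fast_field: &[u64], docs: Range<DocId>) -> HashMap<u64, u64> {
    let mut buckets = HashMap::new();
    // Since we know how many docs we will visit, we can reserve up front
    // instead of growing the bucket map incrementally per document.
    buckets.reserve(docs.len());
    // Scan the column slice directly: no per-doc multi-value/optional
    // index lookup in between.
    for value in &fast_field[docs.start as usize..docs.end as usize] {
        *buckets.entry(*value).or_insert(0u64) += 1;
    }
    buckets
}

fn main() {
    let counts = aggregate_range(&[1, 1, 2, 3, 3, 3], 0..6);
    assert_eq!(counts.get(&3), Some(&3));
    println!("{counts:?}");
}
```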
## Downside

`SegmentCollector` usually includes a score; this is a rather special use case. `collect_range` could be made completely optional, like this:
```rust
pub trait SegmentCollector: 'static {
    ...
    /// Only allowed to call `collect_range` if `can_collect_range` returns true.
    fn collect_range(&mut self, doc: RangeInclusive<DocId>) {}
    fn can_collect_range(&self) -> bool {
        false
    }
}
```
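To illustrate the default-method pattern, here is a hedged sketch of a collector opting in to range collection. The `CountCollector` type and the driver branch are made up for this example; the trait mirrors the snippet above, with a per-doc `collect` added for contrast:

```rust
use std::ops::RangeInclusive;

type DocId = u32;

trait SegmentCollector: 'static {
    fn collect(&mut self, doc: DocId);
    /// Only allowed to call `collect_range` if `can_collect_range` returns true.
    fn collect_range(&mut self, _docs: RangeInclusive<DocId>) {}
    fn can_collect_range(&self) -> bool {
        false
    }
}

/// Hypothetical collector that counts docs and opts in to range collection.
struct CountCollector {
    count: u64,
}

impl SegmentCollector for CountCollector {
    fn collect(&mut self, _doc: DocId) {
        self.count += 1;
    }
    fn collect_range(&mut self, docs: RangeInclusive<DocId>) {
        // A whole consecutive block is handled in O(1) rather than per doc.
        self.count += u64::from(docs.end() - docs.start()) + 1;
    }
    fn can_collect_range(&self) -> bool {
        true
    }
}

fn main() {
    let mut c = CountCollector { count: 0 };
    // The caller checks the capability before taking the range path.
    if c.can_collect_range() {
        c.collect_range(0..=99);
    } else {
        (0..=99).for_each(|d| c.collect(d));
    }
    assert_eq!(c.count, 100);
}
```

Collectors that don't implement the method keep the per-doc path, so existing implementations stay untouched.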
## Alternative

The aggregation caching layer (which caches blocks of doc ids) could recognize consecutive doc ids and pass that along as metadata, and maybe increase caching to bigger blocks. In that case we can't preallocate efficiently; that could perhaps be done via hints instead.
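A small helper sketch for the detection step (name and signature are assumptions, not existing code): the caching layer could check whether a cached block of doc ids is consecutive and, if so, hand the aggregation a range instead of individual ids.

```rust
use std::ops::RangeInclusive;

type DocId = u32;

/// Returns the equivalent inclusive range if `block` holds strictly
/// consecutive ascending doc ids, otherwise `None`.
fn as_consecutive_range(block: &[DocId]) -> Option<RangeInclusive<DocId>> {
    let (first, last) = (*block.first()?, *block.last()?);
    if block.windows(2).all(|w| w[1] == w[0] + 1) {
        Some(first..=last)
    } else {
        None
    }
}

fn main() {
    // A consecutive block can take the range path...
    assert_eq!(as_consecutive_range(&[3, 4, 5, 6]), Some(3..=6));
    // ...while a gapped block falls back to per-doc collection.
    assert_eq!(as_consecutive_range(&[3, 5, 6]), None);
}
```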