GH-45269: [C++][Compute] Add pivot function
pitrou committed Feb 17, 2025
1 parent 136ad9a commit 458d73c
Showing 15 changed files with 1,798 additions and 36 deletions.
2 changes: 2 additions & 0 deletions cpp/src/arrow/CMakeLists.txt
@@ -752,10 +752,12 @@ if(ARROW_COMPUTE)
ARROW_COMPUTE_SRCS
compute/kernels/aggregate_basic.cc
compute/kernels/aggregate_mode.cc
compute/kernels/aggregate_pivot.cc
compute/kernels/aggregate_quantile.cc
compute/kernels/aggregate_tdigest.cc
compute/kernels/aggregate_var_std.cc
compute/kernels/hash_aggregate.cc
compute/kernels/pivot_internal.cc
compute/kernels/scalar_arithmetic.cc
compute/kernels/scalar_boolean.cc
compute/kernels/scalar_compare.cc
5 changes: 5 additions & 0 deletions cpp/src/arrow/acero/groupby_aggregate_node.cc
@@ -282,6 +282,11 @@ Status GroupByNode::Merge() {
DCHECK(state0->agg_states[span_i]);
batch_ctx.SetState(state0->agg_states[span_i].get());

// XXX this resizes each KernelState (state0->agg_states[span_i]) multiple times.
// An alternative would be a two-pass algorithm:
// 1. Compute all transpositions (one per local state) and the final number of
// groups.
// 2. Process all agg kernels, resizing each KernelState only once.
RETURN_NOT_OK(
agg_kernels_[span_i]->resize(&batch_ctx, state0->grouper->num_groups()));
RETURN_NOT_OK(agg_kernels_[span_i]->merge(
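
The two-pass alternative described in the XXX comment above can be sketched roughly as follows. This is a toy illustration only, not Arrow's implementation: `LocalState`, `MergeResult` and `TwoPassMerge` are hypothetical names, group keys are plain strings, and each aggregation is a simple sum. Unlike `GroupByNode::Merge()`, which merges the other local states into `state0`, the sketch treats all local states uniformly to keep the code short.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for one thread-local group-by state.
struct LocalState {
  std::vector<std::string> keys;           // local group id -> group key
  std::vector<std::vector<int64_t>> sums;  // sums[agg][local group id]
};

// Hypothetical merged output; resize_calls counts state resizes.
struct MergeResult {
  std::vector<std::string> keys;           // final group id -> group key
  std::vector<std::vector<int64_t>> sums;  // sums[agg][final group id]
  int resize_calls = 0;
};

MergeResult TwoPassMerge(const std::vector<LocalState>& locals,
                         std::size_t num_aggs) {
  MergeResult out;
  std::unordered_map<std::string, uint32_t> key_to_group;

  // Pass 1: compute one transposition per local state (local group id ->
  // final group id). The final number of groups is known afterwards.
  std::vector<std::vector<uint32_t>> transpositions;
  transpositions.reserve(locals.size());
  for (const LocalState& local : locals) {
    std::vector<uint32_t> transposition;
    transposition.reserve(local.keys.size());
    for (const std::string& key : local.keys) {
      auto inserted = key_to_group.emplace(
          key, static_cast<uint32_t>(out.keys.size()));
      if (inserted.second) out.keys.push_back(key);
      transposition.push_back(inserted.first->second);
    }
    transpositions.push_back(std::move(transposition));
  }

  // Pass 2: process every aggregation, sizing its state exactly once.
  out.sums.resize(num_aggs);
  for (std::size_t agg = 0; agg < num_aggs; ++agg) {
    out.sums[agg].assign(out.keys.size(), 0);  // the single resize
    ++out.resize_calls;
    for (std::size_t s = 0; s < locals.size(); ++s) {
      const std::vector<uint32_t>& transposition = transpositions[s];
      for (std::size_t g = 0; g < transposition.size(); ++g) {
        out.sums[agg][transposition[g]] += locals[s].sums[agg][g];
      }
    }
  }
  return out;
}
```

In this sketch the cost of the extra pass is that every transposition is held in memory until pass 2 completes, which is the trade-off against resizing each state once per merged local state.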