Optimize DECIMAL128 sum aggregations [databricks] (#4688)
* Optimize DECIMAL128 sum aggregations

Signed-off-by: Jason Lowe <[email protected]>

* Fix regression in window sum

* Update for review comments

* Explicitly upcast input to avoid libcudf sort-based aggregation issue

* Lower batch limit in agg tests to better exercise sort-based aggregations

* Remove redundant method override
jlowe authored Feb 8, 2022
1 parent aa2126d commit f3a5cd3
Showing 2 changed files with 382 additions and 294 deletions.
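The commit messages above mention DECIMAL128 sum overflow and libcudf's sort-based aggregation path. As background, a back-of-envelope sketch in plain Python of why a DECIMAL128 sum needs widening or overflow-checked chunking (the constant names are ours, not the plugin's; the bounds are the standard DECIMAL128/int128 limits):

```python
# DECIMAL128 stores up to 38 decimal digits of unscaled value inside a
# signed 128-bit integer. Summing even two near-maximum values exceeds
# both the 38-digit range and the int128 range, so a GPU sum must widen
# its input or split the work into overflow-checked chunks.
DECIMAL128_MAX_UNSCALED = 10**38 - 1   # largest 38-digit unscaled value
INT128_MAX = 2**127 - 1                # about 1.7e38

assert DECIMAL128_MAX_UNSCALED < INT128_MAX      # a single value fits
assert 2 * DECIMAL128_MAX_UNSCALED > INT128_MAX  # a 2-value sum can overflow int128
```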
4 changes: 2 additions & 2 deletions integration_tests/src/main/python/hash_aggregate_test.py
@@ -34,7 +34,7 @@
 }
 
 _no_nans_float_smallbatch_conf = copy_and_update(_no_nans_float_conf,
-    {'spark.rapids.sql.batchSizeBytes' : '1000'})
+    {'spark.rapids.sql.batchSizeBytes' : '250'})
 
 _no_nans_float_conf_partial = copy_and_update(_no_nans_float_conf,
     {'spark.rapids.sql.hashAgg.replaceMode': 'partial'})
@@ -339,7 +339,7 @@ def test_hash_reduction_sum_count_action(data_gen):
 # Make sure that we can do computation in the group by columns
 @ignore_order
 def test_computation_in_grpby_columns():
-    conf = {'spark.rapids.sql.batchSizeBytes' : '1000'}
+    conf = {'spark.rapids.sql.batchSizeBytes' : '250'}
     data_gen = [
         ('a', RepeatSeqGen(StringGen('a{1,20}'), length=50)),
         ('b', short_gen)]
