Commit

Enable windowed collect_list by default (NVIDIA#2006)
Signed-off-by: Firestarman <[email protected]>
firestarman authored Mar 25, 2021
1 parent f4b5ec7 commit 8442066
Showing 4 changed files with 4 additions and 6 deletions.
2 changes: 1 addition & 1 deletion docs/configs.md
@@ -258,7 +258,7 @@ Name | SQL Function(s) | Description | Default Value | Notes
<a name="sql.expression.Year"></a>spark.rapids.sql.expression.Year|`year`|Returns the year from a date or timestamp|true|None|
<a name="sql.expression.AggregateExpression"></a>spark.rapids.sql.expression.AggregateExpression| |Aggregate expression|true|None|
<a name="sql.expression.Average"></a>spark.rapids.sql.expression.Average|`avg`, `mean`|Average aggregate operator|true|None|
-<a name="sql.expression.CollectList"></a>spark.rapids.sql.expression.CollectList|`collect_list`|Collect a list of elements, now only supported by windowing.|false|This is disabled by default because for now the GPU collects null values to a list, but Spark does not. This will be fixed in future releases.|
+<a name="sql.expression.CollectList"></a>spark.rapids.sql.expression.CollectList|`collect_list`|Collect a list of elements, now only supported by windowing.|true|None|
<a name="sql.expression.Count"></a>spark.rapids.sql.expression.Count|`count`|Count aggregate operator|true|None|
<a name="sql.expression.First"></a>spark.rapids.sql.expression.First|`first_value`, `first`|first aggregate operator|true|None|
<a name="sql.expression.Last"></a>spark.rapids.sql.expression.Last|`last`, `last_value`|last aggregate operator|true|None|
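
With the default flipped to `true`, a job that depended on the old default would presumably have to set the key to `false` itself, the reverse of what the integration test used to do. A minimal sketch, assuming the key is passed like any other Spark conf; the helper `to_submit_args` is hypothetical and not part of the plugin:

```python
# Hypothetical helper: turn a dict of Spark confs into spark-submit arguments.
def to_submit_args(conf):
    args = []
    for key in sorted(conf):
        args.extend(["--conf", f"{key}={conf[key]}"])
    return args


# The key is the one listed in docs/configs.md; 'false' mirrors the old default.
collect_list_off = {"spark.rapids.sql.expression.CollectList": "false"}
print(to_submit_args(collect_list_off))
```

The same dict could be passed as the conf argument of a test, in the way the removed `{'spark.rapids.sql.expression.CollectList': 'true'}` argument was.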
2 changes: 1 addition & 1 deletion docs/supported_ops.md
@@ -16435,7 +16435,7 @@ Accelerator support is described below.
<td rowSpan="6">CollectList</td>
<td rowSpan="6">`collect_list`</td>
<td rowSpan="6">Collect a list of elements, now only supported by windowing.</td>
-<td rowSpan="6">This is disabled by default because for now the GPU collects null values to a list, but Spark does not. This will be fixed in future releases.</td>
+<td rowSpan="6">None</td>
<td rowSpan="2">aggregation</td>
<td>input</td>
<td><b>NS</b></td>
3 changes: 1 addition & 2 deletions integration_tests/src/main/python/window_function_test.py
@@ -261,5 +261,4 @@ def test_window_aggs_for_rows_collect_list():
collect_list(c_struct) over
(partition by a order by b,c_int rows between CURRENT ROW and UNBOUNDED FOLLOWING) as collect_struct
from window_collect_table
-''',
-{'spark.rapids.sql.expression.CollectList': 'true'})
+''')
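
The note removed by this commit described the one behavioural difference: the GPU collected null values into the list, while Spark does not. The sketch below is a plain-Python model of that difference for a `rows between CURRENT ROW and UNBOUNDED FOLLOWING` frame. It is an illustration only, not the plugin's or Spark's implementation; the function name `windowed_collect_list` and the `skip_nulls` flag are hypothetical, and it uses a single ordering key where the test orders by `b, c_int`.

```python
from itertools import groupby


def windowed_collect_list(rows, skip_nulls=True):
    """Model collect_list over a CURRENT ROW .. UNBOUNDED FOLLOWING frame.

    rows: iterable of (partition_key, order_key, value) tuples.
    Returns one (partition_key, order_key, collected_list) per input row.
    """
    # Sort by partition, then by the ordering key inside each partition.
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    out = []
    for _, group in groupby(ordered, key=lambda r: r[0]):
        part = list(group)
        for i, (a, b, _) in enumerate(part):
            # The frame is the current row plus every later row in the partition.
            frame = [v for (_, _, v) in part[i:]]
            if skip_nulls:
                # Spark-style behaviour: nulls are left out of the list.
                frame = [v for v in frame if v is not None]
            out.append((a, b, frame))
    return out


rows = [(1, 1, "x"), (1, 2, None), (1, 3, "y"), (2, 1, None)]
spark_like = windowed_collect_list(rows)
null_keeping = windowed_collect_list(rows, skip_nulls=False)
```

Here `spark_like` gives `["x", "y"]` for the first row of partition 1, whereas `null_keeping` gives `["x", None, "y"]`, which is the kind of mismatch the old "disabled by default" note referred to.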
@@ -2416,8 +2416,7 @@ object GpuOverrides {
(c, conf, p, r) => new ExprMeta[CollectList](c, conf, p, r) {
override def convertToGpu(): GpuExpression = GpuCollectList(
childExprs.head.convertToGpu(), c.mutableAggBufferOffset, c.inputAggBufferOffset)
-}).disabledByDefault("for now the GPU collects null values to a list, but Spark does not." +
-" This will be fixed in future releases."),
+}),
expr[ScalarSubquery](
"Subquery that will return only one row and one column",
ExprChecks.projectOnly(