[FEA] Support for non key nested types in cudf::merge #8050

revans2 · 2021-04-23T21:00:55Z

Is your feature request related to a problem? Please describe.
When sorting data in Spark, we use merge if the data is too large to fit in a single batch. We need to be able to sort data that contains nested types, even when the nested columns are not the keys we are sorting on.

Describe the solution you'd like
I would like merge to support the same types for non-key columns that gather supports, so that we can sort any data we need to.
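
To illustrate why gather's type support is the relevant bar, here is a minimal plain-Python sketch. It is not libcudf code: `merge_gather_map` and `gather` are hypothetical helpers and the data is made up. The point is that only the key columns are compared; the other columns are carried along by row index, whatever their type.

```python
def merge_gather_map(left_keys, right_keys):
    """Return (side, row) pairs giving the merged order of two sorted key lists."""
    order = []
    i = j = 0
    while i < len(left_keys) and j < len(right_keys):
        # Take from the left on ties so the merge is stable.
        if left_keys[i] <= right_keys[j]:
            order.append((0, i))
            i += 1
        else:
            order.append((1, j))
            j += 1
    # One side is exhausted; append whatever remains on the other.
    order.extend((0, k) for k in range(i, len(left_keys)))
    order.extend((1, k) for k in range(j, len(right_keys)))
    return order


def gather(columns, order):
    """Pick columns[side][row] for each pair; values are never compared."""
    return [columns[side][row] for side, row in order]


# Two tables already sorted on "key"; "payload" is a nested (list) column.
left = {"key": [1, 4, 9], "payload": [[1, 2], [], [3]]}
right = {"key": [2, 4, 7], "payload": [[5], [6, 7], [8]]}

order = merge_gather_map(left["key"], right["key"])
merged_keys = gather([left["key"], right["key"]], order)
merged_payload = gather([left["payload"], right["payload"]], order)
```

Under this view, any column type that can be gathered by index could in principle be merged as a non-key column.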

Describe alternatives you've considered
As a workaround, we will concatenate the tables and sort the result, but this is much slower and not ideal.
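
A rough plain-Python illustration of the two approaches, assuming rows are `(key, payload)` tuples and using the standard library's `heapq.merge` as a stand-in for a real merge. This is not cudf code, and the batches are made up. Both approaches produce the same ordering; the difference is that concatenate-then-sort does not take advantage of the inputs already being sorted.

```python
import heapq
from operator import itemgetter

# Two batches, each already sorted on the key (first tuple element).
# The payload is a nested value that is never used for comparison.
batch_a = [(1, {"tags": ["x"]}), (5, {"tags": []}), (8, {"tags": ["y", "z"]})]
batch_b = [(2, {"tags": ["q"]}), (5, {"tags": ["r"]}), (9, {"tags": []})]

by_key = itemgetter(0)

# Workaround: concatenate the batches, then sort everything from scratch.
concat_then_sort = sorted(batch_a + batch_b, key=by_key)

# Requested behavior: merge inputs that are already sorted on the key.
merged = list(heapq.merge(batch_a, batch_b, key=by_key))

# Both are stable, so rows with equal keys keep batch_a before batch_b.
assert merged == concat_then_sort
```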


Partially addresses #8050

Adds support for merging of struct columns. The struct columns cannot be used as keys in the merge.

Authors:
- https://github.com/nvdbaranec

Approvers:
- Ram (Ramakrishna Prabhu) (https://github.com/rgsl888prabhu)
- Christopher Harris (https://github.com/cwharris)
- Conor Hoekstra (https://github.com/codereport)

URL: #8422

firestarman · 2021-12-16T08:05:41Z

Reopening this since array and map type support is still missing.

vyasr · 2023-11-21T21:04:32Z

Closing as resolved by #14250.

revans2 added labels feature request (New feature or request), Needs Triage (Need team to review and classify), and Spark (Functionality that helps Spark RAPIDS) Apr 23, 2021

revans2 mentioned this issue Apr 23, 2021

[TASK] Use merge sort for nested types when cudf supports it NVIDIA/spark-rapids#2252

Open

kkraus14 added label libcudf (Affects libcudf (C++/CUDA) code.) and removed label Needs Triage (Need team to review and classify) Apr 27, 2021

nvdbaranec self-assigned this May 20, 2021

nvdbaranec mentioned this issue Jun 1, 2021

STRUCT column support for cudf::merge. #8422

Merged

nvdbaranec closed this as completed Jul 20, 2021

firestarman reopened this Dec 16, 2021

jlowe mentioned this issue Jun 6, 2023

[FEA] Add support for shallow lists in cudf::merge #13514

Closed

ttnghia mentioned this issue Jun 12, 2023

[FEA] Fully support nested types in Spark SQL functions NVIDIA/spark-rapids#8550

Open

GregoryKimball mentioned this issue Jun 26, 2023

[FEA] Implement full support for nested types #11844

Closed

vyasr closed this as completed Nov 21, 2023
