[FEA] Add support for shallow lists in cudf::merge #13514

Closed
jlowe opened this issue Jun 6, 2023 · 9 comments
Labels
2 - In Progress: Currently a work in progress
feature request: New feature or request
libcudf: Affects libcudf (C++/CUDA) code.
Spark: Functionality that helps Spark RAPIDS


@jlowe
Member

jlowe commented Jun 6, 2023

Is your feature request related to a problem? Please describe.
Spark supports sorting on BinaryType, which is represented as a libcudf LIST column with a non-nullable UINT8 child column. The RAPIDS Accelerator uses cudf::merge to implement out-of-core sort algorithms. cudf::sorted_order supports sorting on the LIST column directly, but cudf::merge does not, failing with the error:

CUDF failure at:/..../cudf/cpp/src/merge/merge.cu:236: Unsupported type for merge.
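For readers unfamiliar with the layout, the LIST-of-UINT8 representation described above can be sketched on the host as a flat byte buffer plus row offsets. This is an illustration only: `ByteListColumn` and `row_less` are hypothetical names, not libcudf types or APIs, and plain lexicographic byte comparison is assumed as the row ordering.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side stand-in for a LIST<UINT8> column: a flat child
// buffer of bytes plus row offsets (offsets.size() == number of rows + 1).
// These names are illustrative and are not libcudf types.
struct ByteListColumn {
  std::vector<std::int32_t> offsets;
  std::vector<std::uint8_t> bytes;

  std::size_t num_rows() const { return offsets.empty() ? 0 : offsets.size() - 1; }
};

// Returns true if row `a` of `lhs` orders before row `b` of `rhs`.
// Assumption: rows compare lexicographically byte by byte, so a row that is
// a strict prefix of another orders first.
bool row_less(ByteListColumn const& lhs, std::size_t a,
              ByteListColumn const& rhs, std::size_t b)
{
  assert(a < lhs.num_rows() && b < rhs.num_rows());
  auto const lb = lhs.bytes.begin();
  auto const rb = rhs.bytes.begin();
  return std::lexicographical_compare(lb + lhs.offsets[a], lb + lhs.offsets[a + 1],
                                      rb + rhs.offsets[b], rb + rhs.offsets[b + 1]);
}
```

A comparator of this shape is what both a sort and a merge would need for such a column; the request here is only that the merge path accept it too.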

Describe the solution you'd like
cudf::merge should support the same ordering types as supported by cudf::sorted_order

Describe alternatives you've considered
Applications would need to concatenate the tables to be merged into one big table and then call cudf::sorted_order, which is less efficient than merging the already-sorted tables directly via cudf::merge.
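The difference between the two approaches can be sketched on the host with the standard library. This is not libcudf code: `concat_then_sort` and `merge_sorted` are hypothetical helpers, and each binary value is modeled as a `std::vector<std::uint8_t>`, whose `operator<` compares lexicographically. The first function re-sorts everything (O(N log N) comparisons), while the second exploits the fact that both inputs are already sorted (O(N) comparisons).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <vector>

using Bytes = std::vector<std::uint8_t>;  // one binary value (one list row)
using Rows  = std::vector<Bytes>;         // one column of binary values

// Workaround shape: concatenate both inputs, then sort the whole result again.
Rows concat_then_sort(Rows const& a, Rows const& b)
{
  Rows out(a);
  out.insert(out.end(), b.begin(), b.end());
  std::sort(out.begin(), out.end());
  return out;
}

// Requested shape: both inputs are already sorted, so one linear merge suffices.
Rows merge_sorted(Rows const& a, Rows const& b)
{
  assert(std::is_sorted(a.begin(), a.end()) && std::is_sorted(b.begin(), b.end()));
  Rows out;
  out.reserve(a.size() + b.size());
  std::merge(a.begin(), a.end(), b.begin(), b.end(), std::back_inserter(out));
  return out;
}
```

Both functions produce the same ordering for sorted inputs; only the amount of work differs.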

@jlowe added the labels feature request (New feature or request), Needs Triage (Need team to review and classify), libcudf (Affects libcudf (C++/CUDA) code.), and Spark (Functionality that helps Spark RAPIDS) on Jun 6, 2023
@bdice
Contributor

bdice commented Jun 6, 2023

This should be covered by #13347. I believe this is targeting 23.08. cc: @divyegala

@jlowe
Member Author

jlowe commented Jun 6, 2023

Thanks, @bdice! I just discovered that cudf::merge is not only unhappy with a shallow list as an ordering column, but is also unhappy when a shallow list is present and is not an ordering column, which is surprising. Will #13347 address that as well, given that it's focused on columns used for comparison?

@bdice
Contributor

bdice commented Jun 6, 2023

I bet the logic for raising an error is incorrect (too aggressive, fails even on non-ordering columns which seems unnecessary). We should definitely add a test for that. @jlowe Does it help you if the non-ordering columns issue is fixed sooner than #13347? Otherwise we should just add a test there.
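The distinction between the two kinds of check can be sketched as follows. This is a hypothetical model, not the actual logic in merge.cu: the `TypeKind` tags and both functions are invented for illustration, and it assumes (as was the case when this was reported) that nested types are not comparable by the merge.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified type tags; these are not libcudf's type ids.
enum class TypeKind { Int32, Float64, String, List, Struct };

// Assumption for this sketch: nested types cannot be used as merge keys.
bool is_supported_key(TypeKind t)
{
  return t != TypeKind::List && t != TypeKind::Struct;
}

// Over-aggressive check: rejects the table if ANY column is nested,
// even one that never takes part in a comparison.
bool all_columns_supported(std::vector<TypeKind> const& schema)
{
  return std::all_of(schema.begin(), schema.end(), is_supported_key);
}

// Narrower check: only the columns used as ordering keys must be comparable.
bool key_columns_supported(std::vector<TypeKind> const& schema,
                           std::vector<std::size_t> const& key_indices)
{
  return std::all_of(key_indices.begin(), key_indices.end(), [&](std::size_t k) {
    assert(k < schema.size());
    return is_supported_key(schema[k]);
  });
}
```

With a schema of `{Int32, List}` and key index `0`, the first check rejects the table while the second accepts it, which is the case a regression test would need to cover.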

@jlowe
Member Author

jlowe commented Jun 6, 2023

> Does it help you if the non-ordering columns issue is fixed sooner than #13347?

Slightly, but it may not be worth it. We can work around the issue by concatenating the tables and calling cudf::sorted_order as mentioned above, which is slower but should hold us over until cudf::merge for lists is properly fixed.

@jlowe
Member Author

jlowe commented Jun 6, 2023

I noticed that the mishandling of non-key nested types in cudf::merge is already tracked by #8050.

@chenya-zhang

chenya-zhang commented Jun 6, 2023

Thank you, @jlowe and @bdice, for your input on this issue.

+1 on "the logic for raising an error is incorrect (too aggressive, fails even on non-ordering columns which seems unnecessary)".

If possible, we hope users do not experience unexpected job failures when moving Spark jobs from CPU to GPU. It may be helpful to give them a clear message so they can check for unsupported operations or be aware of any fallback. We could not find any more information for this one beyond the exception (and stack trace):

Caused by: ai.rapids.cudf.CudfException: CUDF failure at:/home/jenkins/agent/workspace/jenkins-spark-rapids-jni_nightly-dev-426-cuda11/thirdparty/cudf/cpp/src/merge/merge.cu:236: Unsupported type for merge.

FYI: There are messages for other unsupported operations, which fall back as expected. We can share more about them.

@GregoryKimball
Contributor

@bdice @divyegala are #13514 and #8050 closed by #14250?

@divyegala
Member

@GregoryKimball thanks. Yes, they should be.

@vyasr
Contributor

vyasr commented Nov 21, 2023

Closing as resolved by #14250.

@vyasr vyasr closed this as completed Nov 21, 2023

6 participants