[BUG] listReduce on a single row input with an empty list generates a zero row output #10556
This appears to be a bug in libcudf rather than the cudf Java/JNI layer.
Apply the following patch and build.
The problem occurs when the lists column is all nulls and the child column is empty (because there is no need for backing values in a list of all nulls).
I just want to verify that the expected result is a single-row column.
Yes, apologies for the bug in the test.
Fixes `cudf::segmented_reduce` where the input `values` column is empty but the `offsets` are not. In this case, the `offsets` vector `{0,0}` specifies an empty segment which should result in a single null row. The logic has been fixed and new gtest cases have been added. Closes #10556

Authors:
- David Wendt (https://github.com/davidwendt)

Approvers:
- Bradley Dice (https://github.com/bdice)
- Jason Lowe (https://github.com/jlowe)
- Nghia Truong (https://github.com/ttnghia)

URL: #10876
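To make the fixed semantics concrete, here is a minimal host-side C++ model of the behavior the fix specifies. It is not libcudf code; `segmented_sum` is a hypothetical stand-in for `cudf::segmented_reduce` with a sum aggregation, using `std::optional` to model null rows.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Model of segmented-reduce semantics: offsets of size n+1 define n segments
// over `values`. An empty segment (offsets[i] == offsets[i+1]) must still
// produce an output row, and that row is null -- it must not be dropped.
// So values = {} with offsets = {0, 0} yields exactly one null row.
std::vector<std::optional<int>> segmented_sum(std::vector<int> const& values,
                                              std::vector<std::size_t> const& offsets) {
  std::vector<std::optional<int>> out;
  for (std::size_t i = 0; i + 1 < offsets.size(); ++i) {
    if (offsets[i] == offsets[i + 1]) {
      out.push_back(std::nullopt);  // empty segment -> null row
    } else {
      int sum = 0;
      for (std::size_t j = offsets[i]; j < offsets[i + 1]; ++j) sum += values[j];
      out.push_back(sum);
    }
  }
  return out;
}
```

Under this model, the buggy behavior described in the issue corresponds to returning zero rows for `offsets = {0, 0}`; the fix makes the output a single null row.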
Describe the bug
While testing `ArrayExists` in spark-rapids we hit an exception when a partition consists of a single row with an empty list. The root cause is somewhere behind the fact that listReduce produces an empty output instead of maintaining the invariant: output row count equals input row count.
Steps/Code to reproduce bug
The repro via JVM in Scala REPL is as follows.
The output is as expected if the input includes another row with a non-empty list
Expected behavior
Reduction of a single row with an empty list should produce a single row output
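The expected row-count invariant can be sketched as follows. This is a plain C++ illustration, not the cudf Java `listReduce` API: `list_sum` is a hypothetical list-level reduction where each input row (a list) yields exactly one output row, with an empty list reducing to null.

```cpp
#include <optional>
#include <vector>

// Model of the expected listReduce invariant: one output row per input row.
// An empty list reduces to a null row rather than shrinking the output.
std::vector<std::optional<int>> list_sum(std::vector<std::vector<int>> const& rows) {
  std::vector<std::optional<int>> out;
  out.reserve(rows.size());
  for (auto const& list : rows) {
    if (list.empty()) {
      out.push_back(std::nullopt);  // empty list -> null, but the row survives
    } else {
      int sum = 0;
      for (int v : list) sum += v;
      out.push_back(sum);
    }
  }
  return out;
}
```

A single-row input holding an empty list thus produces a single null output row; the reported bug produced zero rows instead.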
Environment overview (please complete the following information)
Environment details
TBD
Additional context
NVIDIA/spark-rapids#5108