[2023-01-06T11:40:23.945Z] FAILED ../../src/main/python/array_test.py::test_arrays_zip[Array(Byte)] - py4j.protocol.Py4JJavaError: An error occurred while calling o304419.collec...
[2023-01-06T11:40:23.945Z] FAILED ../../src/main/python/array_test.py::test_arrays_zip[Array(Short)] - py4j.protocol.Py4JJavaError: An error occurred while calling o304598.collec...
[2023-01-06T11:40:23.945Z] FAILED ../../src/main/python/array_test.py::test_arrays_zip[Array(Integer)] - py4j.protocol.Py4JJavaError: An error occurred while calling o304777.collec...
[2023-01-06T11:40:23.945Z] FAILED ../../src/main/python/array_test.py::test_arrays_zip[Array(Long)] - py4j.protocol.Py4JJavaError: An error occurred while calling o304956.collec...
[2023-01-06T11:40:23.945Z] FAILED ../../src/main/python/array_test.py::test_arrays_zip[Array(Float)] - py4j.protocol.Py4JJavaError: An error occurred while calling o305135.collec...
[2023-01-06T11:40:23.945Z] FAILED ../../src/main/python/array_test.py::test_arrays_zip[Array(Double)] - py4j.protocol.Py4JJavaError: An error occurred while calling o305314.collec...
[2023-01-06T11:40:23.945Z] FAILED ../../src/main/python/array_test.py::test_arrays_zip[Array(String)] - py4j.protocol.Py4JJavaError: An error occurred while calling o305493.collec...
[2023-01-06T11:40:23.946Z] FAILED ../../src/main/python/array_test.py::test_arrays_zip[Array(Boolean)] - py4j.protocol.Py4JJavaError: An error occurred while calling o305672.collec...
[2023-01-06T11:40:23.946Z] FAILED ../../src/main/python/array_test.py::test_arrays_zip[Array(Date)] - py4j.protocol.Py4JJavaError: An error occurred while calling o305851.collec...
[2023-01-06T11:40:23.946Z] FAILED ../../src/main/python/array_test.py::test_arrays_zip[Array(Timestamp)] - py4j.protocol.Py4JJavaError: An error occurred while calling o306038.collec...
[2023-01-06T11:40:23.946Z] FAILED ../../src/main/python/array_test.py::test_arrays_zip[Array(Null)] - py4j.protocol.Py4JJavaError: An error occurred while calling o306217.collec...
[2023-01-06T11:40:23.946Z] FAILED ../../src/main/python/array_test.py::test_arrays_zip[Array(Decimal(7,3))] - py4j.protocol.Py4JJavaError: An error occurred while calling o306396.collec...
[2023-01-06T11:40:23.946Z] FAILED ../../src/main/python/array_test.py::test_arrays_zip[Array(Decimal(12,2))] - py4j.protocol.Py4JJavaError: An error occurred while calling o306575.collec...
[2023-01-06T11:40:23.946Z] FAILED ../../src/main/python/array_test.py::test_arrays_zip[Array(Decimal(20,2))] - py4j.protocol.Py4JJavaError: An error occurred while calling o306754.collec...
[2023-01-06T11:40:23.946Z] FAILED ../../src/main/python/array_test.py::test_arrays_zip[Array(Array(Short))] - py4j.protocol.Py4JJavaError: An error occurred while calling o306933.collec...
[2023-01-06T11:40:23.946Z] FAILED ../../src/main/python/array_test.py::test_arrays_zip[Array(Array(String))] - py4j.protocol.Py4JJavaError: An error occurred while calling o307112.collec...
[2023-01-06T11:40:23.946Z] FAILED ../../src/main/python/array_test.py::test_arrays_zip[Array(Struct(['child0', Byte],['child1', String],['child2', Float]))] - py4j.protocol.Py4JJavaError: An error occurred while calling o307291.collec...
[2023-01-06T11:40:23.946Z] FAILED ../../src/main/python/array_test.py::test_arrays_zip[Array(Map(String(not_null),String))] - py4j.protocol.Py4JJavaError: An error occurred while calling o307470.collec...
[2023-01-06T11:40:23.946Z] FAILED ../../src/main/python/array_test.py::test_arrays_zip[Array(Binary)] - py4j.protocol.Py4JJavaError: An error occurred while calling o307649.collec...
[2023-01-06T11:40:23.946Z] FAILED ../../src/main/python/array_test.py::test_arrays_zip_corner_cases - py4j.protocol.Py4JJavaError: An error occurred while calling o307828.collec...
[2023-01-06T11:40:23.946Z] = 20 failed, 15074 passed, 830 skipped, 572 xfailed, 235 xpassed, 911 warnings in 6812.06s (1:53:32) =
Errors:
[2023-01-06T11:40:23.930Z] E at java.lang.Thread.run(Thread.java:750)
[2023-01-06T11:40:23.930Z] E Caused by: java.lang.IllegalArgumentException: All columns must have the same number of rows
[2023-01-06T11:40:23.930Z] E at ai.rapids.cudf.ColumnView.makeStructView(ColumnView.java:3440)
[2023-01-06T11:40:23.930Z] E at ai.rapids.cudf.ColumnView.makeStructView(ColumnView.java:3458)
[2023-01-06T11:40:23.930Z] E at ai.rapids.cudf.ColumnVector.makeStruct(ColumnVector.java:395)
[2023-01-06T11:40:23.930Z] E at org.apache.spark.sql.rapids.GpuArraysZip.$anonfun$zipArrays$2(collectionOperations.scala:825)
[2023-01-06T11:40:23.931Z] E at com.nvidia.spark.rapids.Arm.withResource(Arm.scala:46)
[2023-01-06T11:40:23.931Z] E at com.nvidia.spark.rapids.Arm.withResource$(Arm.scala:44)
[2023-01-06T11:40:23.931Z] E at org.apache.spark.sql.rapids.GpuArraysZip.withResource(collectionOperations.scala:701)
[2023-01-06T11:40:23.931Z] E at org.apache.spark.sql.rapids.GpuArraysZip.$anonfun$zipArrays$1(collectionOperations.scala:824)
[2023-01-06T11:40:23.931Z] E at com.nvidia.spark.rapids.Arm.withResource(Arm.scala:28)
[2023-01-06T11:40:23.931Z] E at com.nvidia.spark.rapids.Arm.withResource$(Arm.scala:26)
[2023-01-06T11:40:23.931Z] E at org.apache.spark.sql.rapids.GpuArraysZip.withResource(collectionOperations.scala:701)
[2023-01-06T11:40:23.931Z] E at org.apache.spark.sql.rapids.GpuArraysZip.zipArrays(collectionOperations.scala:823)
[2023-01-06T11:40:23.931Z] E at org.apache.spark.sql.rapids.GpuArraysZip.$anonfun$columnarEval$5(collectionOperations.scala:782)
[2023-01-06T11:40:23.931Z] E at com.nvidia.spark.rapids.Arm.withResource(Arm.scala:46)
[2023-01-06T11:40:23.931Z] E at com.nvidia.spark.rapids.Arm.withResource$(Arm.scala:44)
[2023-01-06T11:40:23.931Z] E at org.apache.spark.sql.rapids.GpuArraysZip.withResource(collectionOperations.scala:701)
[2023-01-06T11:40:23.931Z] E at org.apache.spark.sql.rapids.GpuArraysZip.columnarEval(collectionOperations.scala:732)
[2023-01-06T11:40:23.931Z] E at com.nvidia.spark.rapids.RapidsPluginImplicits$ReallyAGpuExpression.columnarEval(implicits.scala:34)
[2023-01-06T11:40:23.931Z] E at com.nvidia.spark.rapids.GpuAlias.columnarEval(namedExpressions.scala:109)
[2023-01-06T11:40:23.931Z] E at com.nvidia.spark.rapids.RapidsPluginImplicits$ReallyAGpuExpression.columnarEval(implicits.scala:34)
I was able to confirm that CUDF commit bd8df4efb9b7b973b3cdf13ab8aaf0e9ce1958be, "Purge non-empty nulls in cudf::make_lists_column" (rapidsai/cudf#12370), broke segmented gather for top-level null values. I will file an issue against CUDF and see whether there is a good workaround for this.
Actually, this is not a bug in CUDF. It looks like we were relying on broken CUDF behavior in these cases, and rapidsai/cudf#12370 fixed it. Prior to the change, if we did a segmented gather with a NULL input LIST and a gather map of {0}, the output would be a single NULL LIST row, but the child column would have one entry in it that was also marked as NULL. That is no longer true after this change.
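To make the difference concrete, here is a toy model in plain Python. It is not the cuDF API: `toy_segmented_gather` and its `purge_non_empty_nulls` flag are hypothetical names, and the "after" branch assumes that a NULL list row now owns no child entries at all, which is how I read the PR title.

```python
from typing import List, Optional, Tuple

ListRow = Optional[List[Optional[int]]]


def toy_segmented_gather(
    lists: List[ListRow],
    gather_maps: List[List[int]],
    purge_non_empty_nulls: bool,
) -> Tuple[List[int], List[bool], List[Optional[int]]]:
    """Toy model of a LIST column as (offsets, validity, child).

    Hypothetical helper for illustration only; it does not call cuDF.
    """
    offsets = [0]
    validity: List[bool] = []
    child: List[Optional[int]] = []
    for row, gmap in zip(lists, gather_maps):
        if row is None:
            validity.append(False)
            if not purge_non_empty_nulls:
                # Old behavior described above: the NULL row still owns
                # one NULL child entry per gather-map index.
                child.extend([None] * len(gmap))
            # Assumed new behavior: the NULL row owns no child entries.
        else:
            validity.append(True)
            for i in gmap:
                # Out-of-bounds indices produce NULL child entries.
                child.append(row[i] if 0 <= i < len(row) else None)
        offsets.append(len(child))
    return offsets, validity, child


before = toy_segmented_gather([None], [[0]], purge_non_empty_nulls=False)
after = toy_segmented_gather([None], [[0]], purge_non_empty_nulls=True)
print(before)  # child has one NULL entry
print(after)   # child is empty
```

In both cases the output is a single NULL LIST row; only the size of the child column differs.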
I think the only way to fix this properly is to replace the null values with an empty list on input. I could do it the cheap way and drop the validity and null count when making my own fake column view, but I want to do it the correct way for now; we can look into speeding that up if it shows up as a performance problem.
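A rough sketch of why replacing nulls with empty lists helps, again as a plain-Python toy model rather than the actual plugin or cuDF code. All names here (`replace_nulls_with_empty`, `padded_child`, `zip_children`) are hypothetical, and the model assumes a NULL list row contributes no child entries after the CUDF change.

```python
from typing import List, Optional, Tuple

ListRow = Optional[List[Optional[int]]]


def replace_nulls_with_empty(lists: List[ListRow]) -> List[List[Optional[int]]]:
    # The proposed fix: a NULL list row becomes an empty list on input.
    return [[] if row is None else row for row in lists]


def padded_child(lists: List[ListRow], lengths: List[int]) -> List[Optional[int]]:
    """Pad every row to the requested length and return the flat child data."""
    child: List[Optional[int]] = []
    for row, n in zip(lists, lengths):
        if row is None:
            # Assumed post-change behavior: NULL rows own no child entries.
            continue
        child.extend(row[i] if i < len(row) else None for i in range(n))
    return child


def zip_children(
    a: List[ListRow], b: List[ListRow], normalize: bool
) -> Tuple[List[Optional[int]], List[Optional[int]]]:
    # Each output row is padded to the longer of the two input rows.
    lengths = [max(len(x or []), len(y or [])) for x, y in zip(a, b)]
    if normalize:
        a, b = replace_nulls_with_empty(a), replace_nulls_with_empty(b)
    return padded_child(a, lengths), padded_child(b, lengths)


a = [[1, 2], None]
b = [[10], [20, 30]]

child_a, child_b = zip_children(a, b, normalize=False)
print(len(child_a), len(child_b))  # row counts differ, so a struct cannot be built

child_a, child_b = zip_children(a, b, normalize=True)
print(len(child_a), len(child_b))  # row counts match
```

The validity of the original inputs would still need to be carried over separately so that rows with a NULL input stay NULL in the result; the sketch only covers the child row counts.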