[BUG] Scala Test md5 can produce non-empty nulls (merge and set validity) #8185

Closed
revans2 opened this issue Apr 26, 2023 · 1 comment
Assignees
Labels
bug Something isn't working

Comments

@revans2
Collaborator

revans2 commented Apr 26, 2023

Describe the bug
To support sorting and comparisons on nested types, we needed to get rid of all non-empty nulls. To help with this, we added asserts that fire whenever we see one. It looks like the GpuMd5 implementation uses mergeAndSetValidity, which trips that assert and throws the exception below.
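
For readers unfamiliar with the term: in an offset-based column (strings, lists), a "non-empty null" is a row that the validity mask marks as null but whose offsets still span a non-zero range of data. The sketch below is a toy model of that check, not the cudf implementation. `StringColumnModel` and its fields are hypothetical names made up for illustration; only the idea of comparing validity against offsets is taken from the description above.

```java
/** Toy model of a string column (hypothetical, not the cudf API). */
final class StringColumnModel {
    final boolean[] valid; // valid[i] == false means row i is null
    final int[] offsets;   // row i covers chars [offsets[i], offsets[i + 1])
    final String chars;    // concatenated character data for all rows

    StringColumnModel(boolean[] valid, int[] offsets, String chars) {
        if (offsets.length != valid.length + 1) {
            throw new IllegalArgumentException("offsets must have rows + 1 entries");
        }
        this.valid = valid;
        this.offsets = offsets;
        this.chars = chars;
    }

    /** True if any null row still spans a non-zero range of the data. */
    boolean hasNonEmptyNulls() {
        for (int i = 0; i < valid.length; i++) {
            if (!valid[i] && offsets[i + 1] > offsets[i]) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Rows: "ab", null (zero length), "cd"
        StringColumnModel clean = new StringColumnModel(
            new boolean[] {true, false, true}, new int[] {0, 2, 2, 4}, "abcd");
        // Rows: "ab", null that still covers "xx", "cd"
        StringColumnModel dirty = new StringColumnModel(
            new boolean[] {true, false, true}, new int[] {0, 2, 4, 6}, "abxxcd");

        System.out.println(clean.hasNonEmptyNulls()); // false
        System.out.println(dirty.hasNonEmptyNulls()); // true
    }
}
```

The second column is the shape that an assert like this one would reject: the null row is marked invalid, yet its offsets still claim two characters.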

Steps/Code to reproduce bug
Revert #8183, then run the unit tests.

 org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 845.0 failed 1 times, most recent failure: Lost task 0.0 in stage 845.0 (TID 2577) (10.28.9.123 executor driver): java.lang.AssertionError: Column has non-empty nulls
	at ai.rapids.cudf.AssertEmptyNulls.assertNullsAreEmpty(AssertEmptyNulls.java:33)
	at ai.rapids.cudf.ColumnView.<init>(ColumnView.java:71)
	at ai.rapids.cudf.ColumnVector.<init>(ColumnVector.java:58)
	at ai.rapids.cudf.ColumnView.mergeAndSetValidity(ColumnView.java:835)
	at org.apache.spark.sql.rapids.GpuMd5.$anonfun$doColumnar$2(HashFunctions.scala:38)
	at com.nvidia.spark.rapids.Arm$.withResource(Arm.scala:29)
	at org.apache.spark.sql.rapids.GpuMd5.$anonfun$doColumnar$1(HashFunctions.scala:37)
	at com.nvidia.spark.rapids.Arm$.withResource(Arm.scala:29)
	at org.apache.spark.sql.rapids.GpuMd5.doColumnar(HashFunctions.scala:36)

Expected behavior
This should not happen. We either need to fix mergeAndSetValidity (I thought we already had, but that might have been bitwiseMergeAndSetValidity; I'm not sure whether we have two or not), or we need to stop using it + #7698
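
As a rough illustration of the first option, the sketch below merges validity masks and then rewrites the offsets so that every newly null row becomes empty. It is a toy model under stated assumptions, not what cudf does internally: `MergeValiditySketch`, `mergeValidityAnd`, `purgedOffsets`, and `purgedChars` are hypothetical names, and the AND merge is only one possible merge operation.

```java
import java.util.Arrays;

/** Toy sketch (hypothetical, not the cudf API): merge validity, then empty out null rows. */
final class MergeValiditySketch {

    /** AND-merge: a row stays valid only if it is valid in every input mask. */
    static boolean[] mergeValidityAnd(boolean[]... masks) {
        if (masks.length == 0) {
            throw new IllegalArgumentException("need at least one mask");
        }
        boolean[] out = new boolean[masks[0].length];
        Arrays.fill(out, true);
        for (boolean[] mask : masks) {
            if (mask.length != out.length) {
                throw new IllegalArgumentException("mask length mismatch");
            }
            for (int i = 0; i < out.length; i++) {
                out[i] &= mask[i];
            }
        }
        return out;
    }

    /** Rebuild offsets so that every null row has zero length. */
    static int[] purgedOffsets(boolean[] valid, int[] offsets) {
        int[] out = new int[offsets.length];
        for (int i = 0; i < valid.length; i++) {
            int len = valid[i] ? offsets[i + 1] - offsets[i] : 0;
            out[i + 1] = out[i] + len;
        }
        return out;
    }

    /** Keep only the characters that belong to valid rows. */
    static String purgedChars(boolean[] valid, int[] offsets, String chars) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < valid.length; i++) {
            if (valid[i]) {
                sb.append(chars, offsets[i], offsets[i + 1]);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A fully valid string column: "ab", "xx", "cd"
        int[] offsets = {0, 2, 4, 6};
        String chars = "abxxcd";
        boolean[] ownValidity = {true, true, true};
        // Validity borrowed from another column whose middle row is null
        boolean[] otherValidity = {true, false, true};

        boolean[] merged = mergeValidityAnd(ownValidity, otherValidity);
        // Without the next two steps, row 1 would be a non-empty null.
        int[] newOffsets = purgedOffsets(merged, offsets);
        String newChars = purgedChars(merged, offsets, chars);

        System.out.println(Arrays.toString(merged));     // [true, false, true]
        System.out.println(Arrays.toString(newOffsets)); // [0, 2, 2, 4]
        System.out.println(newChars);                    // abcd
    }
}
```

The design point is that changing validity alone is not enough for offset-based columns: the offsets and data have to be rewritten too, which is why the fix belongs either inside the merge call or in the caller that uses it.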

revans2 added the "bug" (Something isn't working) and "? - Needs Triage" (Need team to review and classify) labels Apr 26, 2023
razajafri self-assigned this Apr 26, 2023
mattahrens removed the "? - Needs Triage" (Need team to review and classify) label May 2, 2023
@razajafri
Collaborator

fixed by rapidsai/cudf#13335

3 participants