Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Integration test test_window_running_no_part can produce non-empty nulls (cudf scan) #8187

Closed
revans2 opened this issue Apr 26, 2023 · 6 comments
Assignees
Labels
bug Something isn't working cudf_dependency An issue or PR with this label depends on a new feature in cudf

Comments

@revans2
Copy link
Collaborator

revans2 commented Apr 26, 2023

Describe the bug
To be able to support sorting and comparisons on nested types we needed to get rid of all non-empty nulls. To help with this we added in some asserts when we see them. It looks like the ColumnView.scan in some cases can return non-empty nulls.

Steps/Code to reproduce bug
Run the integration tests, but revert #8183 first

E                   py4j.protocol.Py4JJavaError: An error occurred while calling o362293.collectToPython.
E                   : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 9329.0 failed 1 times, most recent failure: Lost task 0.0 in stage 9329.0 (TID 37178) (10.28.9.123 executor driver): java.lang.Assert
ionError: Column has non-empty nulls
E                       at ai.rapids.cudf.AssertEmptyNulls.assertNullsAreEmpty(AssertEmptyNulls.java:33)
E                       at ai.rapids.cudf.ColumnView.<init>(ColumnView.java:71)
E                       at ai.rapids.cudf.ColumnVector.<init>(ColumnVector.java:58)
E                       at ai.rapids.cudf.ColumnView.scan(ColumnView.java:1712)
E                       at com.nvidia.spark.rapids.GroupedAggregations.$anonfun$doRunningWindowScan$4(GpuWindowExec.scala:925)
E                       at com.nvidia.spark.rapids.GroupedAggregations.$anonfun$doRunningWindowScan$4$adapted(GpuWindowExec.scala:918)
E                       at scala.collection.immutable.Range.foreach(Range.scala:158)
E                       at com.nvidia.spark.rapids.GroupedAggregations.$anonfun$doRunningWindowScan$3(GpuWindowExec.scala:918)
E                       at com.nvidia.spark.rapids.Arm$.withResource(Arm.scala:65)
E                       at com.nvidia.spark.rapids.GroupedAggregations.$anonfun$doRunningWindowScan$1(GpuWindowExec.scala:917)
E                       at com.nvidia.spark.rapids.GroupedAggregations.$anonfun$doRunningWindowScan$1$adapted(GpuWindowExec.scala:908)

This line corresponds to doing a Scan, specifically
inputCol.scan(agg, ScanType.INCLUSIVE, NullPolicy.EXCLUDE)

I don't know the exact agg being done, but it is being done where the data type is a String.

@revans2 revans2 added bug Something isn't working ? - Needs Triage Need team to review and classify labels Apr 26, 2023
@revans2
Copy link
Collaborator Author

revans2 commented Apr 26, 2023

I should add that this is likely a bug in CUDF that needs to be fixed.

@mattahrens mattahrens removed the ? - Needs Triage Need team to review and classify label May 2, 2023
@razajafri
Copy link
Collaborator

@ttnghia you suggested that cudf has removed all instances of creating non-empty nulls. Is this on the cudf agenda?
how do you want to proceed on this?
@revans2 should we handle this in Java for now and then revert when it's fixed in cudf?

@revans2
Copy link
Collaborator Author

revans2 commented May 19, 2023

@sameerz what is the priority on this? Do we need to get full nested type aggregation key and sorts into 23.06 or can it wait for 23.08.

@razajafri if we need it for 23.06 then we have to do something in java. If we are going to wait for 23.08, then we should just do it in CUDF correctly.

Either way we should come up with a repro case in CUDF C++ and file an issue against them.

@sameerz
Copy link
Collaborator

sameerz commented May 21, 2023

Do we need to get full nested type aggregation key and sorts into 23.06 or can it wait for 23.08.

This can wait until 23.08.

@sameerz sameerz added the cudf_dependency An issue or PR with this label depends on a new feature in cudf label May 31, 2023
@razajafri
Copy link
Collaborator

This depends on rapidsai/cudf#13455

@razajafri
Copy link
Collaborator

This has been fixed by the cudf PR rapidsai/cudf#13455

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working cudf_dependency An issue or PR with this label depends on a new feature in cudf
Projects
None yet
Development

No branches or pull requests

4 participants