-
Notifications
You must be signed in to change notification settings - Fork 240
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switch and/or to use new cudf binops to improve performance #4501
Conversation
Signed-off-by: Robert (Bobby) Evans <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good (other than copyright)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me as well.
I forgot there are cases where code was using the AST version of the NULL_LOGICAL_OR operator, see https://github.com/NVIDIA/spark-rapids/blob/branch-22.02/sql-plugin/src/main/scala/com/nvidia/spark/rapids/conditionalExpressions.scala#L445-L451. It would be good to have that code switch to using this new direct binop instead. |
I ran the tests manually and fixed an issue with AST not being translated properly because it was separate from the normal code. That is fixed now. I ran TPC-DS at scale factor 3k and it looks to be generally a win, 1.1% faster. But there are about as many tests that are slower 51 as there are that are faster 54. I am going to see if I can reproduce some of the slowness locally to try and dig in a little deeper. |
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuExpressions.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuExpressions.scala
Outdated
Show resolved
Hide resolved
Dependency just got merged into cudf. I'll move this out of draft when the dependency goes through CI |
build |
This depends on rapidsai/cudf#10016
Seeing some good performance improvements, but need to run some more tests.