
CPU fallback for Map scalars with key vectors #6288

Merged
merged 1 commit into NVIDIA:branch-22.08 on Aug 11, 2022

Conversation

mythrocks
Collaborator

This change fixes the failure of queries where a scalar map is
looked up via a vector of keys. Rather than falling back to the CPU,
such queries used to evade detection during planning and failed
at runtime, as described in #6282:

```
java.lang.IllegalStateException: Map lookup keys must be scalar values
        at org.apache.spark.sql.rapids.GpuGetMapValue.doColumnar(complexTypeExtractors.scala:231)
        at com.nvidia.spark.rapids.GpuBinaryExpression.$anonfun$columnarEval$3(GpuExpressions.scala:258)
        at com.nvidia.spark.rapids.Arm.withResourceIfAllowed(Arm.scala:73)
        at com.nvidia.spark.rapids.Arm.withResourceIfAllowed$(Arm.scala:71)
        at org.apache.spark.sql.rapids.GpuGetMapValue.withResourceIfAllowed(complexTypeExtractors.scala:190)
        at com.nvidia.spark.rapids.GpuBinaryExpression.$anonfun$columnarEval$2(GpuExpressions.scala:253)
        at com.nvidia.spark.rapids.Arm.withResourceIfAllowed(Arm.scala:73)
        at com.nvidia.spark.rapids.Arm.withResourceIfAllowed$(Arm.scala:71)
...
```

This change allows the query plan to correctly fall back to
CPU mode.

(This change also includes fixes for incorrect whitespace
and indentation in `map_test.py`.)

Signed-off-by: MithunR <[email protected]>
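
For context, here is a minimal sketch of the kind of query being discussed: the map is a scalar (a literal), while the lookup keys come from a column, which produces the `GetMapValue` expression named in the stack trace above. The data and column names are assumptions for illustration; this is not code from the PR or from `map_test.py`.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, create_map, lit

spark = SparkSession.builder.getOrCreate()

# One lookup key per row: the keys form a vector, while the map is a scalar literal.
df = spark.createDataFrame([("a",), ("b",), ("z",)], ["key"])
scalar_map = create_map(lit("a"), lit(1), lit("b"), lit(2))

# Indexing a literal map with a column previously evaded detection during
# planning and failed on the GPU at runtime; with this change the expression
# is planned on the CPU instead.
df.select(scalar_map[col("key")].alias("value")).show()
```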

@mythrocks
Collaborator Author

Build

Contributor

@andygrove left a comment


LGTM.

Very minor nit: Because there are two unrelated changes (whitespace fixes + bug fix), splitting these into separate commits would have made the review slightly easier.

@mythrocks
Collaborator Author

> splitting these into separate commits would have made the review slightly easier.

Point taken. I was conflicted about combining them in the PR. Putting the changes in separate commits would've been a good compromise. I'll bear this in mind next time.

Thanks for the review and guidance, @andygrove.

Collaborator

@revans2 left a comment


I agree that the formatting changes should generally be in a separate PR, or at least a separate commit. I am also a bit curious why we could not just turn the scalar into a vector and do the lookup; we do that in other places. I know it is not ideal from a memory standpoint, but it works.

@mythrocks
Collaborator Author

> why we could not just turn the scalar into a vector and do the lookup. We do that in other places. I know it is not ideal from a memory standpoint, but it works.

I thought about that as well. I didn't think it viable, since it would mean blowing up an array<struct<K,V>>, which is potentially much more expensive than just blowing up the keys.

I'd be happy to do that in 22.10, if you think that's wise. It'll save my writing a map_scalar_view, or equivalent.

@revans2
Collaborator

revans2 commented Aug 11, 2022

> I'd be happy to do that in 22.10, if you think that's wise. It'll save my writing a map_scalar_view, or equivalent.

No, I just think we should file a follow-on issue to look at it. I mostly agree with you that this is a rare enough case that not doing it now is probably best, but it would be interesting to look at how we could do it efficiently in the future, when a customer asks for it, and I know that someone will. One idea I had was, because it is a scalar map, we could always turn the map into a nested set of if/else-like operations. But there could be a lot of other, better ways to do it.
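
As a hedged illustration of the if/else idea (a sketch of the suggestion, not code from the plugin): because the map is a scalar, its key/value pairs are known up front, so the lookup `map('a', 1, 'b', 2)[key]` could in principle be rewritten as a chain of conditionals over the key column.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, lit, when

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([("a",), ("b",), ("z",)], ["key"])

# Each known map entry becomes one branch; a key that is not in the map
# yields null, matching map-lookup semantics.
rewritten = (
    when(col("key") == lit("a"), lit(1))
    .when(col("key") == lit("b"), lit(2))
    .otherwise(lit(None))
)
df.select(rewritten.alias("value")).show()
```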

@mythrocks
Collaborator Author

Will do. I'll merge this change to restore the correct behaviour, and file another issue to track adding support for map scalars.

@mythrocks
Collaborator Author

I have filed #6296 to add support. We'll fall back to CPU in the 22.08 release.
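
For anyone checking the 22.08 behaviour, one way to confirm the fallback (a sketch that assumes the plugin's `spark.rapids.sql.explain` setting and reuses the example query from the sketch above) is to ask the plugin to report the parts of the plan that did not go to the GPU:

```python
# With the RAPIDS plugin loaded, this logs why GetMapValue stays on the CPU
# instead of failing at runtime as in #6282.
spark.conf.set("spark.rapids.sql.explain", "NOT_ON_GPU")
df.select(scalar_map[col("key")].alias("value")).collect()
```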

@mythrocks merged commit 23517e6 into NVIDIA:branch-22.08 on Aug 11, 2022
@mythrocks
Collaborator Author

This PR is merged. Thank you, @andygrove and @revans2.

@sameerz added the bug (Something isn't working) label on Aug 11, 2022
@sameerz added this to the Aug 8 - Aug 19 milestone on Aug 11, 2022