
Add support for DecimalType in Remainder for Spark 3.4 and DB 11.3 [databricks] #8302

Merged · 6 commits into NVIDIA:branch-23.06 · May 21, 2023

Conversation

NVnavkumar (Collaborator)

Fixes #8161.

This reverts the changes to integration tests made in #7609, and adds support for Remainder with DecimalType in Spark 3.4 and Databricks 11.3.

It does this by casting both operands to a common intermediate type computed with the following formula:

scale = max(s1, s2)
precision = max(p1 - s1, p2 - s2) + scale
lhsType = rhsType = DecimalType(precision, scale)

This provides just enough precision and scale to compute the remainder: enough scale to represent the fractional part of the output (the part that is < 1), and enough precision to hold the values of both operands.

The output type is computed with a similar formula:

scale = max(s1, s2)
precision = min(p1 - s1, p2 - s2) + scale
outputType = DecimalType(precision, scale)

This is just enough precision and scale to store the remainder.
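
As a rough illustration, here is a minimal Scala sketch of the two formulas above (the helper names are mine for illustration, not necessarily those used in the PR):

import org.apache.spark.sql.types.DecimalType

// Intermediate type both operands are cast to before computing the remainder (illustrative sketch).
def intermediateType(lhs: DecimalType, rhs: DecimalType): DecimalType = {
  val scale = math.max(lhs.scale, rhs.scale)
  // Enough integral digits to hold either operand, plus the shared scale.
  val precision = math.max(lhs.precision - lhs.scale, rhs.precision - rhs.scale) + scale
  DecimalType(precision, scale)
}

// Result type: the remainder is bounded by the smaller operand, so min() is enough.
def outputType(lhs: DecimalType, rhs: DecimalType): DecimalType = {
  val scale = math.max(lhs.scale, rhs.scale)
  val precision = math.min(lhs.precision - lhs.scale, rhs.precision - rhs.scale) + scale
  DecimalType(precision, scale)
}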

@NVnavkumar requested a review from razajafri May 16, 2023 21:57
@NVnavkumar self-assigned this May 16, 2023
@NVnavkumar requested a review from revans2 May 16, 2023 21:58
@NVnavkumar added the Spark 3.4+ and bug labels May 16, 2023
@NVnavkumar (Collaborator, Author)

build

…der normal circumstances would cause an overflow. Spark 3.4 handles this differently and produces a correct value

Signed-off-by: Navin Kumar <[email protected]>
@NVnavkumar (Collaborator, Author)

build

revans2 previously approved these changes May 19, 2023
@revans2 (Collaborator) left a comment


Feel free to ignore the nit if it means we can be faster at getting the full fix in by 23.06.

("lhs", TypeSig.integral + TypeSig.fp, TypeSig.cpuNumeric),
("rhs", TypeSig.integral + TypeSig.fp, TypeSig.cpuNumeric)),
("lhs", TypeSig.gpuNumeric, TypeSig.cpuNumeric),
("rhs", TypeSig.gpuNumeric, TypeSig.cpuNumeric)),
@revans2 (Collaborator)

nit: it would be nice to add in a psNote for DECIMAL128 here to explain what we don't fully support. This is really minor if we think we can get full remainder support in before we ship 23.06. (especially minor because we don't generate the support docs for anything but the oldest version of Spark that we support) Oh well....

@revans2 (Collaborator)

revans2 commented May 19, 2023

Looks like you have some lines over 100 chars that need to be fixed.

@NVnavkumar (Collaborator, Author)

build

@NVnavkumar requested a review from revans2 May 19, 2023 17:59
@revans2 (Collaborator)

revans2 commented May 19, 2023

build

@NVnavkumar (Collaborator, Author)

Follow up for Decimal128 support is here: #8330

Comment on lines +114 to +123
if (a.left.dataType.isInstanceOf[DecimalType] &&
a.right.dataType.isInstanceOf[DecimalType]) {
val lhsType = a.left.dataType.asInstanceOf[DecimalType]
val rhsType = a.right.dataType.asInstanceOf[DecimalType]
val needed = DecimalRemainderChecks.neededPrecision(lhsType, rhsType)
if (needed > DType.DECIMAL128_MAX_PRECISION) {
willNotWorkOnGpu(s"needed intermediate precision ($needed) will overflow " +
s"outside of the maximum available decimal128 precision")
}
}
@gerashegalov (Collaborator) commented May 20, 2023

Suggested change
- if (a.left.dataType.isInstanceOf[DecimalType] &&
-     a.right.dataType.isInstanceOf[DecimalType]) {
-   val lhsType = a.left.dataType.asInstanceOf[DecimalType]
-   val rhsType = a.right.dataType.asInstanceOf[DecimalType]
-   val needed = DecimalRemainderChecks.neededPrecision(lhsType, rhsType)
-   if (needed > DType.DECIMAL128_MAX_PRECISION) {
-     willNotWorkOnGpu(s"needed intermediate precision ($needed) will overflow " +
-       s"outside of the maximum available decimal128 precision")
-   }
- }
+ (a.left.dataType, a.right.dataType) match {
+   case (lhsType: DecimalType, rhsType: DecimalType) =>
+     val needed = DecimalRemainderChecks.neededPrecision(lhsType, rhsType)
+     if (needed > DType.DECIMAL128_MAX_PRECISION) {
+       willNotWorkOnGpu(s"needed intermediate precision ($needed) will overflow " +
+         s"outside of the maximum available decimal128 precision")
+     }
+   case _ => ()
+ }
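
For context, here is a hypothetical worked example (numbers chosen for illustration, not taken from the PR) of when this guard trips:

// Hypothetical operands: lhs = DecimalType(38, 10), rhs = DecimalType(38, 2)
// scale     = max(10, 2)                   = 10
// precision = max(38 - 10, 38 - 2) + scale = 36 + 10 = 46
// 46 > DType.DECIMAL128_MAX_PRECISION (38), so willNotWorkOnGpu is called and
// the expression falls back to the CPU; the Decimal128 follow-up is tracked in #8330.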

@NVnavkumar merged commit be82a20 into NVIDIA:branch-23.06 May 21, 2023
Labels: bug (Something isn't working), Spark 3.4+ (Spark 3.4+ issues)
Projects: None yet
Development: Successfully merging this pull request may close these issues:
  Add support for Remainder[DecimalType] for Spark 3.4 and DB 11.3
3 participants