
Add support for DecimalType in Remainder for Spark 3.4 and DB 11.3 [databricks] #8302

Merged · 6 commits into NVIDIA:branch-23.06 · May 21, 2023

Conversation

NVnavkumar (Collaborator)

Fixes #8161.

This reverts the changes to integration tests made in #7609, and adds support for Remainder with DecimalType in Spark 3.4 and Databricks 11.3.

It does this by casting both operands to a common intermediate type computed with the following formula:

scale = max(s1, s2)
precision = max(p1 - s1, p2 - s2) + scale
lhsType = rhsType = DecimalType(precision, scale)

This provides just enough precision and scale to compute the remainder: enough scale to represent the fractional part of the output (the part that is < 1), and enough precision to hold the values of both operands.

The output type is computed with a similar formula:

scale = max(s1, s2)
precision = min(p1 - s1, p2 - s2) + scale
outputType = DecimalType(precision, scale)

This is just enough precision and scale to store the remainder.
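
As a rough illustration, here is a minimal Scala sketch of the two formulas above (the helper names are mine for illustration, not necessarily those used in the PR):

import org.apache.spark.sql.types.DecimalType

// Intermediate type both operands are cast to before computing the remainder (illustrative sketch).
def intermediateType(lhs: DecimalType, rhs: DecimalType): DecimalType = {
  val scale = math.max(lhs.scale, rhs.scale)
  // Enough integral digits to hold either operand, plus the shared scale.
  val precision = math.max(lhs.precision - lhs.scale, rhs.precision - rhs.scale) + scale
  DecimalType(precision, scale)
}

// Result type: the remainder is bounded by the smaller operand, so min() is enough.
def outputType(lhs: DecimalType, rhs: DecimalType): DecimalType = {
  val scale = math.max(lhs.scale, rhs.scale)
  val precision = math.min(lhs.precision - lhs.scale, rhs.precision - rhs.scale) + scale
  DecimalType(precision, scale)
}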

@NVnavkumar requested a review from razajafri May 16, 2023 21:57
@NVnavkumar self-assigned this May 16, 2023
@NVnavkumar requested a review from revans2 May 16, 2023 21:58
@NVnavkumar added the Spark 3.4+ and bug labels May 16, 2023
@NVnavkumar (Collaborator, Author)

build

…der normal circumstances would cause an overflow. Spark 3.4 handles this differently and produces a correct value

Signed-off-by: Navin Kumar <[email protected]>
@NVnavkumar (Collaborator, Author)

build

revans2 previously approved these changes May 19, 2023
@revans2 (Collaborator) left a comment


Feel free to ignore the nit if it means we can be faster at getting the full fix in by 23.06.

("lhs", TypeSig.integral + TypeSig.fp, TypeSig.cpuNumeric),
("rhs", TypeSig.integral + TypeSig.fp, TypeSig.cpuNumeric)),
("lhs", TypeSig.gpuNumeric, TypeSig.cpuNumeric),
("rhs", TypeSig.gpuNumeric, TypeSig.cpuNumeric)),
@revans2 (Collaborator)

nit: it would be nice to add in a psNote for DECIMAL128 here to explain what we don't fully support. This is really minor if we think we can get full remainder support in before we ship 23.06. (especially minor because we don't generate the support docs for anything but the oldest version of Spark that we support) Oh well....

@revans2 (Collaborator)

revans2 commented May 19, 2023

Looks like you have some lines over 100 chars that need to be fixed.

@NVnavkumar (Collaborator, Author)

build

@NVnavkumar requested a review from revans2 May 19, 2023 17:59
@revans2 (Collaborator)

revans2 commented May 19, 2023

build

@NVnavkumar (Collaborator, Author)

Follow up for Decimal128 support is here: #8330

Comment on lines +114 to +123
if (a.left.dataType.isInstanceOf[DecimalType] &&
a.right.dataType.isInstanceOf[DecimalType]) {
val lhsType = a.left.dataType.asInstanceOf[DecimalType]
val rhsType = a.right.dataType.asInstanceOf[DecimalType]
val needed = DecimalRemainderChecks.neededPrecision(lhsType, rhsType)
if (needed > DType.DECIMAL128_MAX_PRECISION) {
willNotWorkOnGpu(s"needed intermediate precision ($needed) will overflow " +
s"outside of the maximum available decimal128 precision")
}
}
@gerashegalov (Collaborator) commented May 20, 2023

Suggested change
- if (a.left.dataType.isInstanceOf[DecimalType] &&
-     a.right.dataType.isInstanceOf[DecimalType]) {
-   val lhsType = a.left.dataType.asInstanceOf[DecimalType]
-   val rhsType = a.right.dataType.asInstanceOf[DecimalType]
-   val needed = DecimalRemainderChecks.neededPrecision(lhsType, rhsType)
-   if (needed > DType.DECIMAL128_MAX_PRECISION) {
-     willNotWorkOnGpu(s"needed intermediate precision ($needed) will overflow " +
-       s"outside of the maximum available decimal128 precision")
-   }
- }
+ (a.left.dataType, a.right.dataType) match {
+   case (lhsType: DecimalType, rhsType: DecimalType) =>
+     val needed = DecimalRemainderChecks.neededPrecision(lhsType, rhsType)
+     if (needed > DType.DECIMAL128_MAX_PRECISION) {
+       willNotWorkOnGpu(s"needed intermediate precision ($needed) will overflow " +
+         s"outside of the maximum available decimal128 precision")
+     }
+   case _ => ()
+ }
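
For context, here is a hypothetical worked example (numbers chosen for illustration, not taken from the PR) of when this guard trips:

// Hypothetical operands: lhs = DecimalType(38, 10), rhs = DecimalType(38, 2)
// scale     = max(10, 2)                   = 10
// precision = max(38 - 10, 38 - 2) + scale = 36 + 10 = 46
// 46 > DType.DECIMAL128_MAX_PRECISION (38), so willNotWorkOnGpu is called and
// the expression falls back to the CPU; the Decimal128 follow-up is tracked in #8330.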

@NVnavkumar merged commit be82a20 into NVIDIA:branch-23.06 May 21, 2023
Labels: bug (Something isn't working), Spark 3.4+ (Spark 3.4+ issues)
Projects: None yet
Development: Successfully merging this pull request may close these issues:
  Add support for Remainder[DecimalType] for Spark 3.4 and DB 11.3
3 participants