Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes creation of invalid DecimalType in GpuDivide.tagExprForGpu #1991

Merged
merged 3 commits into from
Mar 23, 2021

Conversation

razajafri
Copy link
Collaborator

While supporting DIV changes, I completely overlooked the fact that the precision and scale can be truncated.
This addresses that case.

fixes #1984

Signed-off-by: Raza Jafri [email protected]

Signed-off-by: Raza Jafri <[email protected]>
@razajafri
Copy link
Collaborator Author

@andygrove thank you for finding this. Please take a look at this when you get a chance. This will basically behave the same as before, which is preclude the Exp from being on the GPU.

@razajafri
Copy link
Collaborator Author

build

@andygrove
Copy link
Contributor

@razajafri I tested locally and can confirm that it fixed the issue when running with NDS query 58.

// To find out if outputType.precision < outputType.scale simplifies to p1 < 0
// which is never possible
// Case 1: OutputType.precision doesn't get truncated
// We will never hit a case where outputType.precision < outputType.scale + r.scale.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good (and fairly easy) to add a unit test for this. I had a go and found some values that do hit this case but it is possible that I am using values that couldn't happen in real life. I am not sure.

Here is one example:

l = DecimalType(3,0)
r = DecimalType(21,17)
outputType = DecimalType(38,23)
38 < 23 + 17

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps I am hitting case 2 here where precision is getting truncated?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does look like I am hitting case 2 here so I think the logic in the comments is sound, but a unit test would make me more confident and would protect against future regressions if any of this code gets updated.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for the unit test to verify that this is working. The logic looks good to me. I am a little concerned about the coupling between this code and the code in GpuDivideUtil, but that is minor for now.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I wrote a unit test locally to verify this. I don't know why I didn't check it in. Let me do that

Copy link
Collaborator Author

@razajafri razajafri Mar 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

l = DecimalType(3,0)
r = DecimalType(21,17)
outputType = DecimalType(38,23)
38 < 23 + 17

So, like you said this won't happen in reality as by the time we get the call both l and r Spark will be the same precision and scale, I think it will upcast the Decimal(3,0) to Decimal(21,17).

Signed-off-by: Raza Jafri <[email protected]>
@razajafri
Copy link
Collaborator Author

@andygrove @revans2 I have added a unit test PTAL

andygrove
andygrove previously approved these changes Mar 23, 2021
Copy link
Contributor

@andygrove andygrove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @razajafri

@razajafri
Copy link
Collaborator Author

build

@pytest.mark.parametrize('data_gen', [double_gen, decimal_gen_neg_scale, DecimalGen(6, 3),
DecimalGen(5, 5), DecimalGen(6, 0),
pytest.param(DecimalGen(38, 21), marks=pytest.mark.xfail(reason="The precision is too large to be supported on the GPU")),
pytest.param(DecimalGen(21, 17), marks=pytest.mark.xfail(reason="The precision is too large to be supported on the GPU"))], ids=idfn)
Copy link
Collaborator

@revans2 revans2 Mar 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remove your patch and keep the test there will be no real difference in the results, unless someone goes through the logs manually and looks that in one case we failed by falling back to the CPU, and in another case we failed by throwing an exception about going over the limit. In both cases the xfail ignored the exception that was thrown.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the test to check if the test raises IllegalArgumentException. Let me know if you want me to tighten it down even more. Ideally we should be checking the message to make sure its failing because its not columnar but I am not sure how to accomplish that using the pytest xfail

@razajafri
Copy link
Collaborator Author

@revans2 I tested the new test on top of branch-0.5 and it fails without my fix. PTAL

@sameerz sameerz added the bug Something isn't working label Mar 23, 2021
@razajafri
Copy link
Collaborator Author

build

@razajafri
Copy link
Collaborator Author

@revans2 do you have any more concerns?

@razajafri
Copy link
Collaborator Author

razajafri commented Mar 23, 2021

The CI has been waiting complaining insufficient CPU

@razajafri razajafri merged commit cc91b54 into NVIDIA:branch-0.5 Mar 23, 2021
@razajafri razajafri deleted the div-fix branch March 23, 2021 23:48
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
…DIA#1991)

* fixes NVIDIA#1984

Signed-off-by: Raza Jafri <[email protected]>

* added unit tests

Signed-off-by: Raza Jafri <[email protected]>

* make sure the exception is not AnalysisException

Signed-off-by: Raza Jafri <[email protected]>

Co-authored-by: Raza Jafri <[email protected]>
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
…DIA#1991)

* fixes NVIDIA#1984

Signed-off-by: Raza Jafri <[email protected]>

* added unit tests

Signed-off-by: Raza Jafri <[email protected]>

* make sure the exception is not AnalysisException

Signed-off-by: Raza Jafri <[email protected]>

Co-authored-by: Raza Jafri <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] NDS q58 Decimal scale (59) cannot be greater than precision (38).
4 participants