-
Notifications
You must be signed in to change notification settings - Fork 915
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix for integer overflow in contiguous-split #10437
Fix for integer overflow in contiguous-split #10437
Conversation
Codecov Report
@@ Coverage Diff @@
## branch-22.04 #10437 +/- ##
================================================
+ Coverage 86.13% 86.18% +0.04%
================================================
Files 139 139
Lines 22438 22468 +30
================================================
+ Hits 19328 19363 +35
+ Misses 3110 3105 -5
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great. Thanks for helping clean up the code!
rerun tests |
They python build failure looks like build system issue:
|
Thanks for the review @ttnghia! The build failures appear to be unrelated - do we need to wait until they are resolved before we can merge this? |
The CI failure fix has been merged (see #10442). @jbrennan333 Could you merge with the latest |
@gpucibot merge |
Thanks for the reviews @PointKernel, @nvdbaranec, and @ttnghia! And thanks for testing the patch @nvdbaranec! |
This is a fix for an integer overflow in
contiguous_split.cu::copy_data
.We encountered this problem when running a large query with the spark-rapids plug-in. The plug-in OOM callback was reporting negative numbers for some failed allocations. It treats the allocation size as a
long
. The allocations were coming fromcontiguous_split
with a large number of rows (399,000,000) of typeUINT64
.With help from @nvdbaranec, I was able to isolate the problem to this line:
The two operands are both
INT32
. So the result of the multiply is being sign-extended when assigning tobytes
.The fix is to cast these to
size_t
before the multiply.