-
Notifications
You must be signed in to change notification settings - Fork 919
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] fused_concatenate_kernel
can cause illegal memory access
#10333
Labels
bug
Something isn't working
Comments
abellina
added
bug
Something isn't working
Needs Triage
Need team to review and classify
labels
Feb 18, 2022
abellina
changed the title
[BUG]
[BUG] Feb 18, 2022
fused_concatenate_kernel
can overflowfused_concatenate_kernel
can cause illegal memory access
I'll have this up for review today. I added a test that reproduces it and building cuDF with the solution. |
I put up the PR #10344, to address the I've seen the same bug in other kernels. I am looking for input on how that should be handled. Does it seem we want 1 PR that handles the issue as a whole? For example:
|
rapids-bot bot
pushed a commit
that referenced
this issue
Feb 28, 2022
Fixes #10333. The repro case in the issue showed an illegal access error where the `output_index` of the strided loop in `fused_concatenate_kernel` can overflow for a large number of rows. For example, given 5 tables of exactly 250M rows each we would expect a result with 1,250,000,000 rows. The kernel is launched with 4,882,813 blocks (# of rows / 256 threads rounded up) with a stride of 1,250,000,128 (256 * 4,882,813). When `output_index` reaches 897,483,520, it overflows `output_index` on the first iteration. The change below prevents the overflow by making `output_index` an `int64_t` and adds a test that shows that we can now concatenate up to `size_type::max - 1` rows. Authors: - Alessandro Bellina (https://github.com/abellina) Approvers: - Nghia Truong (https://github.com/ttnghia) - Jake Hemstad (https://github.com/jrhemstad) - MithunR (https://github.com/mythrocks) URL: #10344
This was referenced Aug 9, 2023
rapids-bot bot
pushed a commit
that referenced
this issue
Aug 16, 2023
If data is sufficiently large, `fused_concatenate_string_chars_kernel` will attempt to read out of bounds and ultimately cause CUDA to raise `cudaErrorIllegalAddress`. Details on how the issue was encountered are in #13771, although this was an [already known problem](#10333 (comment)). Fixes #13771 . Authors: - Peter Andreas Entschev (https://github.com/pentschev) Approvers: - Bradley Dice (https://github.com/bdice) - Yunsong Wang (https://github.com/PointKernel) - Nghia Truong (https://github.com/ttnghia) URL: #13838
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
When concatenating via the
fused_concatenate_kernel
we see an issue where its strided loop can overflow, even if the output number of rows is < 2B. Here is a small scala repro case:Will produce:
After adding printfs, the grid size is very large and the stride update: https://github.com/rapidsai/cudf/blob/branch-22.04/cpp/src/copying/concatenate.cu#L200 can cause the
size_type
index to become negative.It seems the simplest fix is to change
output_index
to bestd::size_t
(https://github.com/rapidsai/cudf/blob/branch-22.04/cpp/src/copying/concatenate.cu#L200).Thanks to @nvdbaranec and @jlowe for debugging this with me.
The text was updated successfully, but these errors were encountered: