[BUG] contiguous_split can crash when an output partition exceeds 2GB #7514

Closed
jlowe opened this issue Mar 4, 2021 · 1 comment
Labels
bug (Something isn't working), libcudf (Affects libcudf (C++/CUDA) code.)


@jlowe
Member

jlowe commented Mar 4, 2021

Describe the bug
When contiguous_split is asked to build an output partition larger than 2GB, one of the following can occur:

  • An exception is thrown with the message: cuDF failure at: workspace/spark/cudf18_nightly/cpp/src/copying/pack.cpp:113: Encountered column data outside the range of input buffer
  • GPU crash with illegal memory access

Steps/Code to reproduce bug

  1. Build a table with large columns, e.g. between 1 and 2GB each.
  2. Add enough of these columns that the table exceeds 2GB in total, then add one more large column.
  3. Call contiguous_split with no split indices, i.e. generate a single output partition consisting of the entire input table.
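
The sizing in the steps above can be checked on the host before anything is allocated on the GPU. This is a minimal sketch in plain C++ that does not call libcudf; `columns_needed` is a hypothetical helper, and it assumes the 2GB threshold is the signed 32-bit limit of 2^31 - 1 bytes, which the report does not state explicitly.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of libcudf): how many equally sized columns
// the repro table needs. Assumes the 2GB threshold is INT32_MAX bytes.
std::int64_t columns_needed(std::int64_t bytes_per_column)
{
  constexpr std::int64_t limit = std::numeric_limits<std::int32_t>::max();
  // Smallest column count whose total size is strictly greater than the limit.
  std::int64_t const to_exceed = limit / bytes_per_column + 1;
  // Step 2 asks for one more large column on top of that.
  return to_exceed + 1;
}
```

For columns of 1 GiB or 1.5 GiB each, this gives three columns: two to cross the limit and one extra.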

Expected behavior
contiguous_split should not crash

Environment overview
cudf 0.18 run with the RAPIDS Accelerator for Apache Spark

@jlowe added the bug, Needs Triage, and libcudf labels Mar 4, 2021
@nvdbaranec
Contributor

PR coming shortly for 0.19

@jlowe removed the Needs Triage label Mar 4, 2021
@rapids-bot rapids-bot bot closed this as completed in 42c6d15 Mar 10, 2021
hyperbolic2346 pushed a commit to hyperbolic2346/cudf that referenced this issue Mar 25, 2021
…apidsai#7515)

Fixes:
rapidsai#7514

Related:
NVIDIA/spark-rapids#1861

There were a couple of places where 32-bit values were being used for buffer sizes that needed to be 64-bit.
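
To illustrate this class of bug, the sketch below sums per-column byte counts into a 32-bit and a 64-bit accumulator. It is plain C++ and is not the libcudf code that was changed; `total_bytes_32` and `total_bytes_64` are hypothetical names. The commit message does not say whether the 32-bit values were signed or unsigned, so the sketch uses an unsigned type purely to keep the wraparound well defined.

```cpp
#include <cstdint>
#include <vector>

// Sums buffer sizes in 64 bits, so totals above 2GB are represented exactly.
std::int64_t total_bytes_64(std::vector<std::int64_t> const& sizes)
{
  std::int64_t total = 0;
  for (auto s : sizes) { total += s; }
  return total;
}

// Mimics a 32-bit size field. Unsigned arithmetic wraps modulo 2^32, so the
// result silently loses the high bits once the true total no longer fits.
std::uint32_t total_bytes_32(std::vector<std::int64_t> const& sizes)
{
  std::uint32_t total = 0;
  for (auto s : sizes) { total += static_cast<std::uint32_t>(s); }
  return total;
}
```

With three columns of 1.5 GiB each (1610612736 bytes), the 64-bit total is 4831838208 bytes, while the 32-bit total wraps to 536870912. A size or offset truncated like this would point outside the real buffer, which is consistent with the two failure modes reported above.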

Authors:
  - @nvdbaranec

Approvers:
  - Vukasin Milovanovic (@vuule)
  - Jake Hemstad (@jrhemstad)

URL: rapidsai#7515