[BUG] "RuntimeError: Total number of concatenated rows exceeds size_type range" for 2.2 million row series #8748
Comments
List columns are limited by number of elements (not rows). The total number of elements in a list column cannot exceed the `size_type` range (a 32-bit signed integer). Internally, a list column holds a single column composed of all its elements, i.e., the following list column:
internally holds the column
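As a rough illustration of that layout, here is a plain-Python sketch (not cuDF's actual implementation) that models a list column as an offsets array plus one flat child column. The helper name `flatten_list_column` and the sample rows are made up for this example.

```python
def flatten_list_column(rows):
    """Model a list column as (offsets, child).

    Hypothetical helper for illustration only; it is not a cuDF API.
    offsets[i]:offsets[i + 1] is the range of child elements in row i.
    """
    offsets = [0]
    child = []
    for row in rows:
        child.extend(row)           # all elements live in one flat column
        offsets.append(len(child))  # running end position of each row
    return offsets, child


# Three rows, six elements in total.
rows = [[0, 1], [2], [3, 4, 5]]
offsets, child = flatten_list_column(rows)

# The element count of the child, not the row count, is what the limit applies to.
num_rows = len(offsets) - 1
num_elements = len(child)
```

In this model the column has 3 rows but 6 elements, which is why a list column can hit the element limit long before its row count gets large.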
Thanks for the clarification. But isn't the max value of an int32 around 2 billion, whereas 2,200,000 * 512 is only 1.1 billion? Also, why does accessing rows [0:60] not throw an error, while [0:61] does?
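The arithmetic in this comment can be checked directly. The sketch below is plain Python and only assumes that the limit is the maximum value of a signed 32-bit integer. The function names are illustrative.

```python
INT32_MAX = 2**31 - 1  # 2,147,483,647, the largest signed 32-bit value


def total_elements(num_rows, list_length):
    """Element count of a list column whose rows all have the same length."""
    return num_rows * list_length


def fits_in_int32(n):
    """True if n can be stored in a signed 32-bit integer."""
    return n <= INT32_MAX


n = total_elements(2_200_000, 512)
within_limit = fits_in_int32(n)
```

Here `n` is 1,126,400,000, which is well under the int32 maximum, so the element count of the series alone does not explain the error.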
Thanks -- you're right, that is a bug. Investigating.
Slightly simpler, more revealing reproducer (although I still haven't root-caused this):
cc: @nvdbaranec
…ecking. (#8760)
Fixes #8748
Note: `concatenate_tests.cpp` was renamed to `concatenate_tests.cu` because of the addition of some thrust calls. Existing overflow tests moved to the `OverflowTest` section. New tests specific to this PR are:
- `OverflowTest.Presliced`
- `OverflowTest.BigColumnsSmallSlices`
Authors:
- https://github.com/nvdbaranec
Approvers:
- Nghia Truong (https://github.com/ttnghia)
- Michael Wang (https://github.com/isVoid)
- GALI PREM SAGAR (https://github.com/galipremsagar)
- Charles Blackmon-Luca (https://github.com/charlesbluca)
URL: #8760
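The test names suggest the fix concerns overflow checks on sliced columns. As a hedged sketch of how such a check could misfire, the plain-Python model below compares a check that counts the whole underlying child column for each slice against one that counts only the elements inside each slice. This is an assumption inferred from the test names, not a description of the actual cuDF code. `naive_total`, `slice_aware_total`, and the `(child_size, start, end)` tuples are hypothetical.

```python
INT32_MAX = 2**31 - 1  # assumed limit: largest signed 32-bit value


def naive_total(slices):
    """Sum the full child size behind every slice (over-counts)."""
    return sum(child_size for child_size, _, _ in slices)


def slice_aware_total(slices):
    """Sum only the elements each slice actually covers."""
    return sum(end - start for _, start, end in slices)


# One big list column: 2.2 million rows of 512 elements each.
child_size = 2_200_000 * 512

# Two small slices of it (30 rows each), modeled as (child_size, start, end).
head = (child_size, 0, 30 * 512)
tail = (child_size, child_size - 30 * 512, child_size)

naive = naive_total([head, tail])
aware = slice_aware_total([head, tail])

naive_overflows = naive > INT32_MAX
aware_overflows = aware > INT32_MAX
```

Under this model, concatenating two tiny slices of the same large column trips the naive check (2 × 1,126,400,000 exceeds the int32 maximum) even though only 30,720 elements are actually being concatenated.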
Describe the bug
A cudf.Series containing lists will throw a RuntimeError if one tries to access a certain number of rows, depending on the length of the contained lists.
Steps/Code to reproduce bug
This will display just fine (2 million rows).
2.2 million rows. This will throw a "RuntimeError: cuDF failure at: ../src/copying/concatenate.cu:359: Total number of concatenated rows exceeds size_type range". However, accessing an element directly by index will work fine. Also, accessing elements [0:60] will work, but [0:61] will throw the same exception.
This will display just fine (2.2 million rows, but shorter lists).
Expected behavior
The series should allow rows to be accessed without throwing an error, regardless of the dimensions of the contained objects, as is the case for e.g. a pandas Series.
Environment overview (please complete the following information)
Environment is a jupyterlab notebook hosted in the rapidsai docker container, using an RTX3090 with 24GB VRAM. cuml version is 21.06.01+2.g101fc0fda4.