Describe the bug
When a list is passed as input to StringMethods.strip, the operation fails and leaves cuDF in a broken state: it is generally no longer possible to create any more frames, and accessing previously created frames causes a segfault.
Steps/Code to reproduce bug
import cudf
s = cudf.Series(["hi."]).str.strip(["."])
RuntimeError Traceback (most recent call last)
/tmp/ipykernel_1765/347004186.py in <module>
1 import cudf
2
----> 3 s = cudf.Series(["hi."]).str.strip(["."])
/opt/conda/envs/rapids/lib/python3.9/site-packages/cudf/core/column/string.py in strip(self, to_strip)
3185
3186 return self._return_or_inplace(
-> 3187 libstrings.strip(self._column, cudf.Scalar(to_strip))
3188 )
3189
cudf/_lib/strings/strip.pyx in cudf._lib.strings.strip.strip()
RuntimeError: for_each: failed to synchronize: cudaErrorIllegalAddress: an illegal memory access was encountered
print(s)
terminate called after throwing an instance of 'rmm::bad_alloc'
what(): std::bad_alloc: CUDA error at: /workspace/.conda-bld/work/include/rmm/mr/device/cuda_memory_resource.hpp:70: cudaErrorIllegalAddress an illegal memory access was encountered
Aborted (core dumped)
Expected behavior
The equivalent behavior in Pandas:
import pandas as pd
s = pd.Series(["hi."]).str.strip(["."])
s
0 NaN
dtype: float64
Environment overview (please complete the following information)
Environment location: bare metal
Method of cuDF install: conda
Environment details
**git***
print_env.sh: 10: [: true: unexpected operator
Not inside a git repository
The to_strip parameter here is expected to be a str only. When to_strip is a list, cudf.Scalar(to_strip) produces a list scalar that is later incorrectly converted into a string_scalar. I think the right behavior is to raise a TypeError rather than do what Pandas does.
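To illustrate the kind of up-front check being proposed, here is a minimal pure-Python sketch. The helper name `validate_to_strip` is hypothetical and is not part of cuDF; treating `None` as "strip whitespace" and mapping it to an empty string is an assumption modeled on the Pandas signature, not something confirmed by this thread.

```python
from typing import Optional


def validate_to_strip(to_strip: Optional[str]) -> str:
    """Hypothetical helper (not a cuDF API): validate the strip characters.

    Returns the characters to strip, or raises TypeError before any
    device code could run on an unsupported input.
    """
    if to_strip is None:
        # Assumption: None means "strip whitespace", represented as "".
        return ""
    if not isinstance(to_strip, str):
        raise TypeError(
            f"to_strip must be a str or None, got {type(to_strip).__name__}"
        )
    return to_strip


# A list is rejected on the host with a clear error.
try:
    validate_to_strip(["."])
except TypeError as err:
    print(f"TypeError: {err}")
```

Failing on the host like this means the invalid value never reaches the GPU kernel, which is where the illegal memory access in the report originated.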
…10597)
Closes #10591
Ensures the `to_strip` parameter is a `str` type when converting it to `cudf.Scalar`. It will now raise a `TypeError` as follows:
```
libstrings.strip(self._column, cudf.Scalar(to_strip, "str"))
File "/conda/envs/rapids/lib/python3.8/site-packages/cudf-22.6.0a0+96.g0aef0c1c3e.dirty-py3.8-linux-x86_64.egg/cudf/core/scalar.py", line 78, in __init__
self._host_value, self._host_dtype = self._preprocess_host_value(
File "/conda/envs/rapids/lib/python3.8/site-packages/cudf-22.6.0a0+96.g0aef0c1c3e.dirty-py3.8-linux-x86_64.egg/cudf/core/scalar.py", line 128, in _preprocess_host_value
raise TypeError("Lists may not be cast to a different dtype")
TypeError: Lists may not be cast to a different dtype
```
This will also prevent the _sticky_ CUDA error.
Also added the `"str"` dtype argument to other `cudf.Scalar` calls where only strings are supported.
Authors:
- David Wendt (https://github.com/davidwendt)
Approvers:
- Ashwin Srinath (https://github.com/shwina)
- GALI PREM SAGAR (https://github.com/galipremsagar)
URL: #10597