[BUG] When using libcudf replace_re with ? and *, some combinations of inputs trigger cudaErrorIllegalAddress
#10753
Comments
Note that this behavior is specific to the combination of input string and regular expression. For example, with a different input string, the call does not crash:
import cudf
s = cudf.Series(["ABCD"])
s.str.replace("D?s?", "_REPLACE_", regex=True) # this will not crash
TL;DR: This should not crash and so that will be fixed but the result may be undefined. So the
But note this matches anything:
If you are trying to use like a wildcard you should use
The same issue occurs with
Regardless, it should not crash, so this is a valid bug. But I would consider the behavior undefined and would not rely on the output.
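To illustrate why patterns like these "match anything", here is a minimal sketch using Python's standard `re` module rather than cuDF (so it runs without a GPU). It only assumes the two patterns quoted in this thread, `D*` and `D?s?`, and checks that each can match the empty string, which means a match exists at every position of any input.

```python
import re

# Patterns quoted in this thread; every element is optional,
# so each pattern can match zero characters.
patterns = ["D*", "D?s?"]

# True when the pattern accepts the empty string.
empty_ok = {p: re.fullmatch(p, "") is not None for p in patterns}

# The first match in "ABCD" is a zero-length match at position 0,
# because "A" is neither "D" nor "s".
first_span = re.search("D?s?", "ABCD").span()

print(empty_ok)
print(first_span)
```

Because a zero-length match is found before any character is consumed, a replace with such a pattern inserts the replacement between characters rather than acting like a wildcard.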
Btw, in cuDF, the behavior seems to be fine in this instance, it is consistent with Python, which is acceptable.
This crash is specific to these combinations of regular expressions and input strings.
Probably, these are edge cases that came up when I was working on NVIDIA/spark-rapids#4468.
Closes #10753
Fixes `cudf::strings::replace_re` logic that was reading past the end of a string when given a regex whose quantifiers allow a zero-length match (e.g. 'D*' or 'D?s?', both of which can match nothing).
Authors:
- David Wendt (https://github.com/davidwendt)
Approvers:
- Nghia Truong (https://github.com/ttnghia)
- Bradley Dice (https://github.com/bdice)
- Mark Harris (https://github.com/harrism)
URL: #10760
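As a rough illustration of the kind of guard such a fix needs, the sketch below is a pure-Python replace loop, not the libcudf implementation. The function name `replace_with_empty_match_guard` is hypothetical. It assumes the general technique of stepping forward by one character after a zero-length match and checking the string length before copying that character, so the loop never reads past the end.

```python
import re


def replace_with_empty_match_guard(pattern, repl, text):
    """Replace every match of pattern in text, tolerating zero-length matches."""
    rx = re.compile(pattern)
    out = []
    pos = 0
    # pos may equal len(text): a zero-length match is still possible there.
    while pos <= len(text):
        m = rx.search(text, pos)
        if m is None:
            break
        out.append(text[pos:m.start()])  # unmatched text before the match
        out.append(repl)
        if m.end() > m.start():
            pos = m.end()  # normal match: continue after it
        else:
            # Zero-length match: copy one character only if one exists,
            # then advance so the loop makes progress.
            if m.start() < len(text):
                out.append(text[m.start()])
            pos = m.start() + 1
    out.append(text[pos:])  # remaining tail (empty if pos is past the end)
    return "".join(out)


result_star = replace_with_empty_match_guard("D*", "_", "ABCD")
result_opt = replace_with_empty_match_guard("D?s?", "_", "xyz")
print(result_star)
print(result_opt)
```

The bounds check before `text[m.start()]` is the part that corresponds to the bug class described above: without it, a zero-length match at the end of the string would index one past the last character.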
Describe the bug
A cudaErrorIllegalAddress error is triggered when using regular expressions and replace with some combinations of inputs and outputs with * and ?. This error is a permanent state for the process (probably memory corruption), and cuDF becomes unusable unless the process is restarted.
Steps/Code to reproduce bug
Python code examples:
One with *.
And here is one with ?
Expected behavior
At minimum, the crash should not impact future calls to cuDF (making future GPU calls unusable), but these regular expressions should also function in cuDF.
Environment overview (please complete the following information)
Environment details