[BUG] regexp: cudaErrorIllegalAddress with replace using regex with possibly empty repetition and string anchor #11006

Closed
anthony-chang opened this issue May 30, 2022 · 2 comments
Assignees
Labels
bug Something isn't working

Comments

@anthony-chang
Contributor

anthony-chang commented May 30, 2022

Describe the bug
cuDF throws cudaErrorIllegalAddress: an illegal memory access was encountered when replacing with a regex in which a repetition that can match zero times is followed by the end-of-string anchor $. The error seems to happen only when the character preceding the repetition does not appear in the input.

Steps/Code to reproduce bug

>>> import cudf
>>> cudf.Series(['b']).str.replace('a?$', '#', regex=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/antchang/miniconda3/lib/python3.9/site-packages/cudf/core/column/string.py", line 931, in replace
    libstrings.replace_re(
  File "cudf/_lib/strings/replace_re.pyx", line 41, in cudf._lib.strings.replace_re.replace_re
RuntimeError: for_each: failed to synchronize: cudaErrorIllegalAddress: an illegal memory access was encountered

The error does not happen if the character before the repetition is in the input:

>>> cudf.Series(['a']).str.replace('a?$', '#', regex=True)
0    ##
dtype: object

The same error happens for any repetition that can match zero times:

>>> cudf.Series(['b']).str.replace('a*$', '#', regex=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/antchang/miniconda3/lib/python3.9/site-packages/cudf/core/column/string.py", line 931, in replace
    libstrings.replace_re(
  File "cudf/_lib/strings/replace_re.pyx", line 41, in cudf._lib.strings.replace_re.replace_re
RuntimeError: for_each: failed to synchronize: cudaErrorIllegalAddress: an illegal memory access was encountered
>>> cudf.Series(['b']).str.replace('a{0,5}$', '#', regex=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/antchang/miniconda3/lib/python3.9/site-packages/cudf/core/column/string.py", line 931, in replace
    libstrings.replace_re(
  File "cudf/_lib/strings/replace_re.pyx", line 41, in cudf._lib.strings.replace_re.replace_re
RuntimeError: for_each: failed to synchronize: cudaErrorIllegalAddress: an illegal memory access was encountered
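
To make "can match zero times" concrete, here is a minimal CPU-side sketch using Python's built-in re module rather than cuDF. The helper name can_match_empty is hypothetical, and the sketch only illustrates the regex semantics; it says nothing about where the GPU code goes wrong.

```python
import re

# Quantified sub-patterns from the examples above; each allows zero copies of "a".
ZERO_OK = ["a?", "a*", "a{0,5}"]


def can_match_empty(pattern: str) -> bool:
    """Return True if the pattern fully matches the empty string."""
    return re.fullmatch(pattern, "") is not None


for pat in ZERO_OK:
    print(pat, can_match_empty(pat))

# With the "$" anchor and input "b", the only match is zero-width,
# located at the end of the string.
match = re.search("a?$", "b")
print(match.span())  # (1, 1)
```

A pattern such as a+ requires at least one "a", so it has no zero-width match and is not one of the failing forms listed above.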

However, the same pattern works for searching:

>>> cudf.Series(['b']).str.contains('a?$', regex=True)
0    True
dtype: bool

Expected behavior
I expect the replace operation to complete with no error.
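
As a rough reference for what a successful result could look like, the sketch below runs the same four cases through Python's re.sub on the CPU (Python 3.7 or later). That cuDF should agree with re.sub for these patterns is an assumption here, and expected_replace is a hypothetical helper name, not part of cuDF.

```python
import re


def expected_replace(text: str, pattern: str, repl: str = "#") -> str:
    """CPU reference: replace every regex match in text using Python's re.sub."""
    return re.sub(pattern, repl, text)


# (input string, pattern) pairs taken from the reproduction steps above.
CASES = [
    ("b", "a?$"),
    ("a", "a?$"),
    ("b", "a*$"),
    ("b", "a{0,5}$"),
]

for text, pattern in CASES:
    print(repr(text), repr(pattern), "->", expected_replace(text, pattern))
```

For input "a" the reference yields two replacements: one for the matched "a" and one for the zero-width match at the end of the string, which is consistent with the ## output shown earlier for that case.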

Environment overview (please complete the following information)

  • Environment location: bare metal
  • Method of cuDF install: miniconda

Environment details

 ***git***
 Not inside a git repository

 ***OS Information***
 DISTRIB_ID=Ubuntu
 DISTRIB_RELEASE=18.04
 DISTRIB_CODENAME=bionic
 DISTRIB_DESCRIPTION="Ubuntu 18.04.5 LTS"
 NAME="Ubuntu"
 VERSION="18.04.5 LTS (Bionic Beaver)"
 ID=ubuntu
 ID_LIKE=debian
 PRETTY_NAME="Ubuntu 18.04.5 LTS"
 VERSION_ID="18.04"
 HOME_URL="https://www.ubuntu.com/"
 SUPPORT_URL="https://help.ubuntu.com/"
 BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
 PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
 VERSION_CODENAME=bionic
 UBUNTU_CODENAME=bionic
 Linux c240m5-01 5.4.0-109-generic #123~18.04.1-Ubuntu SMP Fri Apr 8 09:48:52 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

 ***GPU Information***
 Mon May 30 10:25:18 2022
 +-----------------------------------------------------------------------------+
 | NVIDIA-SMI 495.29.05    Driver Version: 495.29.05    CUDA Version: 11.5     |
 |-------------------------------+----------------------+----------------------+
 | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
 | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
 |                               |                      |               MIG M. |
 |===============================+======================+======================|
 |   0  Tesla T4            On   | 00000000:19:00.0 Off |                    0 |
 | N/A   52C    P0    39W /  70W |   1782MiB / 15109MiB |    100%      Default |
 |                               |                      |                  N/A |
 +-------------------------------+----------------------+----------------------+
 |   1  Tesla T4            On   | 00000000:5E:00.0 Off |                    0 |
 | N/A   53C    P0    39W /  70W |   9291MiB / 15109MiB |     48%      Default |
 |                               |                      |                  N/A |
 +-------------------------------+----------------------+----------------------+
 |   2  Tesla T4            On   | 00000000:86:00.0 Off |                    0 |
 | N/A   47C    P0    38W /  70W |   8097MiB / 15109MiB |     40%      Default |
 |                               |                      |                  N/A |
 +-------------------------------+----------------------+----------------------+
 |   3  Tesla T4            On   | 00000000:AF:00.0 Off |                    0 |
 | N/A   49C    P0    39W /  70W |   6551MiB / 15109MiB |     54%      Default |
 |                               |                      |                  N/A |
 +-------------------------------+----------------------+----------------------+

 +-----------------------------------------------------------------------------+
 | Processes:                                                                  |
 |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
 |        ID   ID                                                   Usage      |
 |=============================================================================|
 |    0   N/A  N/A      1924      G   /usr/lib/xorg/Xorg                  4MiB |
 |    0   N/A  N/A      3157      C   ./miniconda3/bin/python            97MiB |
 |    0   N/A  N/A     30057      C   ...penjdk-amd64/jre/bin/java      403MiB |
 |    0   N/A  N/A     43651      C   ...penjdk-amd64/jre/bin/java      467MiB |
 |    0   N/A  N/A     52020      C   ...penjdk-amd64/jre/bin/java      403MiB |
 |    0   N/A  N/A     61698      C   ...penjdk-amd64/jre/bin/java      403MiB |
 |    1   N/A  N/A      1924      G   /usr/lib/xorg/Xorg                  4MiB |
 |    1   N/A  N/A     65472      C   python                            667MiB |
 |    1   N/A  N/A     66647      C   python                            435MiB |
 |    1   N/A  N/A     69107      C   /opt/conda/bin/python             715MiB |
 |    1   N/A  N/A     70105      C   /opt/conda/bin/python            7467MiB |
 |    2   N/A  N/A      1924      G   /usr/lib/xorg/Xorg                  4MiB |
 |    2   N/A  N/A     65261      C   python                            971MiB |
 |    2   N/A  N/A     66647      C   python                            435MiB |
 |    2   N/A  N/A     68637      C   /usr/bin/python                  5539MiB |
 |    2   N/A  N/A     70105      C   /opt/conda/bin/python            1145MiB |
 |    3   N/A  N/A      1924      G   /usr/lib/xorg/Xorg                  4MiB |
 |    3   N/A  N/A     65261      C   python                            971MiB |
 |    3   N/A  N/A     65472      C   python                            667MiB |
 |    3   N/A  N/A     68637      C   /usr/bin/python                   989MiB |
 |    3   N/A  N/A     69107      C   /opt/conda/bin/python            3917MiB |
 +-----------------------------------------------------------------------------+

 ***CPU***
 Architecture:        x86_64
 CPU op-mode(s):      32-bit, 64-bit
 Byte Order:          Little Endian
 CPU(s):              72
 On-line CPU(s) list: 0-71
 Thread(s) per core:  2
 Core(s) per socket:  18
 Socket(s):           2
 NUMA node(s):        2
 Vendor ID:           GenuineIntel
 CPU family:          6
 Model:               85
 Model name:          Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz
 Stepping:            4
 CPU MHz:             3648.377
 CPU max MHz:         3700.0000
 CPU min MHz:         1200.0000
 BogoMIPS:            6000.00
 Virtualization:      VT-x
 L1d cache:           32K
 L1i cache:           32K
 L2 cache:            1024K
 L3 cache:            25344K
 NUMA node0 CPU(s):   0-17,36-53
 NUMA node1 CPU(s):   18-35,54-71
 Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti intel_ppin ssbd mba ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts hwp hwp_act_window hwp_epp hwp_pkg_req pku ospke md_clear flush_l1d

 ***CMake***

 ***g++***
 /usr/bin/g++
 g++ (Ubuntu 9.3.0-11ubuntu0~18.04.1) 9.3.0
 Copyright (C) 2019 Free Software Foundation, Inc.
 This is free software; see the source for copying conditions.  There is NO
 warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.


 ***nvcc***

 ***Python***
 /home/antchang/miniconda3/bin/python
 Python 3.9.12

 ***Environment Variables***
 PATH                            : /home/antchang/miniconda3/bin:/home/antchang/.poetry/bin:/home/antchang/miniconda3/condabin:/home/antchang/.pyenv/shims:/home/antchang/.pyenv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/home/antchang/spark/bin:/home/antchang/spark/sbin
 LD_LIBRARY_PATH                 :
 NUMBAPRO_NVVM                   :
 NUMBAPRO_LIBDEVICE              :
 CONDA_PREFIX                    : /home/antchang/miniconda3
 PYTHON_PATH                     :

 ***conda packages***
 /home/antchang/miniconda3/bin/conda
 # packages in environment at /home/antchang/miniconda3:
 #
 # Name                    Version                   Build  Channel
 _libgcc_mutex             0.1                 conda_forge    conda-forge
 _openmp_mutex             4.5                       2_gnu    conda-forge
 abseil-cpp                20211102.0           h27087fc_1    conda-forge
 arrow-cpp                 5.0.0           py39h1e51584_34_cuda    conda-forge
 arrow-cpp-proc            3.0.0                      cuda    conda-forge
 aws-c-cal                 0.5.11               h95a6274_0    conda-forge
 aws-c-common              0.6.2                h27cfd23_0
 aws-c-event-stream        0.2.7               h3541f99_13    conda-forge
 aws-c-io                  0.10.5               hfb6a706_0    conda-forge
 aws-checksums             0.1.11               ha31a3da_7    conda-forge
 aws-sdk-cpp               1.8.186              hb4091e7_3    conda-forge
 brotlipy                  0.7.0           py39hb9d737c_1004    conda-forge
 bzip2                     1.0.8                h7f98852_4    conda-forge
 c-ares                    1.18.1               h7f98852_0    conda-forge
 ca-certificates           2022.5.18.1          ha878542_0    conda-forge
 cachetools                5.0.0              pyhd8ed1ab_0    conda-forge
 certifi                   2022.5.18.1      py39hf3d152e_0    conda-forge
 cffi                      1.15.0           py39hd667e15_1
 charset-normalizer        2.0.12             pyhd8ed1ab_0    conda-forge
 colorama                  0.4.4              pyh9f0ad1d_0    conda-forge
 conda                     4.13.0           py39h06a4308_0
 conda-package-handling    1.8.1            py39hb9d737c_1    conda-forge
 cryptography              37.0.2           py39hd97740a_0    conda-forge
 cuda-python               11.6.1           py39he80948d_0    conda-forge
 cudatoolkit               11.6.0              hecad31d_10    conda-forge
 cudf                      22.02.00        cuda_11_py39_g774d859fef_0    rapidsai
 cupy                      10.4.0           py39hc3c280e_0    conda-forge
 dlpack                    0.5                  h9c3ff4c_0    conda-forge
 fastavro                  1.4.12           py39hb9d737c_0    conda-forge
 fastrlock                 0.8              py39h5a03fae_2    conda-forge
 fsspec                    2022.5.0           pyhd8ed1ab_0    conda-forge
 gflags                    2.2.2             he1b5a44_1004    conda-forge
 glog                      0.6.0                h6f12383_0    conda-forge
 grpc-cpp                  1.46.3               hc275302_0    conda-forge
 idna                      3.3                pyhd8ed1ab_0    conda-forge
 keyutils                  1.6.1                h166bdaf_0    conda-forge
 krb5                      1.19.3               h3790be6_0    conda-forge
 ld_impl_linux-64          2.36.1               hea4e1c9_2    conda-forge
 libblas                   3.9.0           14_linux64_openblas    conda-forge
 libbrotlicommon           1.0.9                h166bdaf_7    conda-forge
 libbrotlidec              1.0.9                h166bdaf_7    conda-forge
 libbrotlienc              1.0.9                h166bdaf_7    conda-forge
 libcblas                  3.9.0           14_linux64_openblas    conda-forge
 libcudf                   22.02.00        cuda11_g774d859fef_0    rapidsai
 libcurl                   7.83.1               h7bff187_0    conda-forge
 libedit                   3.1.20191231         he28a2e2_2    conda-forge
 libev                     4.33                 h516909a_1    conda-forge
 libevent                  2.1.10               h9b69904_4    conda-forge
 libffi                    3.3                  h58526e2_2    conda-forge
 libgcc-ng                 12.1.0              h8d9b700_16    conda-forge
 libgfortran-ng            12.1.0              h69a702a_16    conda-forge
 libgfortran5              12.1.0              hdcd56e2_16    conda-forge
 libgomp                   12.1.0              h8d9b700_16    conda-forge
 liblapack                 3.9.0           14_linux64_openblas    conda-forge
 libllvm11                 11.1.0               hf817b99_3    conda-forge
 libnghttp2                1.47.0               h727a467_0    conda-forge
 libopenblas               0.3.20          pthreads_h78a6416_0    conda-forge
 libprotobuf               3.20.1               h6239696_0    conda-forge
 librmm                    22.02.00             he96e62b_1    conda-forge
 libssh2                   1.10.0               ha56f1ee_2    conda-forge
 libstdcxx-ng              12.1.0              ha89aaad_16    conda-forge
 libthrift                 0.16.0               h519c5ea_1    conda-forge
 libutf8proc               2.7.0                h7f98852_0    conda-forge
 libzlib                   1.2.12               h166bdaf_0    conda-forge
 llvmlite                  0.38.1           py39h7d9a04d_0    conda-forge
 lz4-c                     1.9.3                h9c3ff4c_1    conda-forge
 ncurses                   6.3                  h27087fc_1    conda-forge
 numba                     0.55.1           py39h66db6d7_1    conda-forge
 numpy                     1.21.6           py39h18676bf_0    conda-forge
 nvtx                      0.2.3            py39h3811e60_1    conda-forge
 openssl                   1.1.1o               h166bdaf_0    conda-forge
 orc                       1.7.4                h6c59b99_1    conda-forge
 packaging                 21.3               pyhd8ed1ab_0    conda-forge
 pandas                    1.3.4            py39hde0f152_1    conda-forge
 parquet-cpp               1.5.1                         2    conda-forge
 pip                       22.1.1             pyhd8ed1ab_0    conda-forge
 protobuf                  3.20.1           py39h5a03fae_0    conda-forge
 ptxcompiler               0.3.0            py39h1689609_2    conda-forge
 pyarrow                   5.0.0           py39h1ed2e5d_34_cuda    conda-forge
 pycosat                   0.6.3           py39hb9d737c_1010    conda-forge
 pycparser                 2.21               pyhd8ed1ab_0    conda-forge
 pyopenssl                 22.0.0             pyhd8ed1ab_0    conda-forge
 pyparsing                 3.0.9              pyhd8ed1ab_0    conda-forge
 pysocks                   1.7.1            py39hf3d152e_5    conda-forge
 python                    3.9.12               h12debd9_0
 python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
 python_abi                3.9                      2_cp39    conda-forge
 pytz                      2022.1             pyhd8ed1ab_0    conda-forge
 re2                       2022.04.01           h27087fc_0    conda-forge
 readline                  8.1                  h46c0cb4_0    conda-forge
 requests                  2.27.1             pyhd8ed1ab_0    conda-forge
 rmm                       22.02.00         py39h8804ad5_0    conda-forge
 ruamel_yaml               0.15.80         py39hb9d737c_1007    conda-forge
 s2n                       1.0.10               h9b69904_0    conda-forge
 setuptools                59.8.0           py39hf3d152e_1    conda-forge
 six                       1.16.0             pyh6c4a22f_0    conda-forge
 snappy                    1.1.9                hbd366e4_1    conda-forge
 spdlog                    1.9.2                h4bd325d_1    conda-forge
 sqlite                    3.38.5               h4ff8645_0    conda-forge
 thrust                    1.16.0               h0800d71_1    conda-forge
 tk                        8.6.12               h27826a3_0    conda-forge
 tqdm                      4.64.0             pyhd8ed1ab_0    conda-forge
 typing_extensions         4.2.0              pyha770c72_1    conda-forge
 tzdata                    2022a                h191b570_0    conda-forge
 urllib3                   1.26.9             pyhd8ed1ab_0    conda-forge
 wheel                     0.37.1             pyhd8ed1ab_0    conda-forge
 xz                        5.2.5                h516909a_1    conda-forge
 yaml                      0.2.5                h7f98852_2    conda-forge
 zlib                      1.2.12               h166bdaf_0    conda-forge
 zstd                      1.5.2                h8a70e8d_1    conda-forge

Additional context
None

@anthony-chang added the "Needs Triage" (Need team to review and classify) and "bug" (Something isn't working) labels on May 30, 2022
@davidwendt self-assigned this on May 31, 2022
@davidwendt
Contributor

I believe this is now fixed by #10760, which is included in 22.06.

>>> cudf.Series(['b']).str.replace('a?$', '#', regex=True)
0    b#
dtype: object
>>> cudf.Series(['a']).str.replace('a?$', '#', regex=True)
0    ##
dtype: object
>>> cudf.Series(['b']).str.replace('a*$', '#', regex=True)
0    b#
dtype: object
>>> cudf.Series(['b']).str.replace('a{0,5}$', '#', regex=True)
0    b#
dtype: object
>>> cudf.Series(['b']).str.contains('a?$', regex=True)
0    True
dtype: bool

@anthony-chang
Contributor Author

You're right, sorry I didn't see this. I will mark this as resolved.

@bdice removed the "Needs Triage" (Need team to review and classify) label on Mar 4, 2024