[BUG] Python find_multiple still expects an nvstrings object as a parameter #4569

Closed
beckernick opened this issue Mar 18, 2020 · 3 comments
Labels
0 - Backlog In queue waiting for assignment bug Something isn't working Python Affects Python cuDF API. strings strings issues (C++ and Python)

Comments

@beckernick
Member

beckernick commented Mar 18, 2020

Series.str.find_multiple can accept a device or host object for the strs parameter. If a device object is passed, it must be an nvstrings object. Since we're deprecating nvstrings, this should likely accept a Series instead (or perhaps something else but similar).

import nvstrings

s = nvstrings.to_device(['the', 'word'])
targets = nvstrings.to_device(['e','r'])
s.find_multiple(['e','r'])
s.find_multiple(targets) # also works
import cudf
import nvstrings

s = cudf.Series(['the', 'word'])
targets = cudf.Series(['e','r'])
s.str.find_multiple(['e','r'])
[[2, -1], [-1, 2]]
import cudf
s = cudf.Series(['the', 'word'])
targets = cudf.Series(['e','r'])
s.str.find_multiple(targets)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
ValueError: nvstrings.find_multiple invalid strs parameter

The above exception was the direct cause of the following exception:

SystemError                               Traceback (most recent call last)
<ipython-input-72-184c8bd8ac85> in <module>
      2 s = cudf.Series(['the', 'word'])
      3 targets = cudf.Series(['e','r'])
----> 4 s.str.find_multiple(targets)

/raid/nicholasb/miniconda3/envs/rapids-20200318-cuda101-0905/lib/python3.7/site-packages/cudf/core/column/string.py in wrapper(*args, **kwargs)
    172                 @functools.wraps(passed_attr)
    173                 def wrapper(*args, **kwargs):
--> 174                     ret = passed_attr(*args, **kwargs)
    175                     if isinstance(ret, nvstrings.nvstrings):
    176                         ret = Series(

/raid/nicholasb/miniconda3/envs/rapids-20200318-cuda101-0905/lib/python3.7/site-packages/nvstrings.py in find_multiple(self, strs, devptr)
   2575 
   2576         """
-> 2577         rtn = pyniNVStrings.n_find_multiple(self.m_cptr, strs, devptr)
   2578         return rtn
   2579 

SystemError: <built-in function n_find_multiple> returned a result with an error set
import cudf
s = cudf.Series(['the', 'word'])
targets = cudf.Series(['e','r'])
s.str.find_multiple(targets._column.nvstrings)
[[2, -1], [-1, 2]]
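
For reference, the expected output above appears to match plain `str.find` semantics applied per string and per target: one row per input string, one column per target, and -1 where the target is absent. The following is a minimal pure-Python sketch of that behaviour for comparison only. The helper name `find_multiple_reference` is hypothetical and is not part of cudf or nvstrings, and null handling is not modelled.

```python
from typing import List, Sequence


def find_multiple_reference(
    strings: Sequence[str], targets: Sequence[str]
) -> List[List[int]]:
    """Hypothetical host-side reference, not a cudf or nvstrings API.

    For every input string, return the position of the first occurrence
    of each target, or -1 if the target does not occur in that string.
    """
    return [[s.find(t) for t in targets] for s in strings]


# Same inputs as the examples in this issue.
print(find_multiple_reference(['the', 'word'], ['e', 'r']))
# prints [[2, -1], [-1, 2]]
```

A host-side check like this can be handy when comparing the list, nvstrings, and Series code paths, since all three should produce the same nested result.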

From 03-18-2020 conda nightly installed at 9:04 AM EST.

cudf                      0.13.0a200318         py37_4269    rapidsai-nightly
cugraph                   0.13.0a200318          py37_386    rapidsai-nightly
cuml                      0.13.0a200318   cuda10.1_py37_1494    rapidsai-nightly
cuspatial                 0.13.0a200207            py37_7    rapidsai-nightly
dask-cuda                 0.13.0b200318           py37_69    rapidsai-nightly
dask-cudf                 0.13.0a200318         py37_4269    rapidsai-nightly
dask-xgboost              0.2.0.dev28      cuda10.1py36_0    rapidsai-nightly
libcudf                   0.13.0a200318     cuda10.1_4269    rapidsai-nightly
libcugraph                0.13.0a200318      cuda10.1_386    rapidsai-nightly
libcuml                   0.13.0a200318     cuda10.1_1494    rapidsai-nightly
libcumlprims              0.13.0a200313       cuda10.1_11    rapidsai-nightly
libcuspatial              0.13.0a200316       cuda10.1_19    rapidsai-nightly
libnvstrings              0.13.0a200318     cuda10.1_4269    rapidsai-nightly
librmm                    0.13.0a200318      cuda10.1_567    rapidsai-nightly
libxgboost                1.0.2dev.rapidsai0.13      cuda10.1_5    rapidsai-nightly
nvstrings                 0.13.0a200318         py37_4269    rapidsai-nightly
py-xgboost                1.0.2dev.rapidsai0.13  cuda10.1py37_5    rapidsai-nightly
rapids                    0.13.0          cuda10.1_py37_116    rapidsai-nightly
rapids-xgboost            0.13.0          cuda10.1_py37_116    rapidsai-nightly
rmm                       0.13.0a200318          py37_567    rapidsai-nightly
ucx                       1.7.0+g9d06c3a       cuda10.1_0    rapidsai-nightly
ucx-py                    0.13.0a200318+g9d06c3a         py37_76    rapidsai-nightly
xgboost                   1.0.2dev.rapidsai0.13  cuda10.1py37_5    rapidsai-nightly
@beckernick beckernick added bug Something isn't working libcudf Affects libcudf (C++/CUDA) code. Python Affects Python cuDF API. strings strings issues (C++ and Python) labels Mar 18, 2020
@beckernick beckernick removed the libcudf Affects libcudf (C++/CUDA) code. label Mar 18, 2020
@beckernick
Member Author

beckernick commented Mar 18, 2020

Actually, it looks like the Python call just isn't hitting the new API. Updating the title.

@beckernick beckernick changed the title [BUG] libcudf find_multiple still expects an nvstrings object as a parameter [BUG] Python find_multiple still expects an nvstrings object as a parameter Mar 18, 2020
@galipremsagar galipremsagar self-assigned this Mar 18, 2020
@galipremsagar galipremsagar added the 0 - Backlog In queue waiting for assignment label Mar 18, 2020
@galipremsagar
Contributor

After an internal discussion with @beckernick and @kkraus14, we decided to go ahead with using the Cython API directly and reshaping the flattened result. We will probably plumb this API through properly once list column support lands in a future release.
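
As a rough illustration of the reshaping step, here is a minimal sketch. It assumes the flattened result is laid out row-major (all target positions for the first string, then all for the second, and so on); that layout is an assumption, not something confirmed in this thread. The helper name `reshape_flat_positions` is hypothetical and is not the actual cudf implementation.

```python
from typing import List, Sequence


def reshape_flat_positions(
    flat: Sequence[int], num_targets: int
) -> List[List[int]]:
    """Hypothetical helper: split a flat position buffer into rows.

    Assumes a row-major layout with `num_targets` entries per input string.
    """
    if num_targets <= 0:
        raise ValueError("num_targets must be positive")
    if len(flat) % num_targets != 0:
        raise ValueError("flat length is not a multiple of num_targets")
    # Take consecutive slices of num_targets entries, one slice per string.
    return [
        list(flat[i:i + num_targets])
        for i in range(0, len(flat), num_targets)
    ]


# Flattened form of the expected result for ['the', 'word'] vs ['e', 'r'].
print(reshape_flat_positions([2, -1, -1, 2], 2))
# prints [[2, -1], [-1, 2]]
```

If the flat buffer were column-major instead, the slices would need to be strided rather than consecutive, so the layout is worth confirming before relying on a reshape like this.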

@kkraus14
Collaborator

This is no longer an issue, as nvstrings has been removed.
