[BUG] Unable to perform setitem operations on slices of struct and list columns #11721

Closed
galipremsagar opened this issue Sep 20, 2022 · 0 comments · Fixed by #11760

Describe the bug
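Assigning a `cudf.Scalar` to a slice of a Series of type struct or list raises a RuntimeError. The assignment is routed to libcudf's in-place fill, which does not support variable-sized types.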

Steps/Code to reproduce bug

In [49]: s = cudf.Series([[1, 2], [2, 3], [3, 4], [4, 5], [6, 7]])

In [50]: s
Out[50]: 
0    [1, 2]
1    [2, 3]
2    [3, 4]
3    [4, 5]
4    [6, 7]
dtype: list

In [51]: s[slice(0, 3, 1)]
Out[51]: 
0    [1, 2]
1    [2, 3]
2    [3, 4]
dtype: list

In [52]: s[slice(0, 3, 1)] = cudf.Scalar([10, 11])
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In [52], line 1
----> 1 s[slice(0, 3, 1)] = cudf.Scalar([10, 11])

File /nvme/0/pgali/envs/cudfdev/lib/python3.9/contextlib.py:79, in ContextDecorator.__call__.<locals>.inner(*args, **kwds)
     76 @wraps(func)
     77 def inner(*args, **kwds):
     78     with self._recreate_cm():
---> 79         return func(*args, **kwds)

File /nvme/0/pgali/envs/cudfdev/lib/python3.9/site-packages/cudf/core/series.py:1182, in Series.__setitem__(self, key, value)
   1179 @_cudf_nvtx_annotate
   1180 def __setitem__(self, key, value):
   1181     if isinstance(key, slice):
-> 1182         self.iloc[key] = value
   1183     else:
   1184         self.loc[key] = value

File /nvme/0/pgali/envs/cudfdev/lib/python3.9/contextlib.py:79, in ContextDecorator.__call__.<locals>.inner(*args, **kwds)
     76 @wraps(func)
     77 def inner(*args, **kwds):
     78     with self._recreate_cm():
---> 79         return func(*args, **kwds)

File /nvme/0/pgali/envs/cudfdev/lib/python3.9/site-packages/cudf/core/series.py:235, in _SeriesIlocIndexer.__setitem__(self, key, value)
    230         if self._frame._column.dtype != to_dtype:
    231             self._frame._column._mimic_inplace(
    232                 self._frame._column.astype(to_dtype), inplace=True
    233             )
--> 235 self._frame._column[key] = value

File /nvme/0/pgali/envs/cudfdev/lib/python3.9/site-packages/cudf/core/column/lists.py:99, in ListColumn.__setitem__(self, key, value)
     97 else:
     98     raise ValueError(f"Can not set {value} into ListColumn")
---> 99 super().__setitem__(key, value)

File /nvme/0/pgali/envs/cudfdev/lib/python3.9/site-packages/cudf/core/column/column.py:525, in ColumnBase.__setitem__(self, key, value)
    523 out: Optional[ColumnBase]  # If None, no need to perform mimic inplace.
    524 if isinstance(key, slice):
--> 525     out = self._scatter_by_slice(key, value_normalized)
    526 else:
    527     key = as_column(key)

File /nvme/0/pgali/envs/cudfdev/lib/python3.9/site-packages/cudf/core/column/column.py:557, in ColumnBase._scatter_by_slice(self, key, value)
    555 if step == 1:
    556     if isinstance(value, cudf.core.scalar.Scalar):
--> 557         return self._fill(value, start, stop, inplace=True)
    558     else:
    559         return libcudf.copying.copy_range(
    560             value, self, 0, num_keys, start, stop, False
    561         )

File /nvme/0/pgali/envs/cudfdev/lib/python3.9/site-packages/cudf/core/column/column.py:349, in ColumnBase._fill(self, fill_value, begin, end, inplace)
    346     mask = create_null_mask(self.size, state=MaskState.ALL_VALID)
    347     self.set_base_mask(mask)
--> 349 libcudf.filling.fill_in_place(self, begin, end, slr.device_value)
    351 return self

File filling.pyx:31, in cudf._lib.filling.fill_in_place()

RuntimeError: cuDF failure at: /nvme/0/pgali/cudf/cpp/src/filling/fill.cu:214: In-place fill does not support variable-sized types.

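The same failure occurs with a struct column. Assigning a dict to a single row works, but assigning a scalar to a slice does not: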

In [55]: s = cudf.Series([{'a':10, 'b':11}])

In [56]: s
Out[56]: 
0    {'a': 10, 'b': 11}
dtype: struct

In [57]: s[0] = {'a':12, 'b':5}

In [58]: s
Out[58]: 
0    {'a': 12, 'b': 5}
dtype: struct

In [59]: s
Out[59]: 
0    {'a': 12, 'b': 5}
dtype: struct

In [60]: s[slice(0, 1, 1)]
Out[60]: 
0    {'a': 12, 'b': 5}
dtype: struct

In [61]: s[slice(0, 1, 1)] = cudf.Scalar({'a':100, 'b':1})
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In [61], line 1
----> 1 s[slice(0, 1, 1)] = cudf.Scalar({'a':100, 'b':1})

File /nvme/0/pgali/envs/cudfdev/lib/python3.9/contextlib.py:79, in ContextDecorator.__call__.<locals>.inner(*args, **kwds)
     76 @wraps(func)
     77 def inner(*args, **kwds):
     78     with self._recreate_cm():
---> 79         return func(*args, **kwds)

File /nvme/0/pgali/envs/cudfdev/lib/python3.9/site-packages/cudf/core/series.py:1182, in Series.__setitem__(self, key, value)
   1179 @_cudf_nvtx_annotate
   1180 def __setitem__(self, key, value):
   1181     if isinstance(key, slice):
-> 1182         self.iloc[key] = value
   1183     else:
   1184         self.loc[key] = value

File /nvme/0/pgali/envs/cudfdev/lib/python3.9/contextlib.py:79, in ContextDecorator.__call__.<locals>.inner(*args, **kwds)
     76 @wraps(func)
     77 def inner(*args, **kwds):
     78     with self._recreate_cm():
---> 79         return func(*args, **kwds)

File /nvme/0/pgali/envs/cudfdev/lib/python3.9/site-packages/cudf/core/series.py:235, in _SeriesIlocIndexer.__setitem__(self, key, value)
    230         if self._frame._column.dtype != to_dtype:
    231             self._frame._column._mimic_inplace(
    232                 self._frame._column.astype(to_dtype), inplace=True
    233             )
--> 235 self._frame._column[key] = value

File /nvme/0/pgali/envs/cudfdev/lib/python3.9/site-packages/cudf/core/column/struct.py:82, in StructColumn.__setitem__(self, key, value)
     79         value[field] = value.get(field, NA)
     81     value = cudf.Scalar(value, self.dtype)
---> 82 super().__setitem__(key, value)

File /nvme/0/pgali/envs/cudfdev/lib/python3.9/site-packages/cudf/core/column/column.py:525, in ColumnBase.__setitem__(self, key, value)
    523 out: Optional[ColumnBase]  # If None, no need to perform mimic inplace.
    524 if isinstance(key, slice):
--> 525     out = self._scatter_by_slice(key, value_normalized)
    526 else:
    527     key = as_column(key)

File /nvme/0/pgali/envs/cudfdev/lib/python3.9/site-packages/cudf/core/column/column.py:557, in ColumnBase._scatter_by_slice(self, key, value)
    555 if step == 1:
    556     if isinstance(value, cudf.core.scalar.Scalar):
--> 557         return self._fill(value, start, stop, inplace=True)
    558     else:
    559         return libcudf.copying.copy_range(
    560             value, self, 0, num_keys, start, stop, False
    561         )

File /nvme/0/pgali/envs/cudfdev/lib/python3.9/site-packages/cudf/core/column/column.py:349, in ColumnBase._fill(self, fill_value, begin, end, inplace)
    346     mask = create_null_mask(self.size, state=MaskState.ALL_VALID)
    347     self.set_base_mask(mask)
--> 349 libcudf.filling.fill_in_place(self, begin, end, slr.device_value)
    351 return self

File filling.pyx:31, in cudf._lib.filling.fill_in_place()

RuntimeError: cuDF failure at: /nvme/0/pgali/cudf/cpp/src/filling/fill.cu:214: In-place fill does not support variable-sized types.
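
For context, here is a minimal sketch of why an in-place fill is awkward for list data. It assumes the Arrow-style layout for list columns (an offsets buffer plus a flat child buffer). `to_list_layout` is a hypothetical helper written for this illustration and is not part of cuDF, and the three-element fill value is chosen only to make the size change visible.

```python
# Illustrative only: plain Python stand-in for an Arrow-style list column.
# Nothing here calls into cuDF or libcudf.

def to_list_layout(rows):
    """Flatten rows into (offsets, values): row i is values[offsets[i]:offsets[i + 1]]."""
    offsets, values = [0], []
    for row in rows:
        values.extend(row)
        offsets.append(len(values))
    return offsets, values


rows = [[1, 2], [2, 3], [3, 4], [4, 5], [6, 7]]
offsets, values = to_list_layout(rows)

# Fill rows 0..2 with a list of a different length (hypothetical fill value).
filled = [[10, 11, 12] for _ in range(3)] + rows[3:]
new_offsets, new_values = to_list_layout(filled)

print(offsets, len(values))          # [0, 2, 4, 6, 8, 10] 10
print(new_offsets, len(new_values))  # [0, 3, 6, 9, 11, 13] 13
```

Both the offsets and the size of the child buffer change, so the existing buffers cannot simply be overwritten in the general case.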

Expected behavior
Slice assignment should succeed for struct and list columns. For these two types specifically, cuDF should fill the values without going through libcudf's in-place fill methods.
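
One way to picture the expected behavior is a dispatch that keeps the in-place path for fixed-width types and builds a new column otherwise. This is a sketch under assumptions, not cuDF's implementation: plain Python lists stand in for columns, and `fill_slice` and `FIXED_WIDTH` are hypothetical names.

```python
import copy

# Hypothetical dispatch sketch; plain Python lists stand in for columns.
FIXED_WIDTH = {"int64", "float64", "bool"}  # assumed set, for illustration only


def fill_slice(data, dtype, start, stop, value):
    """Return a column with data[start:stop] set to value.

    Assumes 0 <= start <= stop <= len(data).
    """
    if dtype in FIXED_WIDTH:
        # Fixed-width rows can be overwritten in place.
        for i in range(start, stop):
            data[i] = value
        return data
    # Nested types (list, struct): build a new column instead of filling in place.
    fill = [copy.deepcopy(value) for _ in range(stop - start)]
    return data[:start] + fill + data[stop:]


lists = [[1, 2], [2, 3], [3, 4], [4, 5], [6, 7]]
print(fill_slice(lists, "list", 0, 3, [10, 11]))

structs = [{"a": 12, "b": 5}]
print(fill_slice(structs, "struct", 0, 1, {"a": 100, "b": 1}))
```

The caller would then swap the returned column in for the original, which is the out-of-place counterpart of what the in-place fill does for fixed-width data.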

Environment overview (please complete the following information)

  • Environment location: [Bare-metal]
  • Method of cuDF install: [from source]

Additional context
Required for #11718

galipremsagar added the bug (Something isn't working) and Python (Affects Python cuDF API) labels on Sep 20, 2022
galipremsagar self-assigned this on Sep 20, 2022
rapids-bot pushed a commit that referenced this issue on Sep 26, 2022
Fixes: #11721 
This PR:
- [x] Fixes: #11721, by not going through the fill & fill_inplace APIs, which don't support `struct` and `list` columns.
- [x] Fixes a caching issue when constructing a `struct` or `list` scalar: `list` and `dict` objects are not hashable, so using them in the cache key raised the following errors:
```python
In [9]: i = cudf.Scalar([10, 11])
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File /nvme/0/pgali/envs/cudfdev/lib/python3.9/site-packages/cudf/core/scalar.py:51, in CachedScalarInstanceMeta.__call__(self, value, dtype)
     49 try:
     50     # try retrieving an instance from the cache:
---> 51     self.__instances.move_to_end(cache_key)
     52     return self.__instances[cache_key]

KeyError: ([10, 11], <class 'list'>, None, <class 'NoneType'>)

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
Cell In [9], line 1
----> 1 i = cudf.Scalar([10, 11])

File /nvme/0/pgali/envs/cudfdev/lib/python3.9/site-packages/cudf/core/scalar.py:57, in CachedScalarInstanceMeta.__call__(self, value, dtype)
     53 except KeyError:
     54     # if an instance couldn't be found in the cache,
     55     # construct it and add to cache:
     56     obj = super().__call__(value, dtype=dtype)
---> 57     self.__instances[cache_key] = obj
     58     if len(self.__instances) > self.__maxsize:
     59         self.__instances.popitem(last=False)

TypeError: unhashable type: 'list'
```
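
A minimal sketch of one way to make such an instance cache tolerate unhashable values is to bypass the cache when the key cannot be hashed. This is an illustration and not necessarily how #11760 resolves it. `CachedInstanceMeta` and `DemoScalar` are hypothetical names, with the cache key shape copied from the traceback above.

```python
from collections import OrderedDict


class CachedInstanceMeta(type):
    """LRU instance cache that skips caching for unhashable values (sketch)."""

    _maxsize = 128  # assumed cache size, for illustration

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        cls._instances = OrderedDict()

    def __call__(cls, value, dtype=None):
        cache_key = (value, type(value), dtype, type(dtype))
        try:
            hash(cache_key)
        except TypeError:
            # list/dict values are unhashable: construct the object uncached.
            return super().__call__(value, dtype=dtype)
        try:
            # Try retrieving an instance from the cache.
            cls._instances.move_to_end(cache_key)
            return cls._instances[cache_key]
        except KeyError:
            # Not cached yet: construct it, store it, evict the oldest if full.
            obj = super().__call__(value, dtype=dtype)
            cls._instances[cache_key] = obj
            if len(cls._instances) > cls._maxsize:
                cls._instances.popitem(last=False)
            return obj


class DemoScalar(metaclass=CachedInstanceMeta):
    """Stand-in for a scalar type; not cudf.Scalar."""

    def __init__(self, value, dtype=None):
        self.value = value
        self.dtype = dtype


a, b = DemoScalar(1), DemoScalar(1)
c, d = DemoScalar([10, 11]), DemoScalar([10, 11])
print(a is b)  # True: hashable values are served from the cache
print(c is d)  # False: list values bypass the cache
```

Checking hashability up front matters because `OrderedDict.move_to_end` on an empty cache raises `KeyError` even for an unhashable key, as the traceback shows, so the `TypeError` only surfaces later at insertion.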

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Matthew Roeschke (https://github.com/mroeschke)

URL: #11760