[BUG] GeoDataframe has broken slicing. #676

Closed
thomcom opened this issue Sep 14, 2022 · 0 comments · Fixed by #680
Labels: bug, Python

thomcom commented Sep 14, 2022

Describe the bug
Calling gpu_dataframe.head() or indexing with .iloc raises an AttributeError whose traceback never enters cuspatial code:

In [25]: gpu_dataframe.head()
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In [25], line 1
----> 1 gpu_dataframe.head()

File ~/compose/etc/conda/cuda_11.6/envs/notebooks/lib/python3.8/contextlib.py:75, in ContextDecorator.__call__.<locals>.inner(*args, **kwds)
     72 @wraps(func)
     73 def inner(*args, **kwds):
     74     with self._recreate_cm():
---> 75         return func(*args, **kwds)

File ~/cudf/python/cudf/cudf/core/frame.py:2648, in Frame.head(self, n)
   2567 @_cudf_nvtx_annotate
   2568 def head(self, n=5):
   2569     """
   2570     Return the first `n` rows.
   2571     This function returns the first `n` rows for the object based
   (...)
   2646     1    1  11.0
   2647     """
-> 2648     return self.iloc[:n]

File ~/cudf/python/cudf/cudf/core/dataframe.py:144, in _DataFrameIndexer.__getitem__(self, arg)
    142 if not isinstance(arg, tuple):
    143     arg = (arg, slice(None))
--> 144 return self._getitem_tuple_arg(arg)

File ~/compose/etc/conda/cuda_11.6/envs/notebooks/lib/python3.8/contextlib.py:75, in ContextDecorator.__call__.<locals>.inner(*args, **kwds)
     72 @wraps(func)
     73 def inner(*args, **kwds):
     74     with self._recreate_cm():
---> 75         return func(*args, **kwds)

File ~/cudf/python/cudf/cudf/core/dataframe.py:443, in _DataFrameIlocIndexer._getitem_tuple_arg(self, arg)
    441 else:
    442     if isinstance(arg[0], slice):
--> 443         df = columns_df._slice(arg[0])
    444     elif is_scalar(arg[0]):
    445         index = arg[0]

File ~/compose/etc/conda/cuda_11.6/envs/notebooks/lib/python3.8/contextlib.py:75, in ContextDecorator.__call__.<locals>.inner(*args, **kwds)
     72 @wraps(func)
     73 def inner(*args, **kwds):
     74     with self._recreate_cm():
---> 75         return func(*args, **kwds)

File ~/cudf/python/cudf/cudf/core/dataframe.py:1405, in DataFrame._slice(self, arg)
   1394     return self._gather(
   1395         cudf.core.column.arange(
   1396             start, stop=stop, step=stride, dtype=np.int32
   1397         )
   1398     )
   1400 columns_to_slice = [
   1401     *(self._index._data.columns if not is_range_index else []),
   1402     *self._columns,
   1403 ]
   1404 result = self._from_columns_like_self(
-> 1405     libcudf.copying.columns_slice(columns_to_slice, [start, stop])[0],
   1406     self._column_names,
   1407     None if is_range_index else self._index.names,
   1408 )
   1410 if is_range_index:
   1411     result.index = self.index[start:stop]

File copying.pyx:365, in cudf._lib.copying.columns_slice()
File utils.pyx:328, in cudf._lib.utils.columns_from_table_view()
File column.pyx:503, in cudf._lib.column.Column.from_column_view()
File column.pyx:87, in cudf._lib.column.Column.base_size.__get__()
AttributeError: 'str' object has no attribute 'itemsize'
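
The traceback shows head() delegating to iloc[:n], which in turn ends in _slice(). A minimal pure-Python sketch of that call chain follows. ToyFrame and _IlocIndexer are hypothetical stand-ins written for illustration, not cudf's classes; the sketch only models row slices and is meant to show why a failure inside _slice() surfaces from both head() and .iloc.

```python
class _IlocIndexer:
    """Toy stand-in for an iloc indexer (hypothetical, not cudf's class)."""

    def __init__(self, frame):
        self._frame = frame

    def __getitem__(self, arg):
        # Mirror the traceback: a non-tuple argument becomes (rows, all columns).
        if not isinstance(arg, tuple):
            arg = (arg, slice(None))
        rows, _cols = arg
        if isinstance(rows, slice):
            return self._frame._slice(rows)
        raise NotImplementedError("only row slices are modelled here")


class ToyFrame:
    """Toy frame whose head() and iloc both end in _slice()."""

    def __init__(self, rows):
        self._rows = list(rows)
        self.calls = []  # records every slice passed to _slice()

    @property
    def iloc(self):
        return _IlocIndexer(self)

    def head(self, n=5):
        # Same shape as the Frame.head line in the traceback.
        return self.iloc[:n]

    def _slice(self, arg):
        self.calls.append(arg)
        return ToyFrame(self._rows[arg])


frame = ToyFrame(range(10))
first_five = frame.head()
first_three = frame.iloc[:3]
print(frame.calls)  # both accesses were routed through _slice()
```

Because every row-slice access funnels into one method in this model, a subclass with columns the base class cannot handle would need to intercept that single method.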

Steps/Code to reproduce bug
This should produce the same result for you:

import cuspatial
import geopandas
host_dataframe = geopandas.read_file(geopandas.datasets.get_path("naturalearth_lowres"))
gpu_dataframe = cuspatial.from_geopandas(host_dataframe)
gpu_dataframe.head()

Expected behavior
Print the first five rows of the dataframe.

Environment details:
rapids-compose

thomcom added the bug and Needs Triage labels on Sep 14, 2022
thomcom self-assigned this on Sep 14, 2022
rapids-bot closed this as completed in #680 on Sep 23, 2022
rapids-bot pushed a commit that referenced this issue on Sep 23, 2022:
This PR adds a `_slice` method, which is called when a `cudf.DataFrame` is accessed in a variety of ways. It also adds the `name` member to each `GeoSeries` pulled from the `GeoDataFrame`.

Fixes #676

Authors:
  - H. Thomson Comer (https://github.com/thomcom)

Approvers:
  - Michael Wang (https://github.com/isVoid)

URL: #680
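
The approach described in the commit message above, a subclass supplying its own `_slice` so that inherited accessors keep working, can be sketched roughly as follows. This is a toy model under stated assumptions, not the code merged in #680: ToyFrame, ToyGeoColumn, ToyGeoFrame and the take() helper are all hypothetical names, and plain Python lists stand in for GPU columns.

```python
class ToyFrame:
    """Toy base class: slices columns that are plain Python lists."""

    def __init__(self, columns):
        self._columns = dict(columns)

    def head(self, n=5):
        # Inherited by subclasses; always goes through _slice().
        return self._slice(slice(None, n))

    def _slice(self, arg):
        # Assumes every column supports col[slice].
        sliced = {name: col[arg] for name, col in self._columns.items()}
        return type(self)(sliced)


class ToyGeoColumn:
    """Hypothetical geometry column that does not support col[slice]."""

    def __init__(self, shapes, name=None):
        self.shapes = list(shapes)
        self.name = name

    def take(self, arg):
        return ToyGeoColumn(self.shapes[arg], name=self.name)


class ToyGeoFrame(ToyFrame):
    """Subclass that overrides _slice() to handle geometry columns itself."""

    def _slice(self, arg):
        sliced = {}
        for name, col in self._columns.items():
            if isinstance(col, ToyGeoColumn):
                sliced[name] = col.take(arg)  # geometry-aware path
            else:
                sliced[name] = col[arg]  # ordinary column path
        return ToyGeoFrame(sliced)

    def __getitem__(self, key):
        col = self._columns[key]
        if isinstance(col, ToyGeoColumn):
            # Attach the column label as the name of the returned object.
            return ToyGeoColumn(col.shapes, name=key)
        return col


data = {"pop": [1, 2, 3, 4, 5, 6], "geometry": ToyGeoColumn("abcdef")}
gdf = ToyGeoFrame(data)
top = gdf.head(2)
print(top["pop"], top["geometry"].shapes, top["geometry"].name)
```

In this model head() itself is never redefined in the subclass; overriding the one method that every slicing path reaches is enough, which is the design choice the commit message points at.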
harrism moved this to Done in cuSpatial on Sep 27, 2022
harrism added the Python label and removed the Needs Triage label on Sep 27, 2022