cudf.DataFrame.pivot
Describe the bug
cudf.DataFrame.pivot does not accept lists for the columns and index arguments. In addition, the error message it raises is misleading.
Steps/Code to reproduce bug
import cudf

data = {'bar': ['x', 'y', 'z', 'w'],
        'col': ['a', 'b', 'a', 'b'],
        'foo': [1, 2, 3, 4],
        'ix': [1, 1, 2, 2]}
df = cudf.DataFrame(data)
df.pivot(columns=['col'], index=['ix'])
This produces the following error:
ValueError                                Traceback (most recent call last)
Cell In[19], line 1
----> 1 df.pivot(columns=['col'], index=['ix'])

File /opt/conda/lib/python3.10/site-packages/cudf/utils/performance_tracking.py:51, in _performance_tracking.<locals>.wrapper(*args, **kwargs)
     43 if nvtx.enabled():
     44     stack.enter_context(
     45         nvtx.annotate(
     46             message=func.__qualname__,
   (...)
     49         )
     50     )
---> 51 return func(*args, **kwargs)

File /opt/conda/lib/python3.10/site-packages/cudf/core/dataframe.py:7475, in DataFrame.pivot(self, columns, index, values)
   7472 @_performance_tracking
   7473 @copy_docstring(reshape.pivot)
   7474 def pivot(self, *, columns, index=no_default, values=no_default):
-> 7475     return cudf.core.reshape.pivot(
   7476         self, index=index, columns=columns, values=values
   7477     )

File /opt/conda/lib/python3.10/site-packages/cudf/core/reshape.py:1036, in pivot(data, columns, index, values)
   1034     index = df.index
   1035 else:
-> 1036     index = cudf.core.index.Index(df.loc[:, index])
   1037 columns = cudf.Index(df.loc[:, columns])
   1039 # Create a DataFrame composed of columns from both
   1040 # columns and index

File /opt/conda/lib/python3.10/site-packages/cudf/core/index.py:87, in IndexMeta.__call__(cls, data, *args, **kwargs)
     82     raise NotImplementedError(
     83         "tupleize_cols is currently not supported."
     84     )
     86 if cls is Index:
---> 87     return as_index(
     88         arbitrary=data,
     89         *args,
     90         **kwargs,
     91     )
     92 return super().__call__(data, *args, **kwargs)

File /opt/conda/lib/python3.10/site-packages/cudf/utils/performance_tracking.py:51, in _performance_tracking.<locals>.wrapper(*args, **kwargs)
     43 if nvtx.enabled():
     44     stack.enter_context(
     45         nvtx.annotate(
     46             message=func.__qualname__,
   (...)
     49         )
     50     )
---> 51 return func(*args, **kwargs)

File /opt/conda/lib/python3.10/site-packages/cudf/core/index.py:3114, in as_index(arbitrary, nan_as_null, copy, name, dtype)
   3110     return cudf.MultiIndex.from_pandas(
   3111         arbitrary.copy(deep=copy), nan_as_null=nan_as_null
   3112     )
   3113 elif isinstance(arbitrary, cudf.DataFrame) or is_scalar(arbitrary):
-> 3114     raise ValueError("Index data must be 1-dimensional and list-like")
   3115 else:
   3116     return as_index(
   3117         column.as_column(arbitrary, dtype=dtype, nan_as_null=nan_as_null),
   3118         copy=copy,
   3119         name=name,
   3120         dtype=dtype,
   3121     )

ValueError: Index data must be 1-dimensional and list-like
Expected behavior
The output should match what pandas produces:
print(df.to_pandas().pivot(columns=['col'], index=['ix']))

    bar    foo
col   a  b   a  b
ix
1     x  y   1  2
2     z  w   3  4
Environment overview (please complete the following information)
Environment details
print(cudf.__version__)
24.08.03
Thanks for the report!
I opened #17373 to fix this; hopefully it will be included in the 24.12 release.
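Until a release with that fix is available, one possible workaround is to pass plain column names instead of single-element lists. The sketch below is an assumption rather than something confirmed in this thread: `unwrap_single` is a hypothetical helper (not part of cudf), and it relies on `DataFrame.pivot` accepting scalar `columns` and `index` arguments on 24.08. It does not help when a list names more than one column.

```python
def unwrap_single(arg):
    """Return the sole element of a one-item list or tuple; otherwise return arg unchanged."""
    if isinstance(arg, (list, tuple)) and len(arg) == 1:
        return arg[0]
    return arg


# Hypothetical helper: turn the list arguments from the report into scalars.
pivot_kwargs = {
    "columns": unwrap_single(["col"]),
    "index": unwrap_single(["ix"]),
}
print(pivot_kwargs)  # {'columns': 'col', 'index': 'ix'}

# With cudf installed, the call from the report would then become
# (assumption: scalar arguments are accepted on the affected version):
# df.pivot(**pivot_kwargs)
```

Multi-element lists pass through unchanged, so this only sidesteps the single-column case shown in the reproducer.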
332cc06
mroeschke