Raise NotImplementedError in pivot for cuDF if pivot_table is called with observed=True, backend is cuDF, and there are any categoricals #1400

Open
MarcoGorelli opened this issue Nov 18, 2024 · 1 comment
@MarcoGorelli
Member

cuDF tests for pivot are failing: https://www.kaggle.com/code/marcogorelli/testing-cudf-in-narwhals?scriptVersionId=208170308

I think the simplest fix would be, in

result = frame.pivot_table(
    values=values_,
    index=index,
    columns=on,
    aggfunc=aggregate_function,
    margins=False,
    observed=True,
)

to do something like

if self._implementation is Implementation.CUDF and any(x == self._dtypes.Categorical for x in self.schema.values()):
    msg = "`pivot` with Categoricals is not implemented for cuDF backend"
    raise NotImplementedError(msg)
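
To make the proposed guard easier to try out in isolation, here is a minimal sketch that runs without cuDF installed. The `Implementation` enum, the `Categorical` and `Int64` classes, and the `check_pivot_supported` helper are hypothetical stand-ins written for this example, not the actual narwhals internals; only the condition and the error message come from the snippet above.

```python
from enum import Enum, auto


class Implementation(Enum):
    # Hypothetical stand-in for the backend enum referenced in the issue.
    PANDAS = auto()
    CUDF = auto()


class Categorical:
    """Hypothetical stand-in for the Categorical dtype class."""


class Int64:
    """Hypothetical stand-in for a non-categorical dtype class."""


def check_pivot_supported(implementation, schema):
    """Raise if a cuDF-backed frame with any Categorical column is pivoted.

    `schema` maps column names to dtype classes, mirroring `self.schema`
    in the snippet above (an assumption made for this sketch).
    """
    has_categorical = any(dtype == Categorical for dtype in schema.values())
    if implementation is Implementation.CUDF and has_categorical:
        msg = "`pivot` with Categoricals is not implemented for cuDF backend"
        raise NotImplementedError(msg)


# Other backends, and cuDF frames without categoricals, pass through silently.
check_pivot_supported(Implementation.PANDAS, {"a": Categorical})
check_pivot_supported(Implementation.CUDF, {"a": Int64})
```

In the real method the check would presumably sit just before the `pivot_table` call, so that cuDF users get a clear error instead of a backend failure.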
MarcoGorelli added the "good first issue" and "tests" labels, then removed the "good first issue" label, on Nov 18, 2024
@raisadz
Contributor

raisadz commented Nov 19, 2024

I started looking into this issue, but found that cuDF currently doesn't accept lists for the columns and index arguments of cudf.DataFrame.pivot.

I opened an issue about it in their repo: rapidsai/cudf#17360.
