[FEA] Support unique expression (polars) #16169

Closed
beckernick opened this issue Jul 2, 2024 · 2 comments · Fixed by #16173
Labels
1 - On Deck (To be worked on next) · cudf.polars (Issues specific to cudf.polars) · feature request (New feature or request)

Comments

@beckernick
Member

The unique expression is useful for things like getting the number of unique {customers, accounts, devices, transactions, etc.} in a dataset.

```python
import polars as pl
from functools import partial
from cudf_polars.callback import execute_with_cudf

ldf = pl.DataFrame({"a": [0,0,1,2]}).lazy()

# Default CPU engine
print(ldf.select(pl.col("a").unique()).collect())
# Same query through the cudf_polars callback; raise_on_fail=True surfaces the GPU error
print(ldf.select(pl.col("a").unique()).collect(post_opt_callback=partial(execute_with_cudf, raise_on_fail=True)))
```
```
shape: (3, 1)
┌─────┐
│ a   │
│ --- │
│ i64 │
╞═════╡
│ 0   │
│ 1   │
│ 2   │
└─────┘
---------------------------------------------------------------------------
ComputeError                              Traceback (most recent call last)
Cell In[18], line 8
      5 ldf = pl.DataFrame({"a": [0,0,1,2]}).lazy()
      7 print(ldf.select(pl.col("a").unique()).collect())
----> 8 print(ldf.select(pl.col("a").unique()).collect(post_opt_callback=partial(execute_with_cudf, raise_on_fail=True)))

File /raid/nicholasb/miniconda3/envs/all_cuda-122_arch-x86_64/lib/python3.11/site-packages/polars/lazyframe/frame.py:1942, in LazyFrame.collect(self, type_coercion, predicate_pushdown, projection_pushdown, simplify_expression, slice_pushdown, comm_subplan_elim, comm_subexpr_elim, cluster_with_columns, no_optimization, streaming, background, _eager, **_kwargs)
   1939 # Only for testing purposes atm.
   1940 callback = _kwargs.get("post_opt_callback")
-> 1942 return wrap_df(ldf.collect(callback))

ComputeError: 'cuda' conversion failed: NotImplementedError: No handler for Expr function node with name='unique'
```
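
The error message says the GPU translation step found no handler for a function node named `unique`. As a rough mental model only (this is not the cudf_polars code), one can picture a registry that maps expression names to evaluation handlers: a missing entry raises `NotImplementedError`, and supporting a new expression amounts to registering a handler for it. Every name below (`HANDLERS`, `register`, `evaluate`) is hypothetical, and plain Python lists stand in for GPU columns.

```python
from typing import Callable

# Hypothetical registry: expression name -> handler. Not the cudf_polars API.
HANDLERS: dict[str, Callable[[list], list]] = {}


def register(name: str):
    """Decorator that records a handler under the given expression name."""
    def decorator(func):
        HANDLERS[name] = func
        return func
    return decorator


def evaluate(name: str, column: list) -> list:
    """Look up the handler for `name`; fail loudly if none is registered."""
    try:
        handler = HANDLERS[name]
    except KeyError:
        raise NotImplementedError(
            f"No handler for Expr function node with name={name!r}"
        ) from None
    return handler(column)


column = [0, 0, 1, 2]

# Before a handler exists, evaluation fails much like the traceback above.
try:
    evaluate("unique", column)
except NotImplementedError as err:
    message_before = str(err)


@register("unique")
def unique(column: list) -> list:
    """Distinct values in first-seen order (dicts preserve insertion order)."""
    return list(dict.fromkeys(column))


# With the handler registered, the same lookup succeeds.
result_after = evaluate("unique", column)
```

Keeping first-seen order is a choice made for this sketch; a real engine may return the distinct values in a different order unless asked to preserve it.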
@beckernick beckernick added the feature request (New feature or request) label Jul 2, 2024
@lithomas1
Contributor

lithomas1 commented Jul 2, 2024

@wence- has this implemented locally I think
wence-@acf0e2f

@lithomas1 lithomas1 added the cudf.polars (Issues specific to cudf.polars) label Jul 2, 2024
wence- added a commit to wence-/cudf that referenced this issue Jul 2, 2024
And add evaluation handlers.

- Closes rapidsai#16169
@GPUtester GPUtester moved this from Todo to In Progress in cuDF Python Jul 2, 2024
@wence-
Contributor

wence- commented Jul 2, 2024

> @wence- has this implemented locally I think wence-@acf0e2f

Yes, thanks for the reminder.

@lithomas1 lithomas1 added the 1 - On Deck (To be worked on next) label Jul 2, 2024
rapids-bot bot pushed a commit that referenced this issue Jul 5, 2024
@github-project-automation github-project-automation bot moved this from In Progress to Done in cuDF Python Jul 5, 2024
Projects
Status: Done
3 participants