[FEA] DataFrame and Index nunique #9611

Closed
beckernick opened this issue Nov 4, 2021 · 0 comments · Fixed by #10077
Labels: feature request (New feature or request), good first issue (Good for newcomers), Python (Affects Python cuDF API)

In pandas, DataFrame.nunique returns a Series in which each index label is a column name from the original dataframe and each value is the number of distinct elements in that column. The Index method behaves like the Series method and returns a single count.

We already have the machinery to run this with {DataFrame, SingleColumnFrame}._reduce, but it would be nice to expose this as a top-level API on DataFrame (and Index). The Series and GroupBy APIs are already implemented.

import pandas as pd

df = pd.DataFrame({
    "key": [0, 1, 1, 0, 0, 1],
    "val": [1, 8, 3, 9, -3, 8],
})
print(df, "\n")
print(df.nunique())
   key  val
0    0    1
1    1    8
2    1    3
3    0    9
4    0   -3
5    1    8 

key    2
val    5
dtype: int64
import cudf

df = cudf.DataFrame({
    "key": [0, 1, 1, 0, 0, 1],
    "val": [1, 8, 3, 9, -3, 8],
})
print(df, "\n")
print(df.index._reduce("distinct_count"))
print(df.key._reduce("distinct_count"))
   key  val
0    0    1
1    1    8
2    1    3
3    0    9
4    0   -3
5    1    8 

6
2
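
To make the requested shape concrete, here is a minimal CPU-only sketch of a frame-level nunique built from one per-column distinct-count reduction, plus the index counterpart. Everything in it is hypothetical and for illustration only: `ToyFrame`, `distinct_count`, and `index_nunique` are stand-ins and not cuDF or pandas code, and unlike pandas the sketch does no null handling.

```python
from typing import Dict, Hashable, List


def distinct_count(values: List[Hashable]) -> int:
    """Hypothetical stand-in for a per-column "distinct_count" reduction."""
    return len(set(values))


class ToyFrame:
    """Hypothetical CPU-only stand-in for a DataFrame; not cuDF code."""

    def __init__(self, data: Dict[str, List[Hashable]]):
        self._data = data
        n_rows = len(next(iter(data.values()))) if data else 0
        # Default integer labels 0..n_rows-1, like a default index.
        self.index = list(range(n_rows))

    def nunique(self) -> Dict[str, int]:
        # One reduction per column, keyed by column name.
        return {name: distinct_count(col) for name, col in self._data.items()}

    def index_nunique(self) -> int:
        # The index is a single column, so the result is a scalar.
        return distinct_count(self.index)


df = ToyFrame({
    "key": [0, 1, 1, 0, 0, 1],
    "val": [1, 8, 3, 9, -3, 8],
})
print(df.nunique())        # {'key': 2, 'val': 5}
print(df.index_nunique())  # 6
```

The per-column counts match the pandas output above (key 2, val 5), and the index count matches the 6 returned by the `_reduce` call on the index.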
rapids-bot bot pushed a commit that referenced this issue Feb 2, 2022
Add Dataframe and Index nunique. Resolves #9611

Authors:
  - https://github.com/martinfalisse
  - Ashwin Srinath (https://github.com/shwina)

Approvers:
  - Ashwin Srinath (https://github.com/shwina)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #10077