Add Dataframe and Index nunique #10077
Can one of the admins verify this patch?
Hi @martinfalisse, thanks for the contribution! Could you retarget this PR to 22.04? We are no longer accepting new changes on 22.02 since we're in the process of finalizing a release.
Sure, sorry about that, should be fixed now.
232c112 to 7291d3a
Co-authored-by: Ashwin Srinath <[email protected]>
This looks great! Thank you @martinfalisse
Thanks for the contribution @martinfalisse! May I ask how you decided to work on #9611 vs another issue?
Sure! I searched for the issues under the category 'good first issue', and then picked one that I felt I could implement. This specific issue seemed to only require putting some things together on the Python side, which is why I chose it. Hope to contribute more in the future!
This is almost there! Thanks for the work, the changes are looking good.
ok to test
@martinfalisse you're almost there! The code looks great now, but the style checks are failing. The easiest way for you to fix this is to use pre-commit. If you haven't used it before, you can As an aside, you can run |
fcb37ba to bcd84ad
LGTM! Thanks for the great work.
For future reference, please don't force push changes once reviews have started. It can erase enough history to make comments vanish and cause other problems. It's easier to just add commits, and we squash all commits when a PR is merged anyway so you don't need to worry about keeping the history clean.
Codecov Report
@@              Coverage Diff              @@
##   branch-22.04   #10077   +/-   ##
===============================================
  Coverage              ?   10.47%
===============================================
  Files                 ?      122
  Lines                 ?    20506
  Branches              ?        0
===============================================
  Hits                  ?     2148
  Misses                ?    18358
  Partials              ?        0
@gpucibot merge
Removes the unnecessary `nunique` function in Series, as indicated by @vyasr here #10077 (comment). I seem to have made a mistake when correcting the style of my last commit and did not keep the original changes of that commit.
Authors:
- https://github.com/martinfalisse
Approvers:
- Vyas Ramasubramani (https://github.com/vyasr)
URL: #10205
…0411)
The APIs `DataFrame.nunique`, `Series.nunique`, and `ColumnBase.distinct_count` have a `method` argument. The pandas API for `nunique` [does not have a `method` argument](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.nunique.html). This `method` argument was meant to distinguish distinct counts computed by sort/hash approaches, but there are two issues: only `"sort"` is supported as an input value, and the algorithm in libcudf actually uses a _hash-based approach_.
To resolve this inconsistency and align with the pandas API, I have removed the `method` parameter from `DataFrame.nunique`, `Series.nunique`, and `ColumnBase.distinct_count`.
This is a breaking change but I don't think a deprecation is needed. The `nunique` feature was added in 22.04 via #10077, so it has not yet been released. The internal changes to `distinct_count` can be made without a deprecation.
Authors:
- Bradley Dice (https://github.com/bdice)
Approvers:
- GALI PREM SAGAR (https://github.com/galipremsagar)
URL: #10411
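For readers unfamiliar with the sort/hash distinction mentioned above, here is a minimal plain-Python sketch of the two general strategies for counting distinct values. The function names `distinct_count_sort` and `distinct_count_hash` are hypothetical and chosen for illustration only; this is not the libcudf or cudf implementation. It only shows why both approaches yield the same count, which is why a `method` switch adds nothing to the result.

```python
def distinct_count_sort(values):
    """Count distinct values by sorting, then counting value changes."""
    ordered = sorted(values)
    if not ordered:
        return 0
    # Every position whose value differs from its predecessor starts a new group.
    return 1 + sum(1 for prev, cur in zip(ordered, ordered[1:]) if prev != cur)


def distinct_count_hash(values):
    """Count distinct values by inserting them into a hash set."""
    return len(set(values))


data = [3, 1, 3, 2, 1, 3]
print(distinct_count_sort(data), distinct_count_hash(data))
```

The two strategies differ only in cost and memory behavior, not in the answer they return.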
Add Dataframe and Index nunique. Resolves #9611
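As a rough illustration of what a column-wise `nunique` computes, the sketch below counts distinct values per column of a plain dict of lists. It assumes pandas-like semantics in which missing values are excluded by default (modeled here with `None` and a `dropna` flag); the helper `nunique_columns` is hypothetical and is not the code added by this PR.

```python
def nunique_columns(table, dropna=True):
    """Return {column name: number of distinct values} for a dict of lists."""
    result = {}
    for name, column in table.items():
        distinct = set(column)
        if dropna:
            # Assumption: None stands in for a missing value and is not counted.
            distinct.discard(None)
        result[name] = len(distinct)
    return result


table = {"a": [1, 1, 2, None], "b": ["x", "y", "y", "y"]}
print(nunique_columns(table))                # {'a': 2, 'b': 2}
print(nunique_columns(table, dropna=False))  # {'a': 3, 'b': 2}
```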