[FEA] Groupby cumulative count #1296
Comments
Implementation is basically the same as groupby.cumsum. Will move to 0.8 along with #1298.
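To illustrate why the two features can share an implementation, here is a minimal pure-Python sketch: a cumulative count is a grouped cumulative sum over a column of ones, shifted to start at 0. The helper names `group_cumsum` and `group_cumcount_from_cumsum` are hypothetical, chosen for this example only; this is not the libcudf code.

```python
def group_cumsum(keys, values):
    """Running sum of `values` within each group of `keys`, in input order."""
    totals = {}  # running total per group key
    out = []
    for key, value in zip(keys, values):
        totals[key] = totals.get(key, 0) + value
        out.append(totals[key])
    return out


def group_cumcount_from_cumsum(keys):
    """Cumulative count expressed as a grouped cumsum of ones, minus one."""
    ones = [1] * len(keys)
    return [total - 1 for total in group_cumsum(keys, ones)]


keys = ["x", "y", "x", "x", "y"]
values = [10, 1, 20, 30, 2]

sums = group_cumsum(keys, values)           # [10, 1, 30, 60, 3]
counts = group_cumcount_from_cumsum(keys)   # [0, 0, 1, 2, 1]
```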
Is this the same function as the following code?
The full notebook link,
This operation requires the output to be in the same order as the input at the column level. Is that right?
It only requires the output order within groups to be stable. The ordering of the grouping keys is not required.
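A small sketch of what that stability requirement means in practice, assuming a sort-based groupby: rows are stably sorted by key, numbered within each run of equal keys, and the numbers are written back to the original row positions. The function name `cumcount_via_stable_sort` is hypothetical, and this is only an illustration, not the libcudf implementation.

```python
def cumcount_via_stable_sort(keys):
    """Number rows within each group using a stable sort on the keys."""
    # Python's sorted() is stable, so rows with equal keys keep their
    # original relative order. The order of the keys themselves does not
    # affect the result.
    order = sorted(range(len(keys)), key=lambda i: keys[i])

    result = [0] * len(keys)
    running = 0
    for rank, row in enumerate(order):
        # Reset the counter whenever a new group starts.
        if rank > 0 and keys[row] != keys[order[rank - 1]]:
            running = 0
        result[row] = running  # scatter back to the original position
        running += 1
    return result


keys = ["b", "a", "b", "a", "b"]
counts = cumcount_via_stable_sort(keys)  # [0, 0, 1, 1, 2]
```

If the sort were not stable, rows inside a group could swap places and receive each other's numbers, even though the set of numbers per group would be unchanged.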
Adds support for groupby scan operations. Addresses part of #1298 (cumsum) and #1296 (cumcount).
- sum
- min
- max
- count
Authors:
- Karthikeyan (@karthikeyann)
- Michael Wang (@isVoid)
Approvers:
- Vukasin Milovanovic (@vuule)
- Jake Hemstad (@jrhemstad)
- Nghia Truong (@ttnghia)
- David (@davidwendt)
URL: #7387
@karthikeyann is this implemented as part of #7387?
The libcudf part is implemented.
@karthikeyann unassigned you since I figured someone else would do the Cython / Python code, but if you're tackling it please reassign yourself.
I am already working on it.
closes #1296 Groupby cumulative count
closes #1298 Groupby cumulative sum
- [x] Add cython code for groupby scan (cannot mix reduce aggs and scan aggs)
- [x] Add python code for groupby scan functions
  - cumsum, cummin, cummax, cumcount, groupby.agg()
- [x] unit tests
Authors:
- Karthikeyan (https://github.com/karthikeyann)
- Vyas Ramasubramani (https://github.com/vyasr)
- GALI PREM SAGAR (https://github.com/galipremsagar)
Approvers:
- Keith Kraus (https://github.com/kkraus14)
- Vyas Ramasubramani (https://github.com/vyasr)
URL: #7759
Is your feature request related to a problem? Please describe.
As a cuDF user, I want to assign numbers to each observation of a group reflecting its order of occurrence in the group.
The equivalent in the pandas API doc is here.
Describe the solution you'd like
I'd like to be able to call
df.groupby(col).cumcount()
and have it return a column of the same size containing the numbering described above.
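As a reference for the requested behaviour, here is a pure-Python sketch of the semantics, assuming the numbering starts at 0 within each group as in pandas' `cumcount`. The table is modelled as a list of dicts, and the `cumcount` helper and the sample column names are hypothetical stand-ins for the example, not part of cuDF.

```python
from collections import defaultdict


def cumcount(rows, col):
    """Return, for each row, how many earlier rows share its value in `col`."""
    seen = defaultdict(int)  # rows seen so far per group key
    out = []
    for row in rows:
        key = row[col]
        out.append(seen[key])
        seen[key] += 1
    return out


rows = [
    {"fruit": "apple", "qty": 3},
    {"fruit": "pear", "qty": 1},
    {"fruit": "apple", "qty": 7},
    {"fruit": "apple", "qty": 2},
    {"fruit": "pear", "qty": 5},
]

numbering = cumcount(rows, "fruit")  # [0, 0, 1, 2, 1]
```

The result has one entry per input row, in input order, which matches the "column of the same size" asked for above.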