[FEA] Groupby pct_change #9606

Closed
beckernick opened this issue Nov 4, 2021 · 4 comments · Fixed by #11144
Labels: feature request (New feature or request), Python (Affects Python cuDF API)

Comments

@beckernick
Member

For time series analysis and pandas API compatibility, it would be useful to implement Groupby.pct_change. It may be feasible to implement Groupby.pct_change reasonably efficiently in the Python layer by using existing groupby functionality (diff and shift). As a starting point, it would make sense to limit the supported parameters/configuration to those currently supported by Series.pct_change.

The pandas documentation describes the method as: "Calculate pct_change of each value to previous entry in group." For example:

import pandas as pd

df = pd.DataFrame({
    "key": [0, 1, 1, 0, 0, 1],
    "val": [1, 8, 3, 9, -3, 8],
})
print(df, "\n")
print(df.groupby("key").pct_change())
   key  val
0    0    1
1    1    8
2    1    3
3    0    9
4    0   -3
5    1    8 

        val
0       NaN
1       NaN
2 -0.625000
3  8.000000
4 -1.333333
5  1.666667
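To make the arithmetic behind that output explicit, here is a minimal pure-Python sketch of the definition: each value is compared with the previous value that shares its key, and the first value of each group has nothing to compare against. The helper name `grouped_pct_change` is hypothetical and is not part of pandas or cuDF; the sketch also assumes no nulls and no zero-valued previous entries.

```python
import math


def grouped_pct_change(keys, vals):
    """Percent change of each value relative to the previous value in its group.

    Hypothetical illustration only: assumes no nulls and that no previous
    value is zero (a zero would raise ZeroDivisionError here).
    """
    last_seen = {}  # most recent value observed for each key
    out = []
    for key, val in zip(keys, vals):
        if key in last_seen:
            prev = last_seen[key]
            out.append((val - prev) / prev)
        else:
            # First row of a group has no previous entry.
            out.append(math.nan)
        last_seen[key] = val
    return out


keys = [0, 1, 1, 0, 0, 1]
vals = [1, 8, 3, 9, -3, 8]
result = grouped_pct_change(keys, vals)
print(result)
```

For row 4, for instance, the previous value with key 0 is 9, so the result is (-3 - 9) / 9, which matches the -1.333333 shown above.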
beckernick added the feature request and Python labels on Nov 4, 2021
beckernick added this to the Time Series Analysis milestone on Nov 4, 2021
@github-actions

github-actions bot commented Dec 4, 2021

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

@beckernick
Member Author

I believe this could be expressed fairly smoothly from Python with two separate groupby aggregations with a loose structure of:

  • Handle nulls
  • Groupby diff (with supported arguments)
  • Groupby shift (with supported arguments)
  • Binary op (division) on the results of diff and shift

Using internal APIs, is it possible to calculate the diff and shift in a single call to groupby aggregation?
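As a rough illustration of the structure outlined above, the sketch below composes a groupby diff and a groupby shift and divides one by the other. It uses pandas rather than cuDF so it can run without a GPU, and it assumes the same composition would carry over to cuDF's groupby diff and shift. The function name `groupby_pct_change_sketch` is hypothetical, and the null-handling step is skipped because the sample data has no nulls.

```python
import pandas as pd


def groupby_pct_change_sketch(frame, by, periods=1):
    """Hypothetical sketch: per-group percent change as diff / shift.

    Null handling (the first step in the outline) is intentionally omitted.
    """
    grouped = frame.groupby(by)
    diffed = grouped.diff(periods=periods)    # x[t] - x[t - periods], per group
    shifted = grouped.shift(periods=periods)  # x[t - periods], per group
    return diffed / shifted                   # binary op on the two results


df = pd.DataFrame({
    "key": [0, 1, 1, 0, 0, 1],
    "val": [1, 8, 3, 9, -3, 8],
})
result = groupby_pct_change_sketch(df, "key")
print(result)
```

This version issues two separate groupby calls, so it does not answer whether diff and shift can be fused into a single aggregation; it only shows that the two results combine into the expected values for the example in this issue.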

@github-actions

github-actions bot commented Jan 5, 2022

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

@github-actions

This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.

rapids-bot bot pushed a commit that referenced this issue Jul 18, 2022
Subsequent to #9805, this PR adds support for Groupby.pct_change()

Fixes #9606
Replaces #10444

Authors:
  - Sheilah Kirui (https://github.com/skirui-source)

Approvers:
  - Ashwin Srinath (https://github.com/shwina)

URL: #11144