[FEA][BACKLOG] Difference functionality #1271

Closed
beckernick opened this issue Mar 22, 2019 · 3 comments · Fixed by #9817
Labels
feature request New feature or request libcudf Affects libcudf (C++/CUDA) code. Python Affects Python cuDF API.

Comments

@beckernick
Member

beckernick commented Mar 22, 2019

Is your feature request related to a problem? Please describe.
As a cuDF user, I want to calculate the difference between rows in a column (most commonly with the previous row). The pandas API documents the equivalent as the diff method on Series and on Groupby objects.

Describe the solution you'd like
I'd like to be able to call .diff(n) on:

And return the appropriate object with the difference between each value and the value n rows "above" it.

Describe alternatives you've considered
I could do this by copying the data to the CPU, or by writing a kernel that iterates through the column and populates a new device array of the same size.
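For illustration, the element-by-element alternative can be sketched in plain Python. This is only a host-side model of what such a kernel would compute, not cuDF or libcudf code; the function name `diff_loop` is hypothetical, and `None` stands in for a null entry.

```python
from typing import List, Optional


def diff_loop(values: List[float], n: int = 1) -> List[Optional[float]]:
    """Row-wise difference, filling one output slot per input row.

    Each iteration is independent, which is why the same logic maps
    naturally onto one GPU thread per row. `None` models a null.
    """
    out: List[Optional[float]] = [None] * len(values)
    for i in range(len(values)):
        j = i - n  # index of the row n positions "above" row i
        if 0 <= j < len(values):
            out[i] = values[i] - values[j]
    return out


print(diff_loop([1, 4, 9, 16]))     # [None, 3, 5, 7]
print(diff_loop([1, 4, 9, 16], 2))  # [None, None, 8, 12]
```

Rows with no partner n positions away stay null, which matches the behavior of pandas `diff`.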

@beckernick added the Needs Triage (Need team to review and classify) and feature request (New feature or request) labels Mar 22, 2019
@kkraus14 added the Python (Affects Python cuDF API.) and libcudf (Affects libcudf (C++/CUDA) code.) labels and removed the Needs Triage label Apr 8, 2019
@beckernick changed the title from "[FEA] Series level diff" to "[FEA][BACKLOG] libcudf column diff" May 20, 2019
@beckernick
Member Author

Stopgap Python-layer functionality for this feature has been merged with #1456. I'm explicitly backlogging this and deferring to @harrism's prioritization on the libcudf side.

@beckernick changed the title from "[FEA][BACKLOG] libcudf column diff" to "[FEA][BACKLOG] Difference functionality" May 23, 2019
@harrism
Member

harrism commented Jul 4, 2019

@beckernick how is this different from a binary subtract operation?

@beckernick
Member Author

Closing the loop on this issue: this functionality can be expressed as a call to shift followed by a binary subtract op whose operands are the original column and the shifted column.
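A minimal sketch of that composition, written in plain Python rather than against the libcudf or cuDF APIs. The helper names `shift`, `subtract`, and `diff` are hypothetical stand-ins for the corresponding column operations, and `None` models a null entry.

```python
from typing import List, Optional

Column = List[Optional[float]]


def shift(values: Column, periods: int = 1) -> Column:
    """Move every entry `periods` rows down (or up if negative), padding with nulls."""
    size = len(values)
    if periods >= 0:
        k = min(periods, size)
        return [None] * k + list(values[:size - k])
    k = min(-periods, size)
    return list(values[k:]) + [None] * k


def subtract(lhs: Column, rhs: Column) -> Column:
    """Element-wise lhs - rhs; the result is null wherever either operand is null."""
    return [None if a is None or b is None else a - b for a, b in zip(lhs, rhs)]


def diff(values: Column, n: int = 1) -> Column:
    """Difference with the row n positions above: original minus shifted."""
    return subtract(values, shift(values, n))


print(diff([1, 4, 9, 16]))      # [None, 3, 5, 7]
print(diff([1, 4, 9, 16], -1))  # [-3, -5, -7, None]
```

The nulls introduced by the shift propagate through the subtraction, so the first n rows of the result are null without any special-casing in `diff` itself.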

Column and Groupby shift have now been implemented in libcudf. The only remaining piece in Python is DataFrame.diff, which is specifically requested in #9604 and has a PR in review at #9817.

As a result, I believe this can be closed alongside #9604 when #9817 lands, and I'll update the PR description to reflect it.

@rapids-bot closed this as completed in #9817 Feb 5, 2022
rapids-bot bot pushed a commit that referenced this issue Feb 5, 2022