[FEA] Support GroupBy aggregations on DecimalColumn #7489

Closed
ChrisJar opened this issue Mar 2, 2021 · 2 comments
Labels: feature request, Python


ChrisJar commented Mar 2, 2021

Is your feature request related to a problem? Please describe.
I wish I could perform groupby aggregations on columns of decimal type.

Describe the solution you'd like
I would like groupby aggregations on decimal columns to behave the way they already do on float columns. For example:

import cudf
df = cudf.DataFrame({'id': [0, 1, 1], 'val': [1.00, 1.01, 1.02]})
df.groupby('id').agg('mean')

returns

id | val
-- | --
0  |  1.000
1  |  1.015

However, if I convert the float column to decimal:

import decimal
df['val'] = cudf.Series([decimal.Decimal(x) for x in [1.00, 1.01, 1.02]], dtype=cudf.Decimal64Dtype(7,3))
df.groupby('id').agg('mean')

it returns:

---------------------------------------------------------------------------
DataError                                 Traceback (most recent call last)
<ipython-input-20-2a197b045ee3> in <module>
----> 1 df.groupby('id').agg('mean')

~/anaconda3/envs/cudf_dev/lib/python3.7/contextlib.py in inner(*args, **kwds)
     72         def inner(*args, **kwds):
     73             with self._recreate_cm():
---> 74                 return func(*args, **kwds)
     75         return inner
     76 

~/anaconda3/envs/cudf_dev/lib/python3.7/site-packages/cudf/core/groupby/groupby.py in agg(self, func)
    169         # a Float64Index, while Pandas returns an Int64Index
    170         # (GH: 6945)
--> 171         result = self._groupby.aggregate(self.obj, normalized_aggs)
    172 
    173         result = cudf.DataFrame._from_table(result)

cudf/_lib/groupby.pyx in cudf._lib.groupby.GroupBy.aggregate()

cudf/_lib/groupby.pyx in cudf._lib.groupby._drop_unsupported_aggs()

DataError: No numeric types to aggregate
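
For reference, the result being asked for can be reproduced on the host with only the Python standard library. This is a minimal sketch, not cuDF code: `groupby_mean` is a hypothetical helper, and the output scale of 3 is an assumption taken from the `Decimal64Dtype(7,3)` in the repro (the issue does not say which scale cuDF should use for the result). The values are built from strings rather than floats so that `1.01` is stored exactly.

```python
from collections import defaultdict
from decimal import Decimal


def groupby_mean(keys, values, scale=3):
    """Group `values` by `keys` and return the per-group mean as Decimal.

    Hypothetical host-side helper for illustration only; `scale` is the
    number of digits kept after the decimal point (an assumption here).
    """
    groups = defaultdict(list)
    for key, value in zip(keys, values):
        groups[key].append(value)

    # Decimal(1).scaleb(-3) is Decimal('0.001'), the quantum for scale=3.
    quantum = Decimal(1).scaleb(-scale)
    return {
        key: (sum(vals) / len(vals)).quantize(quantum)
        for key, vals in sorted(groups.items())
    }


ids = [0, 1, 1]
vals = [Decimal("1.00"), Decimal("1.01"), Decimal("1.02")]

result = groupby_mean(ids, vals)
for key, mean in result.items():
    print(key, mean)
```

The two group means match the `val` column of the float example above, except that they are exact decimals rather than rounded floats.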

ChrisJar added the Needs Triage and feature request labels Mar 2, 2021
kkraus14 added the Python and Cython labels and removed the Needs Triage label Mar 2, 2021

github-actions bot commented Apr 2, 2021

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.


ChrisJar commented Apr 5, 2021

Closed with #7731

ChrisJar closed this as completed Apr 5, 2021