ENH: NamedAgg support for time windows/Expanding #34685

ddofer · 2020-06-10T08:41:07Z

Is your feature request related to a problem?

I am using aggregations over data with groups over time to create features. I had hoped to use the "new" NamedAgg functionality to efficiently calculate and rename the many feature columns.
I find that named aggregation only works with the "default" groupby.agg; it is not supported when used with a rolling or expanding window.
(I note that the function-level documentation doesn't mention this: named aggregation is covered mainly in the general documentation, not at method level, i.e. the groupby.agg method's documentation doesn't mention or demonstrate this functionality at all.)

By "named aggregations" I refer to the functionality:
animals.groupby("kind").agg(min_height=('height', 'min'))
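
For reference, here is a small self-contained sketch of that call pattern on a plain groupby. The `animals` frame and its values are made up for illustration; the `(column, aggfunc)` tuple and `pd.NamedAgg` forms are equivalent.

```python
import pandas as pd

# Illustrative data only; not taken from the real use case.
animals = pd.DataFrame(
    {
        "kind": ["cat", "dog", "cat", "dog"],
        "height": [9.1, 6.0, 9.5, 34.0],
        "weight": [7.9, 7.5, 9.9, 198.0],
    }
)

# Each keyword names an output column; the value says which input
# column to aggregate and with which function.
summary = animals.groupby("kind").agg(
    min_height=("height", "min"),
    max_weight=pd.NamedAgg(column="weight", aggfunc="max"),
)
print(summary)
```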

Describe the solution you'd like

Expand named aggregation support (NamedAgg) to the groupby aggregation used in expanding.aggregate and rolling.aggregate

API breaking implications

Should not affect it. Seems 1:1.

Additional context

Example usage/errors:

df_actions.set_index("age").groupby('uuid').rolling(30).agg(unique_actions=('action', 'nunique'), 
                                total_actions=('counter', 'sum'))

TypeError: aggregate() missing 1 required positional argument: 'func'

When used without the window, we get the benefit of named aggregations. (The real code has many more columns and transformations, and the columns the features are calculated from are dynamic, so setting a fixed list of column names is not desirable. Additionally, the data is time-sorted.):

df_actions.set_index("age").groupby('uuid').agg(unique_actions=('action', 'nunique'), 
                               total_actions=('counter', 'sum'))

>>>

  | unique_actions | total_actions

0 | 0.0
0 | 0.0
...
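
A possible workaround in the meantime is to apply each named spec to its own grouped rolling column and assemble the results into one frame. The following is only a sketch under assumptions: `rolling_named_agg` is a hypothetical helper, not a pandas API, the sample data is invented, and it has only been reasoned through for string aggregation names such as "sum" and "max" on numeric columns.

```python
import pandas as pd


def rolling_named_agg(df, by, window, min_periods=None, **named):
    """Hypothetical helper: emulate named aggregation on a grouped rolling window.

    Each keyword is an output column name mapped to a (column, aggfunc) tuple,
    mirroring the groupby.agg named-aggregation call style.
    """
    grouped = df.groupby(by)
    columns = {}
    for out_name, (column, func) in named.items():
        # Roll over one column per group, then aggregate it on its own.
        rolled = grouped[column].rolling(window, min_periods=min_periods)
        columns[out_name] = rolled.agg(func)
    # All pieces share the same (group, original index) MultiIndex.
    return pd.DataFrame(columns)


# Invented sample data, already sorted by age within each uuid.
df_actions = pd.DataFrame(
    {
        "uuid": ["a", "a", "a", "b", "b"],
        "age": [1, 2, 3, 1, 2],
        "counter": [1, 2, 3, 10, 20],
    }
).set_index("age")

features = rolling_named_agg(
    df_actions,
    by="uuid",
    window=2,
    min_periods=1,
    total_actions=("counter", "sum"),
    peak_actions=("counter", "max"),
)
print(features)
```

The output columns take the keyword names, so the renaming step that named aggregation normally saves is kept in one place. Running one rolling pass per output column is simpler than batching by input column, at the cost of some repeated work.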


MarcoGorelli · 2020-06-10T09:02:54Z

Thanks @ddofer

Is this the same as #32803?

ddofer · 2020-06-10T10:06:17Z

> Thanks @ddofer
>
> Is this the same as #32803?

It definitely overlaps. Not sure if it's a duplicate, since that one refers to a different specific usage.

mroeschke · 2020-10-05T01:21:47Z

I think this is a duplicate of #28333, closing.

ddofer added the Enhancement and Needs Triage labels Jun 10, 2020

TomAugspurger added the Window (rolling, ewma, expanding) label and removed the Needs Triage label Jun 12, 2020

mroeschke closed this as completed Oct 5, 2020
