Closes ticket #2831 to align dataframe.groupby().size(), dataframe.groupby().sum() to pandas #2892
… unsupported in Python 3.8
…one, and return series when as_index=True and as_series=None
A few stylistic suggestions, but looks good overall! Well done!!
The only thing holding up an approval is a typo in the proto test
Looks good to me.
For future reference, let the person who requested the change resolve the request to ensure that their issue was properly addressed.
Looks good!
This change makes DataFrame.GroupBy.size() match the pandas API more closely. It now returns a DataFrame by default; an optional as_series parameter allows a Series to be returned instead.
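For reference, here is a small sketch of the pandas behavior being targeted, run against pandas itself rather than this project's DataFrame. The column names `key` and `val` are made up for the illustration; the `as_series` parameter described above is specific to this PR and does not exist in pandas.

```python
import pandas as pd

# Toy data; "key" and "val" are illustrative column names only.
df = pd.DataFrame({"key": ["a", "a", "b"], "val": [1, 2, 3]})

# With the default as_index=True, pandas returns a Series indexed by the group key.
size_default = df.groupby("key").size()

# With as_index=False, pandas returns a DataFrame with the key as a regular column
# and the counts in a column named "size".
size_flat = df.groupby("key", as_index=False).size()

# Aggregations such as sum() return a DataFrame indexed by the group key.
sum_default = df.groupby("key").sum()
```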
For now, the groupby count operation is an alias of size in both dataframe.GroupBy.count() and groupbyclass.GroupBy.count().
In dataframe.py, _make_aggop returns an aggregate function such as dataframe.GroupBy.sum(). These aggregate functions now return a dataframe.DataFrame object by default, with an optional as_series parameter to return a Series object instead. The groupby column names are now included in the output as well.
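A rough sketch of the factory pattern described here, in plain Python. This is not the code in dataframe.py: `make_aggop`, its arguments, and the dict-based stand-ins for DataFrame and Series are all hypothetical and only illustrate how a generated aggregate can default to a table-shaped result and switch on `as_series`.

```python
from collections import defaultdict


def make_aggop(name, reducer):
    """Build a groupby aggregate (hypothetical stand-in for _make_aggop)."""

    def aggop(keys, values, key_name="key", value_name="value", as_series=False):
        # Bucket the values by their group key.
        groups = defaultdict(list)
        for k, v in zip(keys, values):
            groups[k].append(v)

        unique_keys = sorted(groups)
        reduced = [reducer(groups[k]) for k in unique_keys]

        if as_series:
            # Series-like result: group key -> aggregated value.
            return dict(zip(unique_keys, reduced))
        # DataFrame-like result: named columns, including the groupby column.
        return {key_name: unique_keys, value_name: reduced}

    aggop.__name__ = name
    return aggop


gb_sum = make_aggop("sum", sum)
table = gb_sum(["a", "b", "a"], [1, 2, 3], key_name="k", value_name="v")
series = gb_sum(["a", "b", "a"], [1, 2, 3], as_series=True)
```

Generating the aggregates from one factory keeps the `as_series` handling in a single place, so every operation built this way picks up the same default.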
dataframe.GroupBy also takes two new parameters, gb_key_names=None and as_index=True. gb_key_names stores the names of the columns used to group by, and as_index lets the user choose whether the groupby column is used as the index of the returned dataframe.DataFrame or series.Series.
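To make the two parameters concrete, here is a toy model in plain Python. `TinyGroupBy` and its dict-based return value are hypothetical and are not the project's classes; only the parameter names `gb_key_names` and `as_index` come from the description above.

```python
from collections import Counter


class TinyGroupBy:
    """Hypothetical stand-in showing how gb_key_names and as_index could interact."""

    def __init__(self, rows, key, gb_key_names=None, as_index=True):
        self.rows = rows
        self.key = key
        # Assumption: fall back to the grouping column name when no name is given.
        self.gb_key_names = gb_key_names if gb_key_names is not None else key
        self.as_index = as_index

    def size(self):
        counts = Counter(row[self.key] for row in self.rows)
        keys = sorted(counts)
        sizes = [counts[k] for k in keys]
        if self.as_index:
            # The group keys become the index, labelled with the stored key name.
            return {"index_name": self.gb_key_names, "index": keys,
                    "columns": {"size": sizes}}
        # The group keys stay an ordinary column next to a default integer index.
        return {"index_name": None, "index": list(range(len(keys))),
                "columns": {self.gb_key_names: keys, "size": sizes}}


rows = [{"k": "a"}, {"k": "b"}, {"k": "a"}]
indexed = TinyGroupBy(rows, "k").size()
flat = TinyGroupBy(rows, "k", as_index=False).size()
```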