Closes ticket #2831 to align dataframe.groupby().size(), dataframe.groupby().sum() to pandas #2892

@ajpotts ajpotts commented Jan 4, 2024

This PR changes DataFrame.GroupBy.size() to align more closely with the pandas API. It now returns a DataFrame by default; the optional as_series parameter returns a Series instead.

The groupby count operation is temporarily set as an alias of size, in both dataframe.GroupBy.count() and groupbyclass.GroupBy.count().
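For reference, the two operations are not identical in pandas: size counts every row in a group, while count counts only the non-null values. A minimal pure-Python sketch of that distinction, where the helpers `group_size` and `group_count` are hypothetical names for illustration and not part of Arkouda or pandas:

```python
import math
from collections import defaultdict


def _is_missing(value):
    """Treat None and float NaN as missing values."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def group_size(keys):
    """Count every row per group, missing values included (size semantics)."""
    sizes = defaultdict(int)
    for key in keys:
        sizes[key] += 1
    return dict(sizes)


def group_count(keys, values):
    """Count only the non-missing values per group (count semantics)."""
    counts = defaultdict(int)
    for key, value in zip(keys, values):
        counts[key] += 0 if _is_missing(value) else 1
    return dict(counts)


keys = ["a", "a", "b", "b", "b"]
values = [1.0, float("nan"), 2.0, None, 3.0]

sizes = group_size(keys)            # rows per group
counts = group_count(keys, values)  # non-missing values per group
```

The two results agree only when a group has no missing values, which is where an alias and a true count would diverge.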

In dataframe.py, _make_aggop returns an aggregate function such as dataframe.GroupBy.sum(). That function now returns a dataframe.DataFrame by default, with an optional as_series parameter to return a Series instead. The group-by column names are now included in the output as well.
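The factory pattern described here can be sketched in plain Python. This is not the Arkouda implementation: `make_aggop`, `ToyGroupBy`, and the dict-based stand-ins for DataFrame and Series are all assumptions made for illustration.

```python
from collections import defaultdict


def make_aggop(name, reducer):
    """Build a group-by method that applies `reducer` to each group's values."""

    def aggop(self, as_series=False):
        groups = defaultdict(list)
        for key, value in zip(self.keys, self.values):
            groups[key].append(value)
        labels = sorted(groups)
        results = [reducer(groups[label]) for label in labels]
        if as_series:
            # Series-like stand-in: group labels as the index, one value each.
            return {"index": labels, "values": results}
        # DataFrame-like stand-in: the group-by column keeps its own name.
        return {self.key_name: labels, self.value_name: results}

    aggop.__name__ = name
    return aggop


class ToyGroupBy:
    """Hypothetical stand-in for a group-by object over a single key column."""

    def __init__(self, key_name, keys, value_name, values):
        self.key_name = key_name
        self.keys = keys
        self.value_name = value_name
        self.values = values


# Attach one generated method per aggregation, all from the same factory.
for _name, _reducer in {"sum": sum, "min": min, "max": max}.items():
    setattr(ToyGroupBy, _name, make_aggop(_name, _reducer))

gb = ToyGroupBy("team", ["x", "y", "x"], "score", [1, 2, 3])
frame_like = gb.sum()                 # default: DataFrame-like output
series_like = gb.sum(as_series=True)  # opt-in: Series-like output
```

Because every aggregation comes from one factory, changing the default return type in a single place changes it for all of the generated methods.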

dataframe.GroupBy also takes two new parameters, gb_key_names=None and as_index=True. gb_key_names stores the names of the columns used for grouping, and as_index lets the user choose whether the group-by columns become the index of the returned dataframe.DataFrame or series.Series.
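A rough sketch of how the two parameters might interact, again in plain Python rather than Arkouda. The class `SketchGroupBy`, the dict layout of its result, the `"size"` column name, and the `"key"` fallback name are all assumptions for illustration only.

```python
class SketchGroupBy:
    """Hypothetical group-by over one key column, returning dict-based results."""

    def __init__(self, keys, gb_key_names=None, as_index=True):
        self.keys = keys
        # Name of the grouping column; "key" is a fallback invented for this sketch.
        self.gb_key_names = gb_key_names or "key"
        self.as_index = as_index

    def size(self):
        counts = {}
        for key in self.keys:
            counts[key] = counts.get(key, 0) + 1
        labels = list(counts)  # groups in first-seen order
        sizes = [counts[label] for label in labels]
        if self.as_index:
            # Group labels become the index, named after the grouping column.
            return {
                "index_name": self.gb_key_names,
                "index": labels,
                "columns": {"size": sizes},
            }
        # Group labels stay an ordinary column next to a default integer index.
        return {
            "index_name": None,
            "index": list(range(len(labels))),
            "columns": {self.gb_key_names: labels, "size": sizes},
        }


keys = ["red", "blue", "red", "red"]
indexed = SketchGroupBy(keys, gb_key_names="color").size()
flat = SketchGroupBy(keys, gb_key_names="color", as_index=False).size()
```

Storing the key names on the group-by object is what lets the result label either its index or an ordinary column with the original column name.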

@ajpotts ajpotts linked an issue Jan 4, 2024 that may be closed by this pull request

@stress-tess stress-tess left a comment

A few stylistic suggestions, but looks good overall! Well done!!

The only thing holding up an approval is a typo in the proto test.

PROTO_tests/tests/dataframe_test.py (outdated, resolved)
arkouda/dataframe.py (outdated, resolved)
PROTO_tests/tests/series_test.py (outdated, resolved)
arkouda/dataframe.py (resolved)
arkouda/dataframe.py (outdated, resolved)
arkouda/dataframe.py (outdated, resolved)

@jaketrookman jaketrookman left a comment

Looks good to me.
For future reference, let the person who requested the change resolve the request to ensure that their issue was properly addressed.


@stress-tess stress-tess left a comment

Looks good!

@stress-tess stress-tess added this pull request to the merge queue Jan 8, 2024
Merged via the queue into Bears-R-Us:master with commit d3d297e Jan 8, 2024
12 checks passed

Successfully merging this pull request may close these issues.

dataframe count and size groupby aggregations
3 participants