Closes #1260 - GroupBy w/ Booleans #1270

Merged 3 commits on Apr 14, 2022

Ethan-DeBandi99
Contributor

This PR (closes #1260):

Adds support for Booleans to ak.GroupBy by updating the pdarrayclass._get_grouping_keys() method to handle boolean arrays. Boolean arrays are cast to ak.int64 arrays using ak.cast, so all other GroupBy functionality works as it does for integers.

a = ak.array([True, False, True, True, False])
b = ak.array([False, False, True, False, False])

ga = ak.GroupBy(a)
ga.count()
(array([False True]), array([2 3]))

g = ak.GroupBy([a, b])
g.count()
([array([False True True]), array([False False True])], array([2 2 1]))
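For readers without an Arkouda server at hand, the cast-then-group idea can be sketched in plain Python. This is only an illustration of the approach described in this PR, not Arkouda's implementation: `count_bool_groups` is a hypothetical helper, lists stand in for pdarrays, and `int(v)` stands in for the `ak.cast` step.

```python
from collections import Counter


def count_bool_groups(*columns):
    """Count occurrences of each distinct row across boolean columns.

    Hypothetical helper for illustration; not part of Arkouda.
    """
    # Cast every bool to an int (False -> 0, True -> 1), standing in for ak.cast.
    int_columns = [[int(v) for v in col] for col in columns]
    # Integer tuples serve as grouping keys; sorting puts False (0) before True (1).
    counts = Counter(zip(*int_columns))
    unique_rows = sorted(counts)
    # Convert the unique keys back to bools, one list per input column.
    keys = [[bool(row[i]) for row in unique_rows] for i in range(len(columns))]
    return keys, [counts[row] for row in unique_rows]


a = [True, False, True, True, False]
b = [False, False, True, False, False]

print(count_bool_groups(a))     # ([[False, True]], [2, 3])
print(count_bool_groups(a, b))  # ([[False, True, True], [False, False, True]], [2, 2, 1])
```

The counts agree with the ak.GroupBy output shown above. Grouping on integer keys means no boolean-specific grouping logic is needed.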

Contributor

@joshmarshall1 joshmarshall1 left a comment

Looks good. With my general and VERY limited knowledge of how GroupBy works, casting bools to ints in order to group them efficiently makes sense.

    elif self.dtype in (akint64, akuint64):
        return [self]
    else:
        raise TypeError("Grouping is only supported on numeric data (integral types) and bools.")
    # Integral pdarrays are their own grouping keys
Member

super small: this comment should be moved to be above return [self]

Contributor Author

Corrected comment location
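As a rough picture of how the dispatch might read with the comment moved above `return [self]`, here is a self-contained sketch. It is not the merged code: `FakeArray`, its string dtypes, and the body of the bool branch are assumptions made for illustration. Only the integral branch, the relocated comment, and the error message come from the snippet under review.

```python
from dataclasses import dataclass


@dataclass
class FakeArray:
    """Hypothetical stand-in for a pdarray: a dtype name plus its values."""
    dtype: str
    values: list

    def get_grouping_keys(self):
        if self.dtype == "bool":
            # Assumed shape of the bool branch: cast to int64 so the
            # array can be grouped like any other integer array
            return [FakeArray("int64", [int(v) for v in self.values])]
        elif self.dtype in ("int64", "uint64"):
            # Integral pdarrays are their own grouping keys
            return [self]
        else:
            raise TypeError("Grouping is only supported on numeric data (integral types) and bools.")


flags = FakeArray("bool", [True, False, True])
print(flags.get_grouping_keys())
```

Placing the comment directly above the `return` it describes keeps it from reading as a note on the `raise` in the `else` branch.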

@stress-tess stress-tess merged commit 280c1fa into Bears-R-Us:master Apr 14, 2022
@Ethan-DeBandi99 Ethan-DeBandi99 deleted the 1260_Bool_GroupBy branch April 14, 2022 19:23

Successfully merging this pull request may close these issues.

group by with booleans