Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: generic implementation of ea_wrap_cython_operation #43682

Closed
debnathshoham opened this issue Sep 21, 2021 · 2 comments · Fixed by #51166
Closed

ENH: generic implementation of ea_wrap_cython_operation #43682

debnathshoham opened this issue Sep 21, 2021 · 2 comments · Fixed by #51166
Labels
Enhancement ExtensionArray Extending pandas with custom dtypes or arrays. Groupby

Comments

@debnathshoham
Copy link
Member

Create a property function to decide what to do for EAs in _ea_wrap_cython_operation for groupby ops.
xref : #43634 (comment)

@final
def _ea_wrap_cython_operation(
self,
values: ExtensionArray,
min_count: int,
ngroups: int,
comp_ids: np.ndarray,
**kwargs,
) -> ArrayLike:
"""
If we have an ExtensionArray, unwrap, call _cython_operation, and
re-wrap if appropriate.
"""
# TODO: general case implementation overridable by EAs.
if isinstance(values, BaseMaskedArray) and self.uses_mask():
return self._masked_ea_wrap_cython_operation(

@debnathshoham debnathshoham added Enhancement Needs Triage Issue that has not been reviewed by a pandas team member ExtensionArray Extending pandas with custom dtypes or arrays. Groupby labels Sep 21, 2021
@simonjayhawkins simonjayhawkins added this to the Contributions Welcome milestone Sep 21, 2021
@simonjayhawkins simonjayhawkins removed the Needs Triage Issue that has not been reviewed by a pandas team member label Sep 21, 2021
@mroeschke mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022
@jbrockmendel
Copy link
Member

jbrockmendel commented Jan 24, 2023

I'm increasingly thinking that doing this right will mean implementing the relevant operations end-to-end on the EAs, e.g. groupby_sum. For e.g pyarrow dtypes or potential GPU/distributed arrays they'd want their own things instead of casting back and forth to use our groupby.pyx code

@rhshadrach
Copy link
Member

For e.g pyarrow dtypes or potential GPU/distributed arrays they'd want their own things instead of casting back and forth to use our groupby.pyx code

Ref: #51166 (comment)

we should choose one naming scheme for all these methods instead of three slightly different ones. i.e. either a) shove everything into something like groupby_op or b) explicitly have groupby_foo for each of the relevant methods.

The two options aren't mutually exclusive; we could have groupby_op which dispatches to each groupby_foo. Then EA authors could override groupby_op if that is convenient or groupby_foo if they just want a certain method.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement ExtensionArray Extending pandas with custom dtypes or arrays. Groupby
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants