ENH: Enabled skipna argument on groupby reduction ops #15675 #58844
Added a skipna argument to the groupby reduction ops: sum, prod, min, max, mean, median, var, std and sem. Added relevant tests. Updated whatsnew to reflect changes.
Co-authored-by: Tiago Firmino <[email protected]>
Co-authored-by: André Correia <[email protected]>
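For readers unfamiliar with the semantics under discussion, here is a minimal sketch of what `skipna` means for a grouped reduction. It deliberately goes through the existing `Series.sum(skipna=...)` inside `agg` rather than a groupby-level `skipna` argument, since that argument is what this PR proposes and may not be available in a given pandas version. The frame and column names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: group "a" contains one missing value, group "b" does not.
df = pd.DataFrame({"key": ["a", "a", "b"], "val": [1.0, np.nan, 3.0]})
grouped = df.groupby("key")["val"]

# skipna=True ignores missing values within each group.
skipped = grouped.agg(lambda s: s.sum(skipna=True))

# skipna=False lets a missing value propagate to the group's result.
propagated = grouped.agg(lambda s: s.sum(skipna=False))

print(skipped)
print(propagated)
```

With `skipna=False`, only groups that contain a missing value are affected; group "b" gives the same result either way.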
Only took a quick look, but overall this is looking good. Can you also add tests for EAs (the nullable and pyarrow dtypes)?
Co-authored-by: Tiago Firmino <[email protected]>
Co-authored-by: André Correia <[email protected]>
Force-pushed from fa11256 to 5cd994c
Co-authored-by: André Correia <[email protected]>
Co-authored-by: Tiago Firmino <[email protected]>
Force-pushed from 788768a to 5e3a965
Co-authored-by: Tiago Firmino <[email protected]>
Co-authored-by: André Correia <[email protected]>
Hi, we refactored the tests as requested and were working on adding the EAs. We found and fixed a few sneaky edge-case bugs with these, but we ran into a problem with […]. The current implementation with EAs creates a typed numpy array, and no NA value can be stored directly in an integer array, so we don't really see a path forward for this particular dtype without considerable changes. We could add arrow floats easily, but we felt that supporting only a few arrow dtypes doesn't make much sense. How should we proceed?
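The limitation described above can be illustrated with a short sketch. It is not the PR's implementation; it only shows, under the assumption that the "typed numpy array" is a plain integer ndarray, why such an array cannot carry a missing value, and how pandas' nullable `Int64` extension array tracks missing entries separately from the values.

```python
import numpy as np
import pandas as pd

# A plain int64 ndarray has no slot for a missing value:
# assigning NaN raises instead of storing it.
ints = np.array([1, 2, 3], dtype=np.int64)
try:
    ints[1] = np.nan
    stored = True
except ValueError:
    stored = False

# The nullable Int64 extension array records missing entries separately,
# so reductions can either skip them or propagate them.
nullable = pd.array([1, None, 3], dtype="Int64")
mask = nullable.isna()
total_skip = nullable.sum(skipna=True)
total_keep = nullable.sum(skipna=False)

print(stored, list(mask), total_skip, total_keep)
```

Any groupby kernel that receives only the raw integer values would need the missing-value information passed alongside them to honor `skipna=False`.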
…ia/pandas into add_skipna_on_groupby_ops_pr
The groupby methods implemented in Cython use […]
I can also open up a PR into your branch if you want some assistance with the EAs here.
Once EAs are implemented, we'll also want a test for them in tests/extension/base/reduce.py.
Co-authored-by: André Correia <[email protected]>
Force-pushed from 558bc25 to 2856c6d
Hello, after some rethinking about how we were handling EAs, we believe we've got it done correctly now since we were overthinking it before. However, we noticed the relevant array types tend to use […]
Our way of testing it was as such: […]
And the […]
In relation to the tests in […]
@tiago-firmino - each time you force push, reviewers must review the entire PR as history could be changing. On small PRs, this isn't much of a problem, but this is not a small PR. Can I ask that you no longer force push here?
Indeed, […]
For this PR, the relevant tests are in […]
I'm really sorry, I was not aware of that and will no longer do it.
Co-authored-by: Tiago Firmino <[email protected]>
Co-authored-by: Tiago Firmino <[email protected]>
Hello, […]
@andremcorreia - I should be able to take a look in the next few days.
It looks to me like the […]
    if op_name == "prod" and skipna and data.dtype.itemsize < 8 and np.intp().itemsize < 8:
        pytest.xfail(reason=f"{op_name} with itemsize {data.dtype.itemsize} overflows")
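The `xfail` condition above guards against integer products overflowing when both the data's item size and the platform's `np.intp` are narrower than 8 bytes. As a rough illustration only (it forces a 32-bit accumulator via `dtype=` so the effect is visible on any platform, which is an assumption for demonstration and not how the test suite triggers it), the sketch below shows a product wrapping silently at 32 bits.

```python
import numpy as np

# Two int32 values whose exact product (10**10) does not fit in 32 bits.
values = np.array([100_000, 100_000], dtype=np.int32)

# Forcing a 32-bit accumulator mimics a platform where np.intp is 4 bytes:
# the product wraps around modulo 2**32 without raising.
wrapped = np.prod(values, dtype=np.int32)

# A 64-bit accumulator holds the exact result.
exact = np.prod(values, dtype=np.int64)

print(int(wrapped), int(exact))
```

Because the wrapped value is silently wrong rather than an error, a test comparing it against an exact expected product would fail, which is presumably why the case is marked as an expected failure.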
This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.
Thanks for the pull request, but it appears to have gone stale. If interested in continuing, please merge in the main branch, address any review comments and/or failing tests, and we can reopen.
doc/source/whatsnew/v3.0.0.rst file if fixing a bug or adding a new feature.
Added a skipna argument to the groupby reduction ops for consistency with the Series and DataFrame variants:
Added new relevant tests, updated api tests and whatsnew