BUG: fix Series.argsort
#42090
Would this handle #12694? |
Yes this fixes that. It solves it by:
|
@jorisvandenbossche since you commented on the underlying issue perhaps you would also like to review? |
haven't really looked, just some quick comments
res_ser = Series(res, index=self.index[res], dtype="int64", name=self.name)
return res_ser.__finalize__(self, method="argsort")
else:
# GH 42090
It would be good to be very clear that we are ordering nulls last, i.e. na_position='last'
(the equivalent of sort_values). I think it's fine not to expose this as an option (as numpy doesn't).
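To make the null-ordering point concrete, here is a minimal sketch of what "nulls last" means, using only existing `Series.sort_values` and `numpy.argsort` behaviour. The toy data and variable names are illustrative assumptions; this is not the code proposed in the PR.

```python
import numpy as np
import pandas as pd

# Toy data (assumed for illustration): one missing value in the middle.
s = pd.Series([2.0, np.nan, 1.0], index=["a", "b", "c"])

# sort_values places missing values at the end when na_position="last".
sorted_last = s.sort_values(na_position="last")

# numpy.argsort has no na_position option; it always sorts NaN to the end.
np_order = np.argsort(s.to_numpy())

print(list(sorted_last.index))  # labels ordered by value, NaN label last
print(np_order.tolist())        # positions ordered by value, NaN position last
```

Both calls agree on the ordering here, which is the consistency the comment above asks the docs to spell out.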
@attack68 thanks for looking into this topic! Took a quick look, and some high-level comments:
|
I interpret a "bug-fix" as a breaking change that corrects erroneous behaviour, and as pointed out by others this function is currently unusable because of its errors:
My use of numpy argsort has been to sort an array by the values of another, similarly shaped array, which is just a kind of key-sorting: Since numpy arrays don't have index objects per se this acts as an index sort. Since pandas data structures have index objects you can perform the same kind of method on objects with shared indexes with: But, there are often different ways of doing things. And someone coming from numpy used to using
I have added "first" and "last" to |
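The key-sorting pattern described above can be sketched as follows. This is an illustrative reconstruction under assumed toy data, not the snippets from the original comment: the numpy half uses `argsort` positions, and the pandas half uses the index of a sorted key Series to reorder a second Series that shares the same index.

```python
import numpy as np
import pandas as pd

# numpy: sort one array by the values of another (key-sorting by position).
keys = np.array([3, 1, 2])
values = np.array(["c", "a", "b"])
order = np.argsort(keys)        # positions that would sort `keys`
sorted_values = values[order]

# pandas: the same idea using labels on two Series with a shared index.
key_ser = pd.Series([3, 1, 2], index=["x", "y", "z"])
val_ser = pd.Series(["c", "a", "b"], index=["x", "y", "z"])
sorted_ser = val_ser.loc[key_ser.sort_values().index]

print(sorted_values.tolist())
print(sorted_ser.tolist(), list(sorted_ser.index))
```

The numpy version returns integer positions, while the pandas version goes through index labels; that difference is what the discussion about the return value of `Series.argsort` turns on.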
…_index_sorted # Conflicts: # doc/source/whatsnew/v1.4.0.rst
can you add a whatsnew note that shows the previous and new behavior; then it will be easier to assess |
@jorisvandenbossche perhaps you care to review the whatsnew above? |
noting #43840 |
…_index_sorted # Conflicts: # doc/source/whatsnew/v1.4.0.rst
…_index_sorted # Conflicts: # pandas/core/series.py # pandas/tests/extension/base/methods.py
…_index_sorted # Conflicts: # doc/source/whatsnew/v1.4.0.rst
Appears that this PR has sufficiently stalled, so closing for now. Looks almost there if anyone wants to pick it up. |
Is it worth reviving this for pandas 2.0? |
i think would either do
it does the wrong thing now - while your PR does seem to fix it up - it's essentially duplicating the sorting methods' intent |
Thanks for the pull request, but it appears to have gone stale. If interested in continuing, please merge in the main branch, address any review comments and/or failing tests, and we can reopen. |
rendered whatsnew: