Implement sort=True for Index setops #25151
Comments
@jorisvandenbossche OK with moving off the 1.0 milestone?
Pushing off 1.0.
@TomAugspurger @jorisvandenbossche @jreback this issue was on the agenda for today's call, but no one present knew the reasoning for having `sort=None` instead of `sort=True` in the first place. There are a bunch of comments in the tests to the effect of "# TODO: decide what sort=True means". IIUC, `sort=True` will differ from `sort=None` only in the handful of cases that currently go through a fastpath and avoid sorting, like the [1, 0] example in the OP?
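To make the question above concrete, here is a minimal plain-Python sketch of the three modes as they are described in this thread. It is not pandas code: `union_sketch` is a hypothetical helper, lists stand in for `Index` objects, and only one fastpath (equal inputs) is modelled, on the assumption that `sort=None` means "sort unless a fastpath was taken" and `sort=True` means "always sort".

```python
def union_sketch(left, right, sort=None):
    """Toy model of the sort modes discussed in this thread (not pandas code).

    sort=False -> never sort
    sort=None  -> sort, except when the equal-inputs fastpath is taken
    sort=True  -> always sort (the behaviour proposed in this issue)
    """
    if sort not in (None, False, True):
        raise ValueError(f"sort must be None, False or True, got {sort!r}")

    if left == right:
        # Fastpath: the inputs are equal, so there is nothing to merge.
        result = list(left)
        took_fastpath = True
    else:
        # General path: keep first-seen order and drop duplicates.
        seen = set()
        result = []
        for value in [*left, *right]:
            if value not in seen:
                seen.add(value)
                result.append(value)
        took_fastpath = False

    if sort is True or (sort is None and not took_fastpath):
        result = sorted(result)
    return result


# Different inputs: sort=None and sort=True agree.
print(union_sketch(["c", "b", "a"], ["a", "c", "b"], sort=None))  # ['a', 'b', 'c']
print(union_sketch(["c", "b", "a"], ["a", "c", "b"], sort=True))  # ['a', 'b', 'c']

# Equal inputs: only sort=True sorts, because sort=None takes the fastpath.
print(union_sketch([1, 0], [1, 0], sort=None))  # [1, 0]
print(union_sketch([1, 0], [1, 0], sort=True))  # [0, 1]
```

In this model the two modes differ only when the fastpath is hit, which matches the reading given in the comment above; whether real pandas has further unsorted special cases is exactly what the list in the OP is meant to enumerate.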
This is a pseudo-blocker for #48553, as there are some test warnings involving
That's maybe rather a bug in the MultiIndex non-sort case? Because the warning says to pass
The top post has a list of cases where the current default behaviour (`sort=None`) does not sort:
>>> idx1 = pd.Index(['c', 'b', 'a'])
>>> idx2 = pd.Index(['a', 'c', 'b'])
>>> idx1.union(idx2, sort=None)
Index(['a', 'b', 'c'], dtype='object') # <-- sorted
>>> idx1.union(idx2, sort=False)
Index(['c', 'b', 'a'], dtype='object')
>>> idx1.union(idx1.copy(), sort=None)
Index(['c', 'b', 'a'], dtype='object') # <-- not sorted
>>> idx1.union(idx1.copy(), sort=False)
Index(['c', 'b', 'a'], dtype='object')
So depending on the values, the default
So we originally used
…ntersection` (#13497)
This PR enables `sort=True` for `union`, `difference`, and `intersection` APIs in `Index`. This also fixes 1 pytest failure and adds 77 pytests:
On `Index_sort_2.0`:
```
= 230 failed, 95836 passed, 2045 skipped, 768 xfailed, 308 xpassed in 438.88s (0:07:18) =
```
On `pandas_2.0_feature_branch`:
```
= 231 failed, 95767 passed, 2045 skipped, 764 xfailed, 300 xpassed in 432.59s (0:07:12) =
```
xref: pandas-dev/pandas#25151
#25063 made the 0.24.x `sort=True` behavior be enabled by `sort=None`. This allows us to use `sort=True` to mean "always sort".
So rather than raising
We would instead return
This is a (hopefully) exhaustive list of special cases not sorted when `sort=None`.
`union`: