BUG: x in MultiIndex.drop(x) #19054
@@ -705,6 +705,24 @@ def test_multiindex_symmetric_difference(self):
        result = idx ^ idx2
        assert result.names == [None, None]

    def test_multiindex_contains_dropped(self):
        # GH 19027

can you add a comment

        idx = MultiIndex.from_product([[1, 2], [3, 4]])
        assert 2 in idx
        idx = idx.drop(2)

        # drop implementation keeps 2 in the levels

blank line before the comment

        assert 2 in idx.levels[0]
        # but it should no longer be in the index itself

can you test with a non-integer MI as well

This comment made me realize that I had focused on integers specifically when this should apply to all hashable keys (sorry). But that would mean tuples would then be treated differently from how they currently are, e.g.:
So, should it assume that a nested tuple like in

In the same vein,

I think the idea should just be: any key

Sure, in principle we could try with

Got it, thanks!
produces a performance warning for slightly larger multi-indices:
Or we could use the fact that a key that has been dropped from a multi-index will return an empty slice when fed into the index's
which does not produce the performance warnings and is ~3 times faster and simply adds to what it is right now: pandas/pandas/core/indexes/multi.py, lines 2123 to 2125 in 4a8496b
but is potentially less clear (and more hacky?). Which one should I use?

I ended up using the faster one and added a comment to explain what's going on.

Yes, I think it makes sense

        assert 2 not in idx

        # also applies to strings
        idx = MultiIndex.from_product([['a', 'b'], ['c', 'd']])
        assert 'a' in idx
        idx = idx.drop('a')
        assert 'a' in idx.levels[0]
        assert 'a' not in idx


class TestMultiIndexSlicers(object):

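For readers following the thread above, here is a minimal sketch of the "empty slice" idea in plain Python. It is a toy model, not the pandas implementation: the class `ToyLevelIndex` and its method `level_slice` are hypothetical names invented for illustration, and the model assumes a single level whose codes are kept sorted. It only shows why a label can stay in the levels after a drop while a lookup for it yields an empty slice.

```python
from bisect import bisect_left, bisect_right


class ToyLevelIndex:
    """Toy stand-in for one level of a multi-index (hypothetical, not pandas code).

    `levels` holds the unique labels; `codes` holds, for each row, the
    position of that row's label in `levels`. Codes are kept sorted.
    """

    def __init__(self, levels, codes):
        self.levels = list(levels)
        self.codes = sorted(codes)

    def drop(self, label):
        # Remove the rows carrying this label, but keep the label in `levels`.
        code = self.levels.index(label)
        return ToyLevelIndex(self.levels, [c for c in self.codes if c != code])

    def level_slice(self, label):
        # Locate the contiguous run of rows whose code matches the label.
        code = self.levels.index(label)
        start = bisect_left(self.codes, code)
        stop = bisect_right(self.codes, code)
        return slice(start, stop)

    def __contains__(self, label):
        if label not in self.levels:
            return False
        found = self.level_slice(label)
        # An empty slice means the label survives in `levels` but is unused.
        return found.start != found.stop


idx = ToyLevelIndex(levels=[1, 2], codes=[0, 0, 1, 1])
dropped = idx.drop(2)
```

In this sketch, `2 in dropped.levels` stays true while `2 in dropped` becomes false, which mirrors the pair of assertions in the test above. The membership check costs two binary searches rather than a scan over all rows, which is one plausible reason an approach along these lines would be the faster option.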
what test exercises this?

pandas/pandas/tests/test_multilevel.py, line 1195 in 35b2aba
pandas/pandas/tests/frame/test_reshape.py, line 136 in 35b2aba
pandas/pandas/tests/frame/test_reshape.py, line 730 in 35b2aba
These were the 3. And it looks like Travis doesn't have permalinks to specific lines like GitHub?

Also just took a look at the test that caused most of the builds to fail:
pandas/pandas/tests/frame/test_mutate_columns.py, lines 186 to 198 in 35b2aba
and we just changed this so I'll go ahead and negate the assert.