-
-
Notifications
You must be signed in to change notification settings - Fork 18.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mi indexing #15425
Mi indexing #15425
Conversation
# passing a dataframe as a key with a MultiIndex | ||
index = MultiIndex.from_product([[1, 2, 3], ['A', 'B', 'C']]) | ||
x = Series(index=index, data=range(9), dtype=np.float64) | ||
idx_keys = [(1, 'A'), (2, 'C'), (3, 'B')] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you also add tests for empty Series/DataFrame indexers (Series in test above)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use an empty series / dataframe as an indexer (no data)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Series(data=[], dtype=np.float64)
isn't empty?!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahh it's in the above be test ok then
lgtm. just a small test addition. ping when pushed and green. |
Codecov Report
@@ Coverage Diff @@
## master #15425 +/- ##
==========================================
+ Coverage 90.37% 90.37% +<.01%
==========================================
Files 135 135
Lines 49463 49467 +4
==========================================
+ Hits 44702 44707 +5
+ Misses 4761 4760 -1
Continue to review full report at Codecov.
|
The ability to index with a DataFrame is actually a new feature, not a bug fix. And thus it should also documented like that if we include it. |
Ouch, right. I caught #15424 while trying to index with a That said, I would love that feature. I understand the "complexity" argument, but I propose to reason backwards: if we don't want the column names to align on the level names, then there's no complexity added - it's already done (and really, Concerning this PR: in the first case, I would just add a |
By the way: indexing with a Series is also not documented, right?
|
yeah I agree with @jorisvandenbossche how is this actually useful? it is just plain confusing, e.g. what dimensions are you actually indexing on? what happens if the levels don't match, or partially match. I think this is too much of a rabbit hole. Take this our for now (and just leave the series fixing). Ok with actually raising NotImplemented Error though if a df is passed like this. You can raise a separate issue with some well-defined use cases for consideration though. |
raise NotImplementedError("Indexing a MultiIndex with a " | ||
"DataFrame key is not " | ||
"implemented") | ||
elif hasattr(key, 'ndim') and key.ndim > 1: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test using a numpy scalar as well (which is 0-dim), e.g. np.int64(5)
(I think this works anyhow)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
pandas/core/indexing.py
Outdated
@@ -1525,11 +1525,22 @@ def _getitem_axis(self, key, axis=0): | |||
# possibly convert a list-like into a nested tuple | |||
# but don't convert a list-like of tuples |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you update this a bit (explain the goal of what is happening)
@@ -1521,15 +1521,24 @@ def _getitem_axis(self, key, axis=0): | |||
return self._getbool_axis(key, axis=axis) | |||
elif is_list_like_indexer(key): | |||
|
|||
# GH 7349 | |||
# possibly convert a list-like into a nested tuple | |||
# but don't convert a list-like of tuples |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Related to (the code I currently don't understand in) #15448
lgtm. ping on green. |
thanks @toobaz |
…xers closes pandas-dev#15424 closes pandas-dev#15434 Author: Pietro Battiston <[email protected]> Closes pandas-dev#15425 from toobaz/mi_indexing and squashes the following commits: 2ba2d5d [Pietro Battiston] Updated comment 900e3ce [Pietro Battiston] whatsnew 8467b57 [Pietro Battiston] Tests for previous commit 17209f3 [Pietro Battiston] BUG: support indexing MultiIndex with 1-D array 7606114 [Pietro Battiston] Whatsnew 0b719f5 [Pietro Battiston] Test for previous commit 1f2f385 [Pietro Battiston] BUG: Fix indexing MultiIndex with Series with 0 not index
git diff upstream/master | flake8 --diff
This also fixes indexing with a
DataFrame
, which was actually not mentioned in #15424 (but it's a twoliner)