-
-
Notifications
You must be signed in to change notification settings - Fork 18.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Enable indexing with nullable Boolean #31591
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this. I think the tests you've deleted should all be updated instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some small comments, thanks for working on this!
@@ -124,20 +125,17 @@ def test_setitem_mask_raises(self, data, box_in_series): | |||
with pytest.raises(IndexError, match="wrong length"): | |||
data[mask] = data[0] | |||
|
|||
def test_setitem_mask_boolean_array_raises(self, data, box_in_series): | |||
# missing values in mask | |||
def test_setitem_mask_boolean_array_with_na(self, data, box_in_series): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this test duplicating the test_setitem_mask
above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somewhat, yes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can simply remove it then, I think?
@@ -30,13 +31,6 @@ will raise a ``ValueError``. | |||
mask = pd.array([True, False, pd.NA], dtype="boolean") | |||
s[mask] | |||
|
|||
The missing values will need to be explicitly filled with True or False prior | |||
to using the array as a mask. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you keep this example but reword it as something like "if you want different behaviour, you can fill manually with fillna(True)" ?
doc/source/whatsnew/v1.0.2.rst
Outdated
@@ -47,6 +47,33 @@ Bug fixes | |||
|
|||
.. --------------------------------------------------------------------------- | |||
|
|||
Indexing with Nullable Boolean Arrays | |||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you move this above bug fixes section, and make this of the same level? (so using ~~~
for underline)
(it's not a bug fix but a deliberate API change)
pandas/core/indexing.py
Outdated
@@ -2207,10 +2208,12 @@ def check_bool_indexer(index: Index, key) -> np.ndarray: | |||
"the indexed object do not match)." | |||
) | |||
result = result.astype(bool)._values | |||
else: | |||
elif is_object_dtype(key): | |||
# key might be sparse / object-dtype bool, check_array_indexer needs bool array |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you check this comment? (about the sparse from the code comment)
@jorisvandenbossche It seems that In [1]: import numpy as np
...: import pandas as pd
...: from pandas.core.indexing import check_bool_indexer
...: from pandas.core.indexers import check_array_indexer
...:
...: arr = pd.arrays.SparseArray([True, False])
...: idx = np.array([1, 2])
...:
...: check_array_indexer(idx, arr)
...:
Out[1]: array([ True, False]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. @jorisvandenbossche
That seems so then, yes. Last question: can you update the comment then? |
thanks @dsaxton very nice! |
hmm not sure why this didn’t backport automatically |
I'm not too familiar with how backporting works, do I just put a PR against the |
yes checkout the 1.0.x and cherry pick the commit then push to a new remote branch but before you do that let me see if i can get the bot to do it |
@meeseeksdev backport to 1.0.x |
Owee, I'm MrMeeseeks, Look at me. There seem to be a conflict, please backport manually. Here are approximate instructions:
And apply the correct labels and milestones. Congratulation you did some good work ! Hopefully your backport PR will be tested by the continuous integration and merged soon! If these instruction are inaccurate, feel free to suggest an improvement. |
@dsaxton ok instructions above |
closes #31503
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
I think from the discussion in #31503 that this is something people want to allow.