BUG: when Index is numeric and indexer is boolean (#16877) #17343

andrejonasson · 2017-08-26T06:38:29Z

[ X ] closes BUG: boolean indexing error with .drop() #16877
[ X ] tests added / passed
[ X ] passes git diff upstream/master -u -- "*.py" | flake8 --diff
[ X ] whatsnew entry

This warns for both tickets:
#6189
#16877

There's one particular flaw: non-unique indices never run get_indexer and deals with dropping in a way where it is difficult to inspect types. I think it should be OK to leave that one for now though.

One of the reasons I chose to warn instead of Error is that running some_numeric_index.difference([True, False]) should possibly still be allowed, while still warning the user that the difference they are getting may just be due to conversion from True to 1 (or 1.0) and False to 0 (or 0.0). This probably is rare enough that an Error might still be the best approach.

The questions remaining to me are:
Is this severe enough that an Error should be raised? What type in this case?
Is the message sufficient?

Thanks.

codecov · 2017-08-26T08:04:58Z

Codecov Report

Merging #17343 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #17343      +/-   ##
==========================================
- Coverage   91.03%   91.02%   -0.02%     
==========================================
  Files         162      162              
  Lines       49567    49569       +2     
==========================================
- Hits        45125    45118       -7     
- Misses       4442     4451       +9

Flag	Coverage Δ
#multiple	`88.8% <100%> (ø)`	⬆️
#single	`40.24% <50%> (-0.06%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/indexes/base.py	`95.88% <100%> (ø)`	⬆️
pandas/io/gbq.py	`25% <0%> (-58.34%)`	⬇️
pandas/core/frame.py	`97.72% <0%> (-0.1%)`	⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 473a7f3...44ffaf1. Read the comment docs.

codecov · 2017-08-26T08:05:02Z

Codecov Report

Merging #17343 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #17343      +/-   ##
==========================================
- Coverage   91.23%   91.21%   -0.02%     
==========================================
  Files         163      163              
  Lines       49808    49810       +2     
==========================================
- Hits        45442    45435       -7     
- Misses       4366     4375       +9

Flag	Coverage Δ
#multiple	`89.01% <100%> (ø)`	⬆️
#single	`40.32% <50%> (-0.06%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/indexes/base.py	`96.29% <100%> (ø)`	⬆️
pandas/io/gbq.py	`25% <0%> (-58.34%)`	⬇️
pandas/core/frame.py	`97.73% <0%> (-0.1%)`	⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 83c9205...1e6c013. Read the comment docs.

toobaz · 2017-08-26T12:18:57Z

Is this severe enough that an Error should be raised? What type in this case?

I'm a bit in favour of raising an error (KeyError). Actually, I think the error shouldn't necessarily illustrate the dtype problem... just reporting "keys not found" would be enough.

But more in general, I wonder whether a BooleanIndex would be worth introducing, even just as an internal class. I at least didn't expect

In [2]: pd.MultiIndex.from_product([[1,2], [True, False]]).levels[1].dtype
Out[2]: dtype('O')

but that's probably just coherent with BooleanIndex not existing (MultiIndex levels of Booleans make sense, while admittedly flat indexes of Booleans don't).

Anyway, I'm afraid the current proposal is missing the following:

In [2]: pd.Series([1,2,3], index=[0, 1, False]).drop([False])
Out[2]: 
1    2
dtype: int64

gfyoung · 2017-08-26T12:57:27Z

We generally try to avoid going from no error message / warning to direct error message because for downstream users, that could be too drastic of a change. A warning should suffice for now I would think.

gfyoung · 2017-08-26T12:57:59Z

doc/source/whatsnew/v0.21.0.txt

@@ -358,6 +358,7 @@ Indexing
 - Allow unicode empty strings as placeholders in multilevel columns in Python 2 (:issue:`17099`)
 - Bug in ``.iloc`` when used with inplace addition or assignment and an int indexer on a ``MultiIndex`` causing the wrong indexes to be read from and written to (:issue:`17148`)
 - Bug in ``.isin()`` in which checking membership in empty ``Series`` objects raised an error (:issue:`16991`)
+- Warns when a numeric ``Index``'s ``get_indexer`` is passed a boolean indexer (:issue:`16877`)


I think you might be able to use func here ? @jorisvandenbossche @jreback

gfyoung · 2017-08-26T12:58:29Z

pandas/tests/indexes/test_base.py

+        tm.assert_numpy_array_equal(actual, expected, check_dtype=False)
+
+        numeric_idx = pd.Index(range(4))
+        with pytest.warns(UserWarning):


Use our own tm.assert_produces_warning instead from pandas.util.testing.

gfyoung · 2017-08-26T12:59:24Z

pandas/core/indexes/base.py

+        if target.is_boolean() and self.is_numeric():
+            warnings.warn('Index is numeric and indexer is boolean, '
+                          'possible unintended conversion from boolean '
+                          'to numeric', UserWarning, stacklevel=1)


I think we can rephrase the prose here. Proper grammar and slightly more formality in the writing would be good here.

'Passing a boolean indexer to an Index containing numeric indices may cause the indexer to behave as being of numeric type'

@gfyoung Thanks for the feedback. Would the above work? Any pointers would be welcome.

toobaz · 2017-08-26T14:21:49Z

We generally try to avoid going from no error message / warning to direct error message because for downstream users, that could be too drastic of a change.

OK: let me just emphasize that in this specific case it's very likely that anyone indexing a numeric index with boolean labels is doing something wrong/assuming a behaviour (e.g. in .drop()) which is actually not supported.

jreback · 2017-08-26T15:28:02Z

i am ok in raising in this case
this is a clear error

andrejonasson · 2017-08-26T15:42:03Z

@toobaz The example index [0, 1, False] is a duplicate index (0 and False are equivalent), which I described as not being covered by this PR. However, you are right that the PR does not cover the case with unique mixed-integer indices (e.g. [0, True, 2]). I will add a "self.holds_integer()" to the condition to cover this, since the function is already implemented.

This would not cover "mixed-float" (e.g. [1.5, True]), which is 'mixed' inferred type in pandas but 'mixed' could also be ['ab', True] in which case it would be an incorrect warning. So I'll leave that one out.

Edit: I will only add a self.holds_integer() condition in case it's a warning. Since a user with [0, True, 2] index should be able to drop the True without an Error. Even if that user may drop the 0 by passing [False].

toobaz · 2017-08-26T16:14:45Z

I will add a "self.holds_integer()" to the condition to cover this, since the function is already implemented.

Actually, thinking to my example again I realize that you are right... but that this PR might not be the fix we need. My example should not raise an error (neither a warning); however, it should give a different result (not dropping the 0). So I'm afraid the fix is a bit deeper (wherever booleans are automatically casted to integers).

That said, waiting for a better fix, your PR certainly won't harm.

toobaz · 2017-08-26T16:19:35Z

Just for context:

In [2]: pd.Series([1,2,3], index=[0, 1, 's']).loc[True]
Out[2]: 2

In [3]: pd.Series([1,2,3], index=[0, 1, True]).loc[True]
Out[3]: 
1       2
True    3
dtype: int64

In [4]: pd.Series([1,2,3], index=[0, 1, 2]).loc[True]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/home/nobackup/repo/pandas/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2457             try:
-> 2458                 return self._engine.get_loc(key)
   2459             except KeyError:

/home/nobackup/repo/pandas/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5272)()
    128 
--> 129     cpdef get_loc(self, object val):
    130         if is_definitely_invalid_key(val):

/home/nobackup/repo/pandas/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5077)()
    147 
--> 148         self._check_type(val)
    149 

/home/nobackup/repo/pandas/pandas/_libs/index_class_helper.pxi in pandas._libs.index.Int64Engine._check_type (pandas/_libs/index.c:16899)()
    135         if util.is_bool_object(val):
--> 136             raise KeyError(val)
    137         elif util.is_float_object(val):

KeyError: True

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
/home/nobackup/repo/pandas/pandas/core/indexing.py in _get_label(self, label, axis)
    129             try:
--> 130                 return self.obj._xs(label, axis=axis)
    131             except:

/home/nobackup/repo/pandas/pandas/core/generic.py in xs(self, key, axis, level, drop_level)
   2242         else:
-> 2243             loc = self.index.get_loc(key)
   2244 

/home/nobackup/repo/pandas/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2459             except KeyError:
-> 2460                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2461 

/home/nobackup/repo/pandas/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5272)()
    128 
--> 129     cpdef get_loc(self, object val):
    130         if is_definitely_invalid_key(val):

/home/nobackup/repo/pandas/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5077)()
    147 
--> 148         self._check_type(val)
    149 

/home/nobackup/repo/pandas/pandas/_libs/index_class_helper.pxi in pandas._libs.index.Int64Engine._check_type (pandas/_libs/index.c:16899)()
    135         if util.is_bool_object(val):
--> 136             raise KeyError(val)
    137         elif util.is_float_object(val):

KeyError: True

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-4-b1a86c8195a6> in <module>()
----> 1 pd.Series([1,2,3], index=[0, 1, 2]).loc[True]

/home/nobackup/repo/pandas/pandas/core/indexing.py in __getitem__(self, key)
   1333         else:
   1334             key = com._apply_if_callable(key, self.obj)
-> 1335             return self._getitem_axis(key, axis=0)
   1336 
   1337     def _is_scalar_access(self, key):

/home/nobackup/repo/pandas/pandas/core/indexing.py in _getitem_axis(self, key, axis)
   1557         # fall thru to straight lookup
   1558         self._has_valid_type(key, axis)
-> 1559         return self._get_label(key, axis=axis)
   1560 
   1561 

/home/nobackup/repo/pandas/pandas/core/indexing.py in _get_label(self, label, axis)
    130                 return self.obj._xs(label, axis=axis)
    131             except:
--> 132                 return self.obj[label]
    133         elif isinstance(label, tuple) and isinstance(label[axis], slice):
    134             raise IndexingError('no slices here, handle elsewhere')

/home/nobackup/repo/pandas/pandas/core/series.py in __getitem__(self, key)
    617         key = com._apply_if_callable(key, self)
    618         try:
--> 619             result = self.index.get_value(self, key)
    620 
    621             if not is_scalar(result):

/home/nobackup/repo/pandas/pandas/core/indexes/base.py in get_value(self, series, key)
   2491         try:
   2492             return self._engine.get_value(s, k,
-> 2493                                           tz=getattr(series.dtype, 'tz', None))
   2494         except KeyError as e1:
   2495             if len(self) > 0 and self.inferred_type in ['integer', 'boolean']:

/home/nobackup/repo/pandas/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value (pandas/_libs/index.c:4396)()
     93         return val in self.mapping
     94 
---> 95     cpdef get_value(self, ndarray arr, object key, object tz=None):
     96         """
     97         arr : 1-dimensional ndarray

/home/nobackup/repo/pandas/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value (pandas/_libs/index.c:4079)()
    101             void* data_ptr
    102 
--> 103         loc = self.get_loc(key)
    104         if PySlice_Check(loc) or cnp.PyArray_Check(loc):
    105             return arr[loc]

/home/nobackup/repo/pandas/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc (pandas/_libs/index.c:5077)()
    146             return self._get_loc_duplicates(val)
    147 
--> 148         self._check_type(val)
    149 
    150         try:

/home/nobackup/repo/pandas/pandas/_libs/index_class_helper.pxi in pandas._libs.index.Int64Engine._check_type (pandas/_libs/index.c:16899)()
    134         hash(val)
    135         if util.is_bool_object(val):
--> 136             raise KeyError(val)
    137         elif util.is_float_object(val):
    138             raise KeyError(val)

KeyError: True

pep8speaks · 2017-08-26T21:32:51Z

Hello @andrejonasson! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on September 25, 2017 at 14:07 Hours UTC

andrejonasson · 2017-08-26T22:09:47Z

@toobaz I agree. This PR can prevent cases like the issues referenced until the core of the problem is solved.

I have changed it to correctly return an array of repeated -1 when the Index is numeric and the indexer is boolean.

andrejonasson · 2017-08-27T06:11:18Z

>>> a = pd.Index([0,1,2,3])
>>> a.drop([True,False])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/andredotj/Projects/pandas/pandas/pandas/core/indexes/base.py", line 3641, in drop
    labels[mask])
ValueError: labels [ True False] not contained in axis
>>> a.difference([True, False])
Int64Index([0, 1, 2, 3], dtype='int64')

New update let's the client functions handle the missing values as they see fit e.g. difference correctly does not remove [0, 1].

For the example in #16877

>>> df = pd.DataFrame( data = {
...                          'acol'  : np.arange(4),
...                          'bcol' :  2*np.arange(4)
...                         })
>>> df.drop(df.bcol > 2, axis=0, inplace=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/andredotj/Projects/pandas/pandas/pandas/core/generic.py", line 2402, in drop
    new_axis = axis.drop(labels, errors=errors)
  File "/home/andredotj/Projects/pandas/pandas/pandas/core/indexes/base.py", line 3641, in drop
    labels[mask])
ValueError: labels [False False  True  True] not contained in axis

It correctly raises when passed the boolean target.

jorisvandenbossche · 2017-08-27T06:21:24Z

Just wanted to say "shouldn't this rather return -1 than raising an error", but I see you already did that!
This looks good to me.

In [3]: pd.Series([1,2,3], index=[0, 1, True]).loc[True]
Out[3]:
1 2
True 3
dtype: int64

@toobaz I am not sure we will ever be able to solve this. When you have an 'object' index, I think we can only rely on the python object's equality / hashing, and True and 1 are equal for Python (only their identity is different, but we don't rely on that)

andrejonasson · 2017-08-29T06:46:21Z

Is there anything that should be amended in the updated PR? Also, do I just push again to get AppVeyor to pass or will it automatically rebuild at some point?

Thanks!

gfyoung · 2017-08-29T07:24:14Z

doc/source/whatsnew/v0.21.0.txt

@@ -358,6 +358,7 @@ Indexing
 - Allow unicode empty strings as placeholders in multilevel columns in Python 2 (:issue:`17099`)
 - Bug in ``.iloc`` when used with inplace addition or assignment and an int indexer on a ``MultiIndex`` causing the wrong indexes to be read from and written to (:issue:`17148`)
 - Bug in ``.isin()`` in which checking membership in empty ``Series`` objects raised an error (:issue:`16991`)
+- When a numeric ``Index``'s :func:`get_indexer` is passed a boolean target it now returns the correct indexer  (:issue:`16877`)


@jreback @jorisvandenbossche : I don't quite remember the formatting for :func: and stuff, but somehow the beginning of the whatsnew looks incorrectly formatted. However, correct me if I'm wrong.

I would actually re-write this message so it is more user-oriented (the typical user does not know Index.get_indexer). In the end it is about boolean values that now raise when indexing a numeric index or dropping from such an index?

@gfyoung for :func:, to have it working, in this case you would need to write :func:`Index.get_indexer` (but as I said, I wouldn't add that, as typical user does not need to use that function)

@jorisvandenbossche : Ah, gotcha, thanks for clarifying!

agree, this issue is all about passing a boolean indexer to DataFrame.drop(), so say that.

gfyoung · 2017-08-29T07:24:42Z

pandas/tests/indexes/test_base.py

+        expected = np.array([0, 0, 1])
+        tm.assert_numpy_array_equal(actual, expected, check_dtype=False)
+
+    def test_get_indexer_numeric_index_boolean_target(self):


Reference issue number for both of your tests.

gfyoung · 2017-08-29T07:25:32Z

pandas/tests/indexes/test_base.py

+        boolean_idx = pd.Index([True, False])
+        actual = boolean_idx.get_indexer([True, True, False])
+        expected = np.array([0, 0, 1])
+        tm.assert_numpy_array_equal(actual, expected, check_dtype=False)


Why don't we check dtype here (we generally prefer to)? I think if you set the dtype of expected to be np.intp, you should be okay. Same for your test below.

gfyoung · 2017-08-29T07:26:13Z

@andrejonasson : Don't worry about AppVeyor. We'll take care of that for you. I left a couple of comments on your PR though.

jreback · 2017-08-30T10:21:16Z

pandas/tests/indexes/test_base.py

@@ -1131,6 +1131,18 @@ def test_get_indexer_strings(self):
        with pytest.raises(TypeError):
            idx.get_indexer(['a', 'b', 'c', 'd'], method='pad', tolerance=2)

+    def test_get_indexer_boolean_index_boolean_target(self):
+        boolean_idx = pd.Index([True, False])
+        actual = boolean_idx.get_indexer([True, True, False])


use result = rather than actual.

jreback · 2017-08-30T10:23:51Z

doc/source/whatsnew/v0.21.0.txt

@@ -358,6 +358,7 @@ Indexing
 - Allow unicode empty strings as placeholders in multilevel columns in Python 2 (:issue:`17099`)
 - Bug in ``.iloc`` when used with inplace addition or assignment and an int indexer on a ``MultiIndex`` causing the wrong indexes to be read from and written to (:issue:`17148`)
 - Bug in ``.isin()`` in which checking membership in empty ``Series`` objects raised an error (:issue:`16991`)
+- When a numeric ``Index``'s :func:`get_indexer` is passed a boolean target it now returns the correct indexer  (:issue:`16877`)


agree, this issue is all about passing a boolean indexer to DataFrame.drop(), so say that.

jreback · 2017-08-30T10:24:11Z

pandas/tests/indexes/test_base.py

@@ -1131,6 +1131,18 @@ def test_get_indexer_strings(self):
        with pytest.raises(TypeError):
            idx.get_indexer(['a', 'b', 'c', 'd'], method='pad', tolerance=2)

+    def test_get_indexer_boolean_index_boolean_target(self):
+        boolean_idx = pd.Index([True, False])


add the issue number here

jreback · 2017-08-30T10:24:34Z

pandas/tests/indexes/test_base.py

+        numeric_idx = pd.Index(range(4))
+        actual = numeric_idx.get_indexer([True, False, True])
+        expected = np.repeat(-1, 3)
+        tm.assert_numpy_array_equal(actual, expected, check_dtype=False)


pls add a test that directly exercises .drop() as in the original issue

andrejonasson · 2017-08-31T06:33:11Z

Hi, thanks for the feedback everyone.

I removed test_get_indexer_boolean_index_boolean_target - since when I tried to make it fail (by returning wrong indexer for instance) it was already covered by other tests. If redundancy is desirable I'll add it back in.

I added a direct test to pd.Series .drop() since it uses the same underlying function as pd.DataFrame.

jreback

some comments and pls rebase

jreback · 2017-09-07T01:21:31Z

pandas/core/indexes/base.py

@@ -2589,6 +2589,9 @@ def get_indexer(self, target, method=None, limit=None, tolerance=None):
        if tolerance is not None:
            tolerance = self._convert_tolerance(tolerance)

+        if target.is_boolean() and self.is_numeric():


can you add a comment here why

jreback · 2017-09-07T01:22:00Z

pandas/tests/indexes/test_base.py

+        # GH 16877
+        numeric_idx = pd.Index(range(4))
+        result = numeric_idx.get_indexer([True, False, True])
+        expected = np.repeat(-1, 3)


I think this needs to be intp_ dtype

jreback · 2017-09-07T01:23:11Z

doc/source/whatsnew/v0.21.0.txt

@@ -360,6 +360,7 @@ Indexing
 - Bug in ``.iloc`` when used with inplace addition or assignment and an int indexer on a ``MultiIndex`` causing the wrong indexes to be read from and written to (:issue:`17148`)
 - Bug in ``.isin()`` in which checking membership in empty ``Series`` objects raised an error (:issue:`16991`)
 - Bug in ``CategoricalIndex`` reindexing in which specified indices containing duplicates were not being respected (:issue:`17323`)
+- Bug in :func:`DataFrame.drop` caused boolean labels ``False`` and ``True`` to be treated as labels 0 and 1 respectively when dropping indices from a numeric index (:issue:`16877`)


add this will now raise.and move to api other changes section (even though its actually a bug fix, its highlited a bit more there).

andrejonasson · 2017-09-18T18:26:17Z

@jreback Is there anything else that needs editing?

Also is there some way to make sure I don't get merge conflicts due to the whatsnew line I've added (keeps on being the reason for my conflicts)?

gfyoung · 2017-09-18T18:46:32Z

Also is there some way to make sure I don't get merge conflicts due to the whatsnew line I've added

Really not much you can do here. People will modify the whatsnew in other PR's, causing the merge conflicts to occur. Just remain diligent to resolve them when they pop up.

gfyoung · 2017-09-18T18:47:14Z

doc/source/whatsnew/v0.21.0.txt

@@ -433,7 +433,7 @@ Other API Changes
 - :class:`Period` is now immutable, and will now raise an ``AttributeError`` when a user tries to assign a new value to the ``ordinal`` or ``freq`` attributes (:issue:`17116`).
 - :func:`to_datetime` when passed a tz-aware ``origin=`` kwarg will now raise a more informative ``ValueError`` rather than a ``TypeError`` (:issue:`16842`)
 - Renamed non-functional ``index`` to ``index_col`` in :func:`read_stata` to improve API consistency (:issue:`16342`)
-
+- Bug in :func:`DataFrame.drop` caused boolean labels ``False`` and ``True`` to be treated as labels 0 and 1 respectively when dropping indices from a numeric index, this will now raise (:issue:`16877`)


This is grammatically incorrect. "This will now raise" should be its own sentence.

jreback · 2017-09-23T20:49:48Z

can you rebase / update

jreback

lgtm. minor doc-comments

jreback · 2017-09-24T16:39:29Z

doc/source/whatsnew/v0.21.0.txt

@@ -481,7 +481,7 @@ Other API Changes
 - :class:`Period` is now immutable, and will now raise an ``AttributeError`` when a user tries to assign a new value to the ``ordinal`` or ``freq`` attributes (:issue:`17116`).
 - :func:`to_datetime` when passed a tz-aware ``origin=`` kwarg will now raise a more informative ``ValueError`` rather than a ``TypeError`` (:issue:`16842`)
 - Renamed non-functional ``index`` to ``index_col`` in :func:`read_stata` to improve API consistency (:issue:`16342`)
-
+- Bug in :func:`DataFrame.drop` caused boolean labels ``False`` and ``True`` to be treated as labels 0 and 1 respectively when dropping indices from a numeric index. This will now raise (:issue:`16877`)


add what its raising (ValueError)

jreback · 2017-09-24T16:39:48Z

pandas/core/indexes/base.py

@@ -2609,6 +2609,11 @@ def get_indexer(self, target, method=None, limit=None, tolerance=None):
        if tolerance is not None:
            tolerance = self._convert_tolerance(tolerance)

+        # Treat boolean labels passed to a numeric index as not found. Without
+        # this fix False and True would be treated as 0 and 1 respectively.
+        if target.is_boolean() and self.is_numeric():


add the issue number here as a comment

jorisvandenbossche · 2017-09-25T07:06:53Z

There is a windows failure (int32 vs int64)

jorisvandenbossche · 2017-09-25T07:12:09Z

pandas/core/indexes/base.py

+        # this fix False and True would be treated as 0 and 1 respectively.
+        # (GH #16877)
+        if target.is_boolean() and self.is_numeric():
+            return np.repeat(-1, target.size)


You need to put _ensure_patform_int around this before returning (similar to the final return in this function)

…d target is boolean (pandas-dev#16877)

TomAugspurger · 2017-09-25T20:08:43Z

I restarted the failed build (compile error, that I think has been fixed on master). Ping on green.

jorisvandenbossche · 2017-09-25T21:50:37Z

@andrejonasson Thanks!

…d target is boolean (pandas-dev#16877) (pandas-dev#17343)

andrejonasson force-pushed the 16877-err-boolean branch from 47340c3 to 44ffaf1 Compare August 26, 2017 08:04

gfyoung added Dtype Conversions Unexpected or buggy dtype conversions Error Reporting Incorrect or improved errors from pandas Indexing Related to indexing on series/frames, not to indexes themselves and removed Dtype Conversions Unexpected or buggy dtype conversions labels Aug 26, 2017

gfyoung reviewed Aug 26, 2017

View reviewed changes

andrejonasson force-pushed the 16877-err-boolean branch from 44ffaf1 to 661bfb6 Compare August 26, 2017 21:32

andrejonasson force-pushed the 16877-err-boolean branch from 661bfb6 to 4a10a11 Compare August 26, 2017 21:37

andrejonasson force-pushed the 16877-err-boolean branch 2 times, most recently from 0aaaad8 to 085b9a0 Compare August 27, 2017 05:50

gfyoung reviewed Aug 29, 2017

View reviewed changes

jreback changed the title ~~ERR: Warn when Index is numeric and indexer is boolean (#16877)~~ BUG: when Index is numeric and indexer is boolean (#16877) Aug 30, 2017

jreback added Bug and removed Error Reporting Incorrect or improved errors from pandas labels Aug 30, 2017

jreback requested changes Aug 30, 2017

View reviewed changes

andrejonasson force-pushed the 16877-err-boolean branch from 085b9a0 to 5f877fa Compare August 30, 2017 18:46

jreback requested changes Sep 7, 2017

View reviewed changes

andrejonasson force-pushed the 16877-err-boolean branch from 5f877fa to c0a2156 Compare September 7, 2017 18:48

andrejonasson force-pushed the 16877-err-boolean branch from c0a2156 to 5b22400 Compare September 18, 2017 18:15

gfyoung reviewed Sep 18, 2017

View reviewed changes

andrejonasson force-pushed the 16877-err-boolean branch from 5b22400 to 9c914ff Compare September 18, 2017 19:14

andrejonasson force-pushed the 16877-err-boolean branch 2 times, most recently from 9abe2a1 to 6e60ee1 Compare September 24, 2017 15:33

jreback approved these changes Sep 24, 2017

View reviewed changes

jreback added this to the 0.21.0 milestone Sep 24, 2017

andrejonasson force-pushed the 16877-err-boolean branch from 6e60ee1 to 0997ade Compare September 24, 2017 19:12

jorisvandenbossche reviewed Sep 25, 2017

View reviewed changes

ERR: get_indexer returns the correct indexer when Index is numeric an…

1e6c013

…d target is boolean (pandas-dev#16877)

andrejonasson force-pushed the 16877-err-boolean branch from 0997ade to 1e6c013 Compare September 25, 2017 14:07

jorisvandenbossche merged commit 45a795e into pandas-dev:master Sep 25, 2017

alanbato pushed a commit to alanbato/pandas that referenced this pull request Nov 10, 2017

ERR: get_indexer returns the correct indexer when Index is numeric an…

644fb32

…d target is boolean (pandas-dev#16877) (pandas-dev#17343)

jorisvandenbossche mentioned this pull request Nov 14, 2017

Unexpected behaviour while dropping #18287

Closed

No-Stream pushed a commit to No-Stream/pandas that referenced this pull request Nov 28, 2017

ERR: get_indexer returns the correct indexer when Index is numeric an…

f5672d9

…d target is boolean (pandas-dev#16877) (pandas-dev#17343)

toobaz mentioned this pull request Dec 1, 2017

errors and inconsistent behaviour when using a DataFrame or a boolean Series as an index #18579

Closed

BUG: when Index is numeric and indexer is boolean (#16877) #17343

BUG: when Index is numeric and indexer is boolean (#16877) #17343

Conversation

andrejonasson commented Aug 26, 2017

codecov bot commented Aug 26, 2017

Codecov Report

codecov bot commented Aug 26, 2017 • edited Loading

Codecov Report

toobaz commented Aug 26, 2017

gfyoung commented Aug 26, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

toobaz commented Aug 26, 2017

jreback commented Aug 26, 2017

andrejonasson commented Aug 26, 2017 • edited Loading

toobaz commented Aug 26, 2017

toobaz commented Aug 26, 2017

pep8speaks commented Aug 26, 2017 • edited Loading

Comment last updated on September 25, 2017 at 14:07 Hours UTC

andrejonasson commented Aug 26, 2017 • edited Loading

andrejonasson commented Aug 27, 2017

jorisvandenbossche commented Aug 27, 2017

andrejonasson commented Aug 29, 2017

gfyoung Aug 29, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gfyoung commented Aug 29, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

andrejonasson commented Aug 31, 2017

jreback left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

andrejonasson commented Sep 18, 2017

gfyoung commented Sep 18, 2017

Choose a reason for hiding this comment

jreback commented Sep 23, 2017

jreback left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jorisvandenbossche commented Sep 25, 2017

Choose a reason for hiding this comment

TomAugspurger commented Sep 25, 2017

jorisvandenbossche commented Sep 25, 2017

codecov bot commented Aug 26, 2017 •

edited

Loading

andrejonasson commented Aug 26, 2017 •

edited

Loading

pep8speaks commented Aug 26, 2017 •

edited

Loading

andrejonasson commented Aug 26, 2017 •

edited

Loading

gfyoung Aug 29, 2017 •

edited

Loading