BUG: binary comparison of numpy.int/float and Series #9369

dmsul · 2015-01-29T05:35:07Z

This only happens when the numpy object is on the left. It doesn't matter if it's an int or a float. The error is not raised with DataFrames.

After more poking around, it looks like this actually comes from a change in numpy between versions 1.8.2 and 1.9.0.

import pandas as pd
import numpy as np

s = pd.Series(np.arange(4))
arr = np.arange(4)

right = s < arr[0]
left = arr[0] > s

Running this yields

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
bug_lhs_numpy.py in <module>()
      6
      7 right = s < arr[0]
----> 8 left = arr[0] > s

C:\Anaconda\lib\site-packages\pandas-0.14.1_236_g989a51b-py2.7-win-amd64.egg\pandas\core\ops.py
ther)
    555             return NotImplemented
    556         elif isinstance(other, (pa.Array, pd.Series, pd.Index)):
--> 557             if len(self) != len(other):
    558                 raise ValueError('Lengths must match to compare')
    559             return self._constructor(na_op(self.values, np.asarray(other)),

TypeError: len() of unsized object

In [2]: type(arr[0])
Out[2]: numpy.int32

jreback · 2015-01-29T10:48:12Z

left = 0 > s works (i.e. with a Python scalar). So I think this is being treated as a 0-dim array (it's a np.int64) and not as a scalar when called. I'll mark as a bug. Feel free to dig in.
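To make the scalar versus 0-dim array distinction concrete, here is a minimal sketch using only numpy. It does not reproduce the pandas code path; it only shows that a numpy scalar and the 0-dim array built from it are different types, and that calling len() on the 0-dim array gives the same message as the traceback above. The variable names are illustrative.

```python
import numpy as np

scalar = np.int64(0)            # numpy scalar, like arr[0] in the report
zero_dim = np.asarray(scalar)   # 0-dim ndarray holding the same value

print(isinstance(scalar, np.ndarray))     # False
print(isinstance(zero_dim, np.ndarray))   # True
print(zero_dim.shape)                     # ()

# A 0-dim array has no length, so a len() check on it fails.
try:
    len(zero_dim)
    message = None
except TypeError as exc:
    message = str(exc)

print(message)  # len() of unsized object
```

So any code that receives the 0-dim array and runs a length check on it, as ops.py does at line 557 in the traceback, will raise this TypeError.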

dmsul · 2015-01-30T04:11:27Z

It is indeed being converted at some point to a 0-dim array (ipython %debug). But I'm fairly certain it's something on the numpy side. (As long as numpy<=1.8.2, everything is fine in pandas>=0.14.0. As soon as you bump numpy to 1.9.0, the example code raises the error on every version of pandas.) I think this one is going to be way over my head. But if I happen to learn C before someone else takes care of this, I'll give it a try.

tvyomkesh · 2015-02-09T17:29:00Z

Thinking about this from a slightly different angle: Series comparison seems to work fine if the LHS or RHS is a one-element list.

In [1]: import pandas as pd
In [2]: import numpy as np
In [3]: [0] < pd.Series(np.arange(4))
Out[3]:
0    False
1     True
2     True
3     True
dtype: bool
In [8]: pd.Series(np.arange(4)) > [0]
Out[8]:
0    False
1     True
2     True
3     True
dtype: bool

The question is whether the behavior should be modified so that we get the same answer with a one-element pd.Series or np.ndarray as with a list or scalar. numpy works with both a scalar and a one-element np.ndarray.

In [9]: np.arange(1) < np.arange(4)
Out[9]: array([False,  True,  True,  True], dtype=bool)
In [10]: 0 < np.arange(4)
Out[10]: array([False,  True,  True,  True], dtype=bool)

Right now these comparisons throw an error. With the proposed fix, they can be expected to work fine.

In [3]: pd.Series(np.arange(1)) < pd.Series(np.arange(4))
In [4]: np.arange(1) < pd.Series(np.arange(4))
In [5]: pd.Series(np.arange(4)) > pd.Series(np.arange(1))
In [6]: pd.Series(np.arange(4)) > np.arange(1)
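As a rough illustration of the behavior being proposed, the sketch below collapses a one-element list-like to a scalar before comparing, so that a list, a one-element array, and a plain scalar all give the same answer. The helper name squeeze_single is hypothetical and is not part of pandas or numpy; this is a user-side sketch under that assumption, not the fix that was or would be implemented in pandas.

```python
import numpy as np


def squeeze_single(other):
    """Hypothetical helper: return a scalar for one-element list-likes.

    Anything that is not a 0-dim or 1-dim object with exactly one
    element is returned unchanged.
    """
    arr = np.asarray(other)
    if arr.ndim <= 1 and arr.size == 1:
        return arr.item()
    return other


values = np.arange(4)

# A scalar, a one-element list, and a one-element array behave alike.
results = [values > squeeze_single(rhs) for rhs in (0, [0], np.arange(1))]
for result in results:
    print(result.tolist())  # [False, True, True, True]
```

Unwrapping explicitly like this keeps the comparison independent of how a given pandas version treats length-1 operands.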

wadawson · 2015-02-18T22:45:34Z

I can confirm @dmsul's diagnosis of this bug. I'm afraid I don't have time to correct this at the moment, but +1 for a fix.

gliptak · 2016-05-01T01:02:25Z

The basic form of this is:

import numpy as np
import pandas as pd
s = pd.Series([1])
b = np.int32(1)
b < s

type(b) before the call is <class 'numpy.int32'>, while it is <class 'numpy.ndarray'> within ops.wrapper (other).
How can the type of this variable change?

jreback · 2016-05-01T01:21:28Z

both are correct

np.int32 is a 0 dim scalar that's also an ndarray

gliptak · 2016-05-01T01:35:13Z

This is what I get:

In [12]: isinstance(np.int32(1), np.ndarray)
Out[12]: False

but somehow it becomes True within ops.wrapper ...

gliptak · 2016-05-01T19:16:26Z

Thoughts on how to look into this further? Thanks

jreback · 2016-05-01T20:15:40Z

@gliptak you need to step thru and debug

gliptak · 2016-05-01T20:23:25Z

I did that already ... Before the call into the function it is int32 1, within the function it shows ndarray 1. Pointers to what translation could have happened in between are welcome.
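One way to reason about where such a translation can happen is to look at Python's reflected comparison protocol on its own. The sketch below uses only toy classes (Probe, PlainLeft, Wrapped, and WrappingLeft are all hypothetical names, not numpy or pandas types). It shows that when the left operand simply declines, Python passes the original left object to the right operand's reflected method unchanged. A different type can only arrive if the left operand's own __lt__ converts itself before handing off. Whether numpy 1.9 took exactly this path is an assumption, not something established in this thread.

```python
class Probe:
    """Stands in for the right-hand operand; records what it receives."""

    def __init__(self):
        self.seen = None

    def __gt__(self, other):
        # Python calls this reflected method for `left < probe`
        # once the left operand has declined the comparison.
        self.seen = type(other).__name__
        return True


class PlainLeft:
    """Left operand that declines, leaving the hand-off to Python."""

    def __lt__(self, other):
        return NotImplemented


class Wrapped:
    """Toy stand-in for a 0-dim array built from the left operand."""

    def __init__(self, inner):
        self.inner = inner


class WrappingLeft:
    """Left operand that converts itself, then hands off explicitly."""

    def __lt__(self, other):
        return other.__gt__(Wrapped(self))


probe = Probe()

_ = PlainLeft() < probe
seen_plain = probe.seen
print(seen_plain)    # PlainLeft

_ = WrappingLeft() < probe
seen_wrapping = probe.seen
print(seen_wrapping)  # Wrapped
```

In this model, stepping into the left operand's comparison method is where the type change would become visible, rather than in the right operand's wrapper.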

jbrockmendel · 2018-10-24T03:22:08Z

AFAICT there are two more-or-less separate issues here: Series comparison with numpy scalar (works fine, probably has for a while), and Series comparison with 1-element listlike (not supported).

We recently changed DataFrame broadcasting behavior to match numpy with 2-dimensional arrays with shape either (1, ncols) or (nrows, 1). We could consider doing the same for Series broadcasting against 1-dimensional objects with shape (1,).

@dmsul does this synopsis appear accurate?

dmsul · 2018-10-24T03:31:20Z

@jbrockmendel No idea, it's been almost 4 years since I looked at this bug.

As of numpy 1.12.1 and pandas 0.20.2 (just what I had in the nearest env at hand) there is no error, and I can't get an error when doing s < [0] or any number of permutations. Haven't tried it with ops.wrapper, but from a practitioner's standpoint I can't recreate it.

jbrockmendel · 2018-10-25T17:38:17Z

@jreback closeable?

jreback · 2018-10-25T23:55:55Z

yeah i think this was the array_priority which was fixed a long time ago

jreback added the Bug and Dtype Conversions labels Jan 29, 2015

jreback added this to the 0.16.0 milestone Jan 29, 2015

jreback modified the milestones: 0.16.0, Next Major Release Mar 5, 2015

jreback mentioned this issue Jun 15, 2015

Inconsistent Series/scalar comparison behavior based upon the scalar's type #10363

Closed

jreback modified the milestones: 0.17.0, Next Major Release Jun 15, 2015

jreback modified the milestones: Next Major Release, 0.17.0 Aug 19, 2015

jreback added Prio-low labels Aug 19, 2015

jreback mentioned this issue May 10, 2016

COMPAT: comparisons master issue #13129

Closed

7 tasks

jbrockmendel mentioned this issue Jul 24, 2018

DataFrame vs Series vs Index arithmetic Roundup #18824

Closed

59 tasks

jreback closed this as completed Oct 25, 2018

jorisvandenbossche modified the milestones: Contributions Welcome, No action Oct 26, 2018
