Index division by zero not filled #19322

jbrockmendel · 2018-01-20T07:37:47Z


idx = pd.Index(range(5))
>>> idx
Int64Index([0, 1, 2, 3, 4], dtype='int64')

>>> idx / 0
Int64Index([0, 0, 0, 0, 0], dtype='int64')

>>> idx // 0
Int64Index([0, 0, 0, 0, 0], dtype='int64')

>>> idx % 0
Int64Index([0, 0, 0, 0, 0], dtype='int64')

>>> divmod(idx, 0)
(Int64Index([0, 0, 0, 0, 0], dtype='int64'), Int64Index([0, 0, 0, 0, 0], dtype='int64'))

>>> idx.astype('uint64') / 0
UInt64Index([0, 0, 0, 0, 0], dtype='uint64')

Others are OK:

>>> idx.__truediv__(0)
Float64Index([nan, inf, inf, inf, inf], dtype='float64')

>>> idx.astype('float64') / 0
Float64Index([nan, inf, inf, inf, inf], dtype='float64')

The text was updated successfully, but these errors were encountered:

jreback · 2018-01-20T18:37:06Z

this is what numpy does though i think we changed for Series a long time ago

jbrockmendel · 2018-01-20T20:43:43Z

The test from #9308 for Series division is now in tests.series.test_operators. Splitting it into separate tests for Series.__div__, Series.__rdiv__, Series.__floordiv__ and parameterizing it with different dtypes and different zero-like other args, the behavior is inconsistent.

# For testing division by (or of) zero for Series with length 3, this
# gives several scalar-zeros and length-3 vector-zeros
zeros = [box([0, 0, 0], dtype=dtype)
         for box in [pd.Series, pd.Index, np.array]
         for dtype in [np.int64, np.uint64, np.float64]]
zeros.extend([np.array(0, dtype=dtype)
             for dtype in [np.int64, np.uint64, np.float64]])
zeros.extend([0, 0.0, long(0)])

class TestSeriesArithmetic(object):
    @pytest.mark.parametrize('dtype', [np.int64, np.uint64, np.float64])
    @pytest.mark.parametrize('zero', zeros)
    def test_div_zero(self, dtype, zero):
        # GH#9144
        # dtype stability GH#19322
        ser = Series([-1, 0, 1]).astype(dtype)

        result = ser / zero
        expected = Series([-np.inf, np.nan, np.inf])
        assert_series_equal(result, expected)

15 of these fail, 38 for the analogous __floordiv__ test. Before I try to fix this: is there a convention we're following where 0 / 0, 1 / 0, and -1 / 0 may vary depending on the exact dtypes of the 1s and 0s? Or should these always be np.nan, np.inf, and -np.inf, respectively?

Update The tests with dtype==np.uint64 are wrong b/c of casting.

jreback · 2018-01-20T21:15:03Z

there is an old issue and whatsnew about this see if u can find it
maybe version .11 or around there

jbrockmendel · 2018-01-20T21:51:50Z

0.12.0 has

  - Fix modulo and integer division on Series,DataFrames to act similarly to ``float`` dtypes to return
    ``np.nan`` or ``np.inf`` as appropriate (:issue:`3590`). This correct a numpy bug that treats ``integer``
    and ``float`` dtypes differently.

It looks like #3600 and #9308 are both about this, but unsigned ints appear to be missing form the tests, along with a bunch of the reverse ops. I can patch this and implement a thorough set of test cases, but need to confirm that all ints and all zeros are interchangeable for this purpose.

jreback · 2018-01-20T21:56:11Z

this is tricky as numpy does different things for uint and int
they don’t convert to float

i think we should though - see what breaks

jbrockmendel · 2018-01-21T17:53:22Z

Making some progress on this. It's liable to be a big diff, so my thought is to first write all the tests and xfail the ones that are currently wrong (allowing a reviewer to double-check my understanding of the desired behavior), then make reasonably-sized PRs to un-xfail those tests.

jreback · 2018-01-21T18:06:56Z

yeah I think these should all match what we do in series now.

In [1]: idx = pd.Index(range(5))

In [2]: s = Series(idx)

In [3]: s / 0
Out[3]: 
0    NaN
1    inf
2    inf
3    inf
4    inf
dtype: float64

In [4]: idx / s
Out[4]: 
0    NaN
1    1.0
2    1.0
3    1.0
4    1.0
dtype: float64

In [5]: idx / 0
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-5-99f2096171d8> in <module>()
----> 1 idx / 0

~/pandas/pandas/core/indexes/range.py in _evaluate_numeric_binop(self, other)
    600                     if step:
    601                         with np.errstate(all='ignore'):
--> 602                             rstep = step(self._step, other)
    603 
    604                         # we don't have a representable op

ZeroDivisionError: division by zero

In [6]: idx // 0
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-6-1648d8a4476e> in <module>()
----> 1 idx // 0

~/pandas/pandas/core/indexes/range.py in __floordiv__(self, other)
    553         if is_integer(other):
    554             if (len(self) == 0 or
--> 555                     self._start % other == 0 and
    556                     self._step % other == 0):
    557                 start = self._start // other

ZeroDivisionError: integer division or modulo by zero

In [7]: s // 0
Out[7]: 
0    NaN
1    inf
2    inf
3    inf
4    inf
dtype: float64

yes xfailing things would be ok. Note since these are in ops, It think it would make sense to split out the giant tests_ops to sub-modules first :> (I know that's for Series), but same idea.

jbrockmendel · 2019-07-20T03:00:38Z

Closed by #27321.

jbrockmendel mentioned this issue Jan 20, 2018

BUG: 0/frame numeric ops buggy (GH9144) #9308

Merged

jreback added Dtype Conversions Unexpected or buggy dtype conversions Numeric Operations Arithmetic, Comparison, and Logical operations Compat pandas objects compatability with Numpy or Python functions Difficulty Intermediate labels Jan 21, 2018

jreback added this to the Next Major Release milestone Jan 21, 2018

jbrockmendel mentioned this issue Jan 22, 2018

separate numeric tests so we can isolate division by zero #19336

Merged

jbrockmendel mentioned this issue Feb 7, 2018

DataFrame vs Series vs Index arithmetic Roundup #18824

Closed

59 tasks

jbrockmendel mentioned this issue May 23, 2018

ENH: Support operators for ExtensionArray #20889

Closed

4 tasks

jbrockmendel closed this as completed Jul 20, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Index division by zero not filled #19322

Index division by zero not filled #19322

jbrockmendel commented Jan 20, 2018

jreback commented Jan 20, 2018

jbrockmendel commented Jan 20, 2018 •

edited

Loading

jreback commented Jan 20, 2018

jbrockmendel commented Jan 20, 2018

jreback commented Jan 20, 2018

jbrockmendel commented Jan 21, 2018

jreback commented Jan 21, 2018

jbrockmendel commented Jul 20, 2019

Index division by zero not filled #19322

Index division by zero not filled #19322

Comments

jbrockmendel commented Jan 20, 2018

jreback commented Jan 20, 2018

jbrockmendel commented Jan 20, 2018 • edited Loading

jreback commented Jan 20, 2018

jbrockmendel commented Jan 20, 2018

jreback commented Jan 20, 2018

jbrockmendel commented Jan 21, 2018

jreback commented Jan 21, 2018

jbrockmendel commented Jul 20, 2019

jbrockmendel commented Jan 20, 2018 •

edited

Loading