BUG: DataFrame.mad(axis=1) not working for nullable integer dtype #33036

Closed
jorisvandenbossche opened this issue Mar 26, 2020 · 6 comments · Fixed by #43170
Labels
good first issue
NA - MaskedArrays: Related to pd.NA and nullable extension arrays
Needs Tests: Unit test(s) needed to prevent regressions
Reduction Operations: sum, mean, min, max, etc.

Comments

@jorisvandenbossche
Member

In [32]: df = pd.DataFrame(np.random.randn(100000, 4).astype(int)).astype("Int64")  

In [33]: df.mad()  
Out[33]: 
0    0.363361
1    0.365456
2    0.369014
3    0.368195
dtype: float64

In [34]: df.mad(axis=1) 
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-34-ebf68ac71360> in <module>
----> 1 df.mad(axis=1)

~/scipy/pandas/pandas/core/generic.py in mad(self, axis, skipna, level)
  10042                 demeaned = data - data.mean(axis=0)
  10043             else:
> 10044                 demeaned = data.sub(data.mean(axis=1), axis=0)
  10045             return np.abs(demeaned).mean(axis=axis, skipna=skipna)
  10046 

~/scipy/pandas/pandas/core/ops/__init__.py in f(self, other, axis, level, fill_value)
    751             axis = self._get_axis_number(axis) if axis is not None else 1
    752             return _combine_series_frame(
--> 753                 self, other, pass_op, axis=axis, str_rep=str_rep
    754             )
    755         else:

~/scipy/pandas/pandas/core/ops/__init__.py in _combine_series_frame(left, right, func, axis, str_rep)
    551 
    552             array_op = get_array_op(func, str_rep=str_rep)
--> 553             bm = left._data.apply(array_op, right=values.T)
    554             return type(left)(bm)
    555 

~/scipy/pandas/pandas/core/internals/managers.py in apply(self, f, filter, align_keys, **kwargs)
    431 
    432             if callable(f):
--> 433                 applied = b.apply(f, **kwargs)
    434             else:
    435                 applied = getattr(b, f)(**kwargs)

~/scipy/pandas/pandas/core/internals/blocks.py in apply(self, func, **kwargs)
    365         """
    366         with np.errstate(all="ignore"):
--> 367             result = func(self.values, **kwargs)
    368 
    369         return self._split_op_result(result)

~/scipy/pandas/pandas/core/ops/array_ops.py in arithmetic_op(left, right, op, str_rep)
    203     if should_extension_dispatch(lvalues, rvalues) or isinstance(rvalues, Timedelta):
    204         # Timedelta is included because numexpr will fail on it, see GH#31457
--> 205         res_values = op(lvalues, rvalues)
    206 
    207     else:

~/scipy/pandas/pandas/core/ops/common.py in new_method(self, other)
     61         other = item_from_zerodim(other)
     62 
---> 63         return method(self, other)
     64 
     65     return new_method

~/scipy/pandas/pandas/core/arrays/integer.py in integer_arithmetic_method(self, other)
    618 
    619             if getattr(other, "ndim", 0) > 1:
--> 620                 raise NotImplementedError("can only perform ops with 1-d structures")
    621 
    622             if isinstance(other, IntegerArray):

NotImplementedError: can only perform ops with 1-d structures
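Until the 2-D arithmetic path above is supported, the row-wise result can be computed outside the nullable-integer ops. The following is a minimal sketch, assuming it is acceptable to convert the frame to a float NumPy array first; `row_mad` is a hypothetical helper name, not part of the pandas API.

```python
import numpy as np
import pandas as pd


def row_mad(df: pd.DataFrame) -> pd.Series:
    """Mean absolute deviation of each row around that row's mean.

    Hypothetical helper (not pandas API). Missing values are skipped.
    """
    # Convert nullable integers to plain floats; pd.NA becomes np.nan.
    values = df.to_numpy(dtype="float64", na_value=np.nan)
    # Per-row mean, kept 2-D so it broadcasts against `values`.
    row_means = np.nanmean(values, axis=1, keepdims=True)
    # Mean of the absolute deviations, again ignoring missing entries.
    deviations = np.abs(values - row_means)
    return pd.Series(np.nanmean(deviations, axis=1), index=df.index)


df = pd.DataFrame(
    {
        "a": pd.array([1, 2, 3], dtype="Int64"),
        "b": pd.array([3, 2, pd.NA], dtype="Int64"),
        "c": pd.array([5, 2, 9], dtype="Int64"),
    }
)
result = row_mad(df)
```

Because the arithmetic happens on a plain `float64` array, the `IntegerArray` check that raises `NotImplementedError` is never reached.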
@jorisvandenbossche jorisvandenbossche added this to the Contributions Welcome milestone Mar 26, 2020
@jreback
Contributor

jreback commented Mar 26, 2020

thought we deprecated .mad

@jorisvandenbossche
Member Author

You mentioned that recently here as well: #31742 (comment), but I am not aware of any issues/discussion about that

@jreback
Contributor

jreback commented Mar 26, 2020

#11787

@jbrockmendel
Member

This is now working on master, though #33600 is breaking it for the axis=0 case.

@jbrockmendel jbrockmendel added the Reduction Operations sum, mean, min, max, etc. label Sep 21, 2020
@notwopr

notwopr commented Dec 23, 2020

Hi. I've used pandas 1.1.3 through 1.1.5, and all of these versions show the same problem: pandas.DataFrame.mad(axis=1) produces NaNs on perfectly fine data. My dataframe has several columns of floats. When my script creates a new column using any other reduction, such as .std(axis=1), .max(axis=1), .min(axis=1), .mean(axis=1), or .median(axis=1), the results are correct.
But when it calculates .mad(axis=1), the resulting column is blank. No error message is produced.

@jbrockmendel jbrockmendel added the NA - MaskedArrays Related to pd.NA and nullable extension arrays label May 18, 2021
@jbrockmendel
Member

This is working on master, could use tests.
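A regression test for this could look roughly like the sketch below. It is not the test from the fixing PR. It repeats the two steps from the `axis=1` branch shown in the traceback (subtract the row means along `axis=0`, then average the absolute values) on an `Int64` frame and compares against plain NumPy. The test name and data are made up for illustration, and the sketch does not call `.mad` itself.

```python
import numpy as np
import pandas as pd


def test_row_demean_nullable_int():
    # Illustrative data; any small integer matrix would do.
    raw = np.array([[1, 3, 5], [2, 2, 2], [0, 4, 8]])
    df = pd.DataFrame(raw).astype("Int64")

    # Same two steps as the axis=1 branch in the traceback.
    demeaned = df.sub(df.mean(axis=1), axis=0)
    result = demeaned.abs().mean(axis=1)

    # Reference computed on the plain integer array.
    expected = np.abs(raw - raw.mean(axis=1, keepdims=True)).mean(axis=1)
    np.testing.assert_allclose(result.astype("float64").to_numpy(), expected)


test_row_demean_nullable_int()
```

The result is cast to `float64` before comparing because the dtype returned by the reduction (nullable or plain float) is not something the sketch wants to depend on.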

@jbrockmendel jbrockmendel added the Needs Tests Unit test(s) needed to prevent regressions label Jul 1, 2021
@mroeschke mroeschke modified the milestones: Contributions Welcome, 1.4 Aug 23, 2021
5 participants