Arithmetic with numpy arrays: inconsistency depending on mask #6734

Closed
rcomer opened this issue Oct 14, 2020 · 4 comments

@rcomer
Contributor

rcomer commented Oct 14, 2020

What happened:
If I multiply a plain numpy array by a dask array, I get a dask array. But if I multiply a numpy masked array by a dask array, I get a numpy masked array.

What you expected to happen:
I would expect to either get a dask array in both cases, or the type of my first operand in both cases.

Minimal Complete Verifiable Example:

import numpy as np
import dask.array as da

plain_numpy_array = np.arange(5)
masked_numpy_array = np.ma.array(np.arange(5), mask=[0, 1, 1, 0, 0])
dask_array = da.from_array(plain_numpy_array)

print(type(plain_numpy_array * dask_array))
print(type(masked_numpy_array * dask_array))

output:

<class 'dask.array.core.Array'>
<class 'numpy.ma.core.MaskedArray'>

Anything else we need to know?:
This relates to #4441, and I note the advice at #4441 (comment) to explicitly convert the numpy arrays to dask arrays before the arithmetic operation. I am attempting to do this for Iris (SciTools/iris#3790) but, while teasing out the details of that, we noticed this inconsistency between plain numpy arrays and masked numpy arrays. I don't think the inconsistency was noted at #4441, and it possibly throws a new light on it?
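The workaround mentioned above (converting the NumPy operand to a dask array before the arithmetic) might look roughly like the following. This is a minimal sketch, assuming `da.from_array` keeps the mask when given a `numpy.ma.MaskedArray`; the variable names `masked_dask_array`, `product`, and `computed` are illustrative and not from the original report.

```python
import numpy as np
import dask.array as da

masked_numpy_array = np.ma.array(np.arange(5), mask=[0, 1, 1, 0, 0])
dask_array = da.from_array(np.arange(5))

# Wrap the masked array in a dask array first, so that both operands
# are dask arrays and dask's own __mul__ handles the operation.
masked_dask_array = da.from_array(masked_numpy_array)
product = masked_dask_array * dask_array

print(type(product))  # lazy result: a dask array

# Computing the result should give back a masked array with the mask intact.
computed = product.compute()
print(type(computed))
print(np.ma.getmaskarray(computed))
```

With both operands already dask arrays, the result type no longer depends on whether the left-hand NumPy array was masked.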

Environment:

  • Dask version: 2.26.0
  • Python version: 3.7.8
  • Operating System: RHEL7.8
  • Install method (conda, pip, source): miniconda
@jthielen
Contributor

jthielen commented Oct 14, 2020

This appears to be an instance of numpy/numpy#15200, now that Dask Arrays more closely follow a NEP-13/18-style type hierarchy after #6393. If that is the case, then I believe this is entirely an upstream issue in NumPy. To check, what result do you get when you reverse the order of multiplication?
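To illustrate the dispatch behaviour being described, here is a rough sketch that uses only NumPy. `DuckArray` is a hypothetical stand-in written for this example: it is neither dask's implementation nor the `WrappedArray` class from numpy/numpy#15200. It only assumes the NEP-13 `__array_ufunc__` protocol.

```python
import numpy as np


class DuckArray:
    """Hypothetical minimal wrapper that opts in to NEP-13 ufunc dispatch."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array__(self, dtype=None, copy=None):
        # Lets NumPy coerce the wrapper to a plain ndarray when asked to.
        return np.asarray(self.data, dtype=dtype)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        if method != "__call__":
            return NotImplemented
        # Unwrap DuckArray inputs, apply the ufunc, and re-wrap the result.
        unwrapped = [x.data if isinstance(x, DuckArray) else x for x in inputs]
        return DuckArray(ufunc(*unwrapped, **kwargs))

    def __mul__(self, other):
        return np.multiply(self, other)

    def __rmul__(self, other):
        return np.multiply(other, self)


plain = np.arange(3)
masked = np.ma.array(np.arange(3), mask=[0, 1, 0])
duck = DuckArray([1, 2, 3])

# ndarray.__mul__ calls np.multiply, which hands control to
# DuckArray.__array_ufunc__, so the wrapper type wins.
print(type(plain * duck))

# MaskedArray.__mul__ calls np.ma.multiply instead, which coerces the
# wrapper to an ndarray. The thread reports a MaskedArray for dask in this
# position; the same is expected for this stand-in.
print(type(masked * duck))

# With the wrapper on the left, its own __mul__ runs first.
print(type(duck * masked))
```

If the middle case prints a masked-array type while the other two print `DuckArray`, that is the same asymmetry as in the report, which would point at `numpy.ma` rather than at dask.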

@rcomer
Contributor Author

rcomer commented Oct 15, 2020

Thanks @jthielen. If the order of multiplication is reversed, we do indeed get dask arrays for both cases. I've also used your WrappedArray example from numpy/numpy#15200 to reproduce the problem that the dask/wrapped array's mask is ignored (which is what I was originally trying to fix in Iris).

# Masks in different places.
wm = WrappedArray(np.ma.masked_array([1, 3, 5], mask=[False, False, True]), test=2)
m = np.ma.masked_array([2, 0, 1], mask=[False, True, False])

print(wm * m)
print(m * wm)

output:

WrappedArray(
[2 -- --]
{'test': 2}
)

[2 -- 5]

So it looks like I should close this and subscribe to numpy/numpy#15200 instead?

@TomAugspurger
Member

Thanks for following up. Are we agreed that numpy/numpy#15200 is the root culprit and this can be closed?

@rcomer
Contributor Author

rcomer commented Nov 25, 2020

Closing this on the assumption that the problem is with numpy.ma. Thanks for the advice!

rcomer closed this as completed Nov 25, 2020