apply_ufunc should preemptively broadcast #3032
Thanks for the issue & code sample @OriolAbril. IIUC, `apply_ufunc` only inserts size-1 axes for the dimensions each input is missing, rather than broadcasting every input to the full common shape. For example:

```python
In [8]: import xarray as xr
   ...: import numpy as np
   ...:
   ...: a = xr.DataArray(data=np.random.normal(size=(7, 3)), dims=["dim1", "dim2"])
   ...: c = xr.DataArray(data=np.random.normal(size=(7, 6)), dims=["dim1", "dim4"])  # <- change here
   ...:
   ...: def func(x, y):
   ...:     print(x.shape)
   ...:     print(y.shape)
   ...:     return x
   ...:

In [9]: xr.apply_ufunc(func, a, c)
(7, 3, 1)
(7, 1, 6)
```

Another option would be for your original example to put lengths of 1 in all axes, rather than only 'forward filling'.

I think it operates without that step because functions 'in the wild' generally will handle that themselves, but that's a guess and needs someone who knows this better to weigh in.
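What the comment calls 'forward filling' can be sketched in plain NumPy (a rough sketch of the observed behaviour, not xarray's internal code; variable names are illustrative): each input gets a size-1 axis inserted for every dimension it lacks, but the axes are not expanded to the full common shape.

```python
import numpy as np

# Two arrays sharing only their first dimension, as in the example above.
a = np.random.normal(size=(7, 3))  # dims ("dim1", "dim2")
c = np.random.normal(size=(7, 6))  # dims ("dim1", "dim4")

# Alignment over the union of dims ("dim1", "dim2", "dim4"):
# each array gets a size-1 axis for every dimension it lacks, nothing more.
a_aligned = a[:, :, np.newaxis]  # shape (7, 3, 1)
c_aligned = c[:, np.newaxis, :]  # shape (7, 1, 6)

# The fully broadcast shape is only realised inside an actual ufunc call.
full_shape = np.broadcast(a_aligned, c_aligned).shape  # (7, 3, 6)
```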
For what it's worth, I agree that this behavior is a little surprising. We should probably make `apply_ufunc` broadcast its inputs fully.
To confirm, so that we have something like this?

```python
xr.apply_ufunc(func, a, c)
# Out
# (7, 3, 5, 6)
# (7, 3, 5, 6)
```
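As an illustration of the proposed semantics, NumPy's `broadcast_arrays` expands every input to the common shape (a two-array sketch, so the result is three-dimensional rather than the four-dimensional shape above):

```python
import numpy as np

# Inputs already padded with size-1 axes, as apply_ufunc currently produces them.
a_aligned = np.random.normal(size=(7, 3, 1))
c_aligned = np.random.normal(size=(7, 1, 6))

# Full broadcasting: both arguments end up with the same shape
# before the wrapped function ever sees them.
a_full, c_full = np.broadcast_arrays(a_aligned, c_aligned)
```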
Yes, exactly.
I'm trying to think whether there would be any performance cost there - i.e. are there any arrays where preemptive broadcasting would be both expensive and unnecessary?
Even if there were a performance cost (compared to the current behaviour), it could easily be avoided by using all dims as input_core_dims, couldn't it? IIUC, all dims are broadcast unless they are listed as input core dims, so broadcasting could still be avoided without problem.
With NumPy arrays at least, there is no cost for broadcasting, because it can always be done with views. But even for other array types, inserting size-1 dimensions in the correct location should be basically free, and would be more helpful than what we currently do.
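The 'views, not copies' point can be checked directly: NumPy implements broadcasting with zero-length strides, so no data is duplicated.

```python
import numpy as np

a = np.random.normal(size=(7, 3))

# Broadcast to (7, 3, 6) without copying: the repeated axis gets stride 0,
# and the result is a view on the original data, not a new allocation.
b = np.broadcast_to(a[:, :, np.newaxis], (7, 3, 6))
```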
@shoyer thanks for the clarity. @OriolAbril would you mind if we changed this issue's title to "apply_ufunc should preemptively broadcast"?
@max-sixty Not at all, whatever is best. I actually opened the issue without being 100% sure it was one.
Code Sample

I am having some trouble understanding `apply_ufunc` broadcasting rules. As I had some trouble understanding the docs, I am not 100% sure it is a bug, but I am quite sure. I will try to explain why with the following really simple example. The function defined always raises an error when trying to call `apply_ufunc`, but this is intended, as the shapes have already been printed by then, and this keeps the example as simple as possible.

Problem description

Here, `a` has been kind of broadcasted, but I would expect the shapes of `a` and `c` to be the same as when calling `xr.broadcast`, as there are no input core dims, so all dimensions are broadcast. However:

Using different input core dims does not get rid of the problem; instead I believe it shows some more issues:
Is this current behaviour what should be expected?
Output of `xr.show_versions()`
xarray: 0.12.1
pandas: 0.24.2
numpy: 1.16.4
scipy: 1.3.0
netCDF4: 1.5.1.2
pydap: None
h5netcdf: None
h5py: 2.9.0
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudonetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.1.0
cartopy: None
seaborn: None
setuptools: 41.0.0
pip: 19.1.1
conda: None
pytest: 4.5.0
IPython: 7.5.0
sphinx: 2.0.1