
Slow performance of rolling.reduce #1831

Closed
fujiisoup opened this issue Jan 15, 2018 · 4 comments
@fujiisoup
Member

Code Sample, a copy-pastable example if possible

In [1]: import numpy as np
   ...: import xarray as xr
   ...: 
   ...: da = xr.DataArray(np.random.randn(1000, 100), dims=['x', 'y'],
   ...:                   coords={'x': np.arange(1000)})
   ...: 

In [2]: %%timeit
   ...: da.rolling(x=10).reduce(np.sum)
   ...: 
2.04 s ± 8.25 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Problem description

In DataArray.rolling, we index with the .isel method for every window, constructing a huge number of xr.DataArray instances. This is very inefficient.

Of course, we can use bottleneck methods if available, but bottleneck provides only a limited set of functions.
(This also limits possible extensions of rolling, such as ND-rolling (#819), window types (#1142), and strides (#819).)

I am wondering if we could skip the sanity checks in our DataArray.isel -> Variable.isel indexing path.
Or could we directly construct a single large DataArray instead of many small DataArrays?
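For illustration, the per-window pattern can be mimicked in plain NumPy. This is only a sketch of the access pattern, not xarray's actual implementation (in xarray the overhead is larger still, because each step also constructs a full DataArray):

```python
import numpy as np

a = np.random.randn(1000, 100)
window = 10

# One slicing operation per window position -- analogous to calling
# .isel once per window. The Python-level loop dominates the runtime.
out = np.full(a.shape, np.nan)
for i in range(window - 1, a.shape[0]):
    out[i] = a[i - window + 1:i + 1].sum(axis=0)
```

Even in pure NumPy this loops 991 times; wrapping each slice in a DataArray multiplies the per-iteration cost.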


jhamman commented Jan 15, 2018

@fujiisoup - I think this is a great idea. As you've noted, the reduce implementation is about as naive as it comes (#668). While the __iter__ method should continue to return DataArray objects, the reduce method can be optimized.

@fujiisoup
Member Author

I'm thinking of using fancy indexing to speed up rolling.reduce,
by temporarily constructing a single large array with da.ndim + 1 dimensions.

E.g. for the following DataArray

In [2]: da
Out [2]: 
<xarray.DataArray (x: 4)>
array([0, 1, 2, 3])
Dimensions without coordinates: x

a larger array with an additional dimension (x_rolling) would be created:

<xarray.DataArray (x: 4, x_rolling: 3)>
array([[ nan,   0.,   1.],
       [  0.,   1.,   2.],
       [  1.,   2.,   3.],
       [  2.,   3.,  nan]])
Dimensions without coordinates: x, x_rolling

then reduce along x_rolling dimension.
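Concretely, a sketch in plain NumPy of how that windowed array could be built with a single fancy-indexing call (assuming a centered window; `pad`, `idx`, and `rolled` are illustrative names, not proposed API):

```python
import numpy as np

a = np.arange(4, dtype=float)
window = 3

# Pad with NaN so out-of-range positions show up as NaN, then gather
# every window at once with one fancy-indexing operation.
pad = window // 2
padded = np.concatenate([np.full(pad, np.nan), a, np.full(pad, np.nan)])
idx = np.arange(len(a))[:, None] + np.arange(window)[None, :]
rolled = padded[idx]          # shape (x, x_rolling) == (4, 3)
result = np.nansum(rolled, axis=1)  # reduce along x_rolling
```

`rolled` matches the (x: 4, x_rolling: 3) array shown above, and the reduction is a single vectorized call instead of a Python loop.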

The advantages would be

  • Indexing occurs only once.
  • Reducing operation can be easily vectorized.

The disadvantages would be

  • It constructs a huge array, (window_size - 1) * da.size elements larger than the original, consuming a lot of memory.

I think this disadvantage could be solved if we used np.lib.stride_tricks.as_strided for the np.ndarray backend case.
(how about with dask?)
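For the plain np.ndarray case, a sketch of the stride-tricks idea (no NaN padding here, so only the complete windows are produced; `windows` is an illustrative name):

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.arange(10, dtype=float)
window = 3
n = a.shape[0]

# A read-only *view* with shape (n - window + 1, window): no data is
# copied, so memory use does not grow with the window size.
windows = as_strided(a, shape=(n - window + 1, window),
                     strides=(a.strides[0], a.strides[0]),
                     writeable=False)
result = windows.sum(axis=1)
```

Note that as_strided is easy to misuse (wrong strides silently read out-of-bounds memory); NumPy 1.20+ provides np.lib.stride_tricks.sliding_window_view as a safer wrapper around the same idea.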


shoyer commented Jan 16, 2018

Yes, I think the stride tricks version would be a significant improvement. See this numpy PR for discussion/examples: numpy/numpy#31


fujiisoup commented Jan 16, 2018

Thanks for the information.
I will look into the issue.

I think the sliding-and-stack method itself would also be handy.
I will start from this.
