ENH/API: rolling_apply to pass frames to the rolled function (rather than ndarrays) #5071

jreback · 2013-10-01T17:32:57Z

Returning a Series: http://stackoverflow.com/questions/19121854/using-rolling-apply-on-a-dataframe-object

Returning a Scalar: http://stackoverflow.com/questions/21040766/python-pandas-rolling-apply-two-column-input-into-function/21045831#21045831

jseabold · 2014-01-09T00:15:46Z

+1

I was just trying to do something similar. It would be nice if rolling_apply and expanding_apply had an option to work over the whole DataFrame. It doesn't even have to pass frames, but rather just roll over the whole 0 axis instead of one series at a time.
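A minimal sketch of what rolling over the whole 0 axis could look like when done by hand in Python. `rolling_frame_apply` is a hypothetical helper, not a pandas function; it assumes the rolled function returns a scalar, and it makes no attempt at performance:

```python
import numpy as np
import pandas as pd


def rolling_frame_apply(df, window, func):
    """Hypothetical helper: call func on each trailing window of rows.

    func receives a DataFrame slice (all columns) and must return a scalar.
    Positions without a full window are filled with NaN.
    """
    out = []
    for end in range(1, len(df) + 1):
        if end < window:
            out.append(np.nan)  # not enough rows yet
        else:
            out.append(func(df.iloc[end - window:end]))
    return pd.Series(out, index=df.index, dtype="float64")


df = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0], "b": [10.0, 20.0, 30.0, 40.0]})

# A function that needs both columns at once: mean spread within the window.
result = rolling_frame_apply(df, window=2, func=lambda w: (w["b"] - w["a"]).mean())
# index 0 is NaN, then 13.5, 22.5, 31.5
```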

ghost · 2014-01-09T00:55:14Z

That sounds equivalent to the split-apply(-combine) approach of groupby, only pandas doesn't
currently provide that sort of split function.

related #4059

twiecki · 2015-06-22T10:45:16Z

Just ran into the same issue.

frankz-ai · 2015-06-27T19:50:00Z

same issue here

max-sixty · 2016-04-22T01:08:25Z

@jreback What's the best way to do this?

If I try and change the _apply method on _Rolling to take pandas objects rather than numpy arrays, a few of the standard functions fail (e.g. _zsqrt):

...
return _zsqrt(algos.roll_var(arg, window, minp, ddof))
TypeError: Argument 'input' has incorrect type (expected numpy.ndarray, got Series)

Could this be done in roll_generic? Or with an additional path, other than the standard _apply, for user-supplied functions? Neither seems that compelling.

jreback · 2016-04-22T01:23:19Z

So just to have an example

In [32]: df = DataFrame({'A' : np.random.randn(5), 'B' : np.random.randint(0,10,size=5)})

In [33]: def f(x):
    print type(x)
    return x.sum()
   ....: 

In [34]: df.rolling(2).apply(f)
<type 'numpy.ndarray'>
<type 'numpy.ndarray'>
<type 'numpy.ndarray'>
<type 'numpy.ndarray'>
<type 'numpy.ndarray'>
<type 'numpy.ndarray'>
<type 'numpy.ndarray'>
<type 'numpy.ndarray'>
Out[34]: 
          A     B
0       NaN   NaN
1 -0.414646  15.0
2  1.007150   8.0
3  1.822979   2.0
4  0.884894   4.0

The issue is that you need to pass a constructed object to algos.roll_generic (or maybe a new function) which does the windowing.

here
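For illustration only, a pure-Python stand-in for that windowing step. `roll_generic_obj` is a hypothetical name, not the Cython `roll_generic`; the sketch assumes a Series input, a trailing window, and that `minp` counts non-NaN observations:

```python
import numpy as np
import pandas as pd


def roll_generic_obj(obj, win, minp, func):
    """Hypothetical stand-in: pass each window to func as a Series."""
    out = np.full(len(obj), np.nan)
    for i in range(len(obj)):
        # Trailing window ending at position i, kept as a pandas object.
        window = obj.iloc[max(0, i - win + 1): i + 1]
        if window.count() >= minp:
            out[i] = func(window)
    return pd.Series(out, index=obj.index)


seen = []


def f(x):
    seen.append(type(x).__name__)  # record what the function receives
    return x.sum()


series = pd.Series(range(10), dtype="float64")
result = roll_generic_obj(series, win=2, minp=2, func=f)
```

Here `f` only ever sees Series objects, so index-aware code inside it would work, at the cost of one Python-level slice per window.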

max-sixty · 2016-04-22T01:48:04Z

Is this do-able with roll_generic? It seems that requires an array:

In [28]: series=pd.Series(range(10),dtype='float64')

In [29]: roll_generic(series, win=2, minp=2, offset=0, func=lambda x: x.sum(), args=[], kwargs={})
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-29-3ec0f9465dad> in <module>()
----> 1 roll_generic(series, win=2, minp=2, offset=0, func=lambda x: x.sum(), args=[], kwargs={})

TypeError: Argument 'input' has incorrect type (expected numpy.ndarray, got Series)

Does that mean we need a parallel function which operates on Series?

I could imagine having a function that generated the groups; then it would actually be a groupby. But I haven't thought it through enough, and performance may be an issue.
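One way to picture the "generate the groups" idea: materialise every window, stack the windows under a window label, and hand the result to groupby. `windows_as_groups` is a hypothetical helper, and copying each row once per window it belongs to is exactly the performance concern:

```python
import pandas as pd


def windows_as_groups(df, window):
    """Hypothetical helper: stack all full windows under a window label.

    The result has a two-level index: (label of the window's last row,
    original row label). Each row is copied once per window it falls in.
    """
    pieces = {}
    for end in range(window, len(df) + 1):
        label = df.index[end - 1]
        pieces[label] = df.iloc[end - window:end]
    return pd.concat(pieces)


df = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0], "b": [10.0, 20.0, 30.0, 40.0]})
stacked = windows_as_groups(df, window=2)

# Each group is one window, passed to the function as a DataFrame.
result = stacked.groupby(level=0).apply(lambda g: (g["a"] * g["b"]).sum())
```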

jreback · 2016-04-22T01:57:57Z

No, you have to change roll_generic to take an object.

Doing it with GroupBy is a whole separate idea. I may do that, but it's orthogonal (and the reason for it is different from this).

max-sixty · 2016-04-22T02:07:43Z

OK, I haven't worked with Cython before, and not sure how it handles non-numpy arrays, but I can have a go. Probably won't have immediate results.

citynorman · 2016-08-06T00:02:39Z

Almost 3 years and it's still an issue :'(

import pandas as pd
import numpy as np

def distance_sum(df):
    print df
    df['norm1']=df.ix[:,0]/df.ix[0,0]
    df['norm2']=df.ix[:,1]/df.ix[0,1]
    return np.sum(np.square(df['norm1']-df['norm2']))

df=pd.DataFrame({'a':np.array([1,2,3]),'b':np.array([10,20,30])})
df.rolling(center=False,window=2).apply(distance_sum)

AttributeError Traceback (most recent call last)
in ()
9
10 df=pd.DataFrame({'a':np.array([1,2,3]),'b':np.array([10,20,30])})
---> 11 df.rolling(center=False,window=2).apply(distance_sum)

/usr/local/lib/python2.7/dist-packages/pandas/core/generic.pyc in __getattr__(self, name)
2358 return self[name]
2359 raise AttributeError("'%s' object has no attribute '%s'" %
-> 2360 (type(self).__name__, name))
2361
2362 def __setattr__(self, name, value):

AttributeError: 'DataFrame' object has no attribute 'rolling'

OR

AttributeError Traceback (most recent call last)
in ()
14
15 t=pd.DataFrame({'a':a,'b':b})
---> 16 t.rolling(center=False,window=2).apply(test_distance_sum)

/usr/local/lib/python2.7/dist-packages/pandas/core/window.pyc in apply(self, func, args, kwargs)

/usr/local/lib/python2.7/dist-packages/pandas/core/window.pyc in _apply(self, func, name, window, center, check_minp, how, **kwargs)

/usr/local/lib/python2.7/dist-packages/numpy/lib/shape_base.pyc in apply_along_axis(func1d, axis, arr, *args, **kwargs)
89 outshape = asarray(arr.shape).take(indlist)
90 i.put(indlist, ind)
---> 91 res = func1d(arr[tuple(i.tolist())], *args, **kwargs)
92 # if res is a number, then we have a smaller output array
93 if isscalar(res):

/usr/local/lib/python2.7/dist-packages/pandas/core/window.pyc in calc(x)

/usr/local/lib/python2.7/dist-packages/pandas/core/window.pyc in f(arg, window, min_periods)

pandas/algos.pyx in pandas.algos.roll_generic (pandas/algos.c:51577)()

in test_distance_sum(df)
9 def test_distance_sum(df):
10 print df
---> 11 df['pxnorm1']=df.ix[:,0]/df.ix[0,0]
12 df['pxnorm2']=df.ix[:,1]/df.ix[0,1]
13 return np.mean(df)#np.sum(np.square(df['pxnorm1']-df['pxnorm2']))

AttributeError: 'numpy.ndarray' object has no attribute 'ix'

jreback mentioned this issue Feb 6, 2014

ENH: All args and kwargs to generic expanding/rolling apply. #6289

Closed

jreback modified the milestones: 0.15.0, 0.14.0 Mar 28, 2014

jreback mentioned this issue Oct 29, 2014

API/ENH: master issue for pd.rolling_apply #8659

Closed

14 tasks

jreback mentioned this issue Nov 10, 2014

rolling_apply and dataframes #8777

Closed

jreback added the Reshaping Concat, Merge/Join, Stack/Unstack, Explode label Feb 5, 2015

jreback modified the milestones: 0.17.0, 0.16.0 Feb 5, 2015

jreback added Difficulty Intermediate labels Apr 6, 2015

jreback mentioned this issue Apr 21, 2016

Functions applied on .expanding() receive ndarrays rather than pandas objects #12950

Closed

jreback modified the milestones: Interesting Issues, Next Major Release Sep 11, 2017

jreback modified the milestones: Interesting Issues, Next Major Release Nov 26, 2017

Seung-hyeon mentioned this issue Mar 9, 2018

rolling().apply is inconsistent with groupby().apply #20068

Closed

jreback added a commit to jreback/pandas that referenced this issue Apr 2, 2018

API: rolling.apply will pass Series to function

a55f486

closes pandas-dev#5071

jreback mentioned this issue Apr 2, 2018

API: rolling.apply will pass Series to function #20584

Merged

jreback added a commit to jreback/pandas that referenced this issue Apr 2, 2018

API: rolling.apply will pass Series to function

4594457

closes pandas-dev#5071

jreback added a commit to jreback/pandas that referenced this issue Apr 10, 2018

API: rolling.apply will pass Series to function

0a6a1be

closes pandas-dev#5071

jreback added a commit to jreback/pandas that referenced this issue Apr 12, 2018

API: rolling.apply will pass Series to function

f131bba

closes pandas-dev#5071

jreback added a commit to jreback/pandas that referenced this issue Apr 13, 2018

API: rolling.apply will pass Series to function

424e784

closes pandas-dev#5071

jreback added a commit to jreback/pandas that referenced this issue Apr 14, 2018

API: rolling.apply will pass Series to function

5e47143

closes pandas-dev#5071

jreback added a commit to jreback/pandas that referenced this issue Apr 15, 2018

API: rolling.apply will pass Series to function

5c5abe5

closes pandas-dev#5071

jreback added a commit to jreback/pandas that referenced this issue Apr 16, 2018

API: rolling.apply will pass Series to function

946233c

closes pandas-dev#5071

jreback closed this as completed in #20584 Apr 16, 2018

jreback modified the milestones: Next Major Release, 0.23.0 Apr 16, 2018

jreback added a commit that referenced this issue Apr 16, 2018

API: rolling.apply will pass Series to function (#20584)

4a34497

closes #5071
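A short sketch of the behaviour after this change, assuming pandas 0.23.0 or later, where `rolling().apply` accepts a `raw` keyword: `raw=False` passes each window to the function as a Series, and `raw=True` passes an ndarray. The function is still called one column at a time, so a function that needs several columns at once still needs a workaround.

```python
import pandas as pd

df = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [10.0, 20.0, 30.0]})

seen = []


def f(x):
    seen.append(type(x).__name__)  # record what each call receives
    return x.sum()


# raw=False: each window arrives as a Series (with its index).
as_series = df.rolling(2).apply(f, raw=False)
series_types = set(seen)

seen.clear()

# raw=True: each window arrives as a plain ndarray.
as_arrays = df.rolling(2).apply(f, raw=True)
array_types = set(seen)
```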

jreback mentioned this issue May 30, 2018

Rolling Data Frames Don't Include Index Values #21236

Closed
