Add Numba to rolling.apply #29

mroeschke · 2019-09-18T19:09:05Z

closes Use Numba to improve the performance difference between rolling.apply #25
tests added / passed
passes black pandas
passes git diff upstream/master -u -- "*.py" | flake8 --diff

mroeschke · 2019-09-18T19:17:27Z

Here's the performance comparison so far. Unfortunately, Cython is still faster for the ndarray case (raw=True): 32.3 ms on master versus 226 ms on this branch.

# This branch
In [1]: s = pd.Series(range(10000))

In [2]: f = lambda x: np.sum(x) + 5

# raw is unused; row is always an ndarray
In [4]: %timeit s.rolling(10).apply(f, raw=False)
219 ms ± 29.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [5]: %timeit s.rolling(10).apply(f, raw=True)
226 ms ± 26.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

# Master
In [1]: s = pd.Series(range(10000))

In [2]: f = lambda x: np.sum(x) + 5

# row passed as a Series
In [4]: %timeit s.rolling(10).apply(f, raw=False)
1.19 s ± 5.81 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

# row passed as a ndarray
In [5]: %timeit s.rolling(10).apply(f, raw=True)
32.3 ms ± 366 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

mroeschke · 2019-09-18T19:20:47Z

Additionally, apply currently supports passing *args and **kwargs through to the function for each row. When the user's function is njitted, passing kwargs is currently unsupported (numba/numba#2916), and binding those kwargs beforehand with functools.partial is also unsupported (numba/numba#4587).
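One way to sidestep the kwargs limitation, shown here only as a sketch and not as something this branch does, is to capture the extra values in a closure before jitting, so the compiled function takes only the window. The names `make_add_offset` and `offset` are hypothetical, and the `try`/`except` fallback is an assumption added so the snippet also runs where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for illustration: fall back to plain Python without Numba.
    def njit(func):
        return func


def make_add_offset(offset):
    """Hypothetical factory: bake ``offset`` into the jitted function."""

    @njit
    def add_offset(window):
        # ``offset`` is captured from the enclosing scope instead of being
        # passed as a keyword argument at call time.
        return np.sum(window) + offset

    return add_offset


add_five = make_add_offset(5)
value = add_five(np.arange(4.0))  # sum of 0, 1, 2, 3 plus the captured offset
```

The trade-off is that each distinct captured value produces a separately compiled function.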

mroeschke · 2019-09-19T06:29:27Z

I was able to solve the performance problem in aa9644c. The biggest cost was recompiling the njit rolling_apply function on every call. If we instead build the rolling apply function dynamically around the function the user passes, cache it, and reuse it on later calls, performance beats Cython.

In [1]: s = pd.Series(range(10000))

# r, a Rolling object, will cache the apply functions
In [2]: r = s.rolling(10)

In [3]: f = lambda x: np.sum(x) + 5

In [4]: %timeit r.apply(f, raw=False)
2.16 ms ± 204 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
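The caching idea described above can be sketched roughly as follows. This is a simplified illustration, not the code in aa9644c: `_APPLY_CACHE` and `make_rolling_apply` are hypothetical names, the cache here is a module-level dict rather than state on the Rolling object, and the `try`/`except` fallback is an assumption so the snippet also runs without Numba.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for illustration: fall back to plain Python without Numba.
    def njit(func):
        return func


_APPLY_CACHE = {}  # hypothetical cache: user function -> rolling loop


def make_rolling_apply(func):
    """Build the rolling loop for ``func`` once and reuse it afterwards."""
    if func in _APPLY_CACHE:
        return _APPLY_CACHE[func]

    jitted = njit(func)

    @njit
    def rolling_apply(values, window):
        out = np.empty(len(values))
        for i in range(len(values)):
            if i + 1 < window:
                out[i] = np.nan  # not enough observations yet
            else:
                out[i] = jitted(values[i + 1 - window : i + 1])
        return out

    _APPLY_CACHE[func] = rolling_apply
    return rolling_apply


def f(x):
    return np.sum(x) + 5


values = np.arange(6, dtype=np.float64)
result = make_rolling_apply(f)(values, 3)
```

Because the loop is looked up by the user function, a second `make_rolling_apply(f)` call returns the same object, so any compilation cost is paid only on first use.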

mroeschke · 2019-09-25T07:23:05Z

Here are the ASV benchmarks:

       before           after         ratio
     [68a663be]       [3c49d034]
     <master>         <feature/rolling_apply_numba>
+         289±2ms        350±0.9ms     1.21  rolling.Apply.time_rolling('Series', 1000, 'float', <function sum at 0x105754620>, True)
+         290±7ms          350±1ms     1.20  rolling.Apply.time_rolling('Series', 1000, 'int', <function sum at 0x105754620>, True)
+         297±8ms          353±2ms     1.19  rolling.Apply.time_rolling('DataFrame', 1000, 'int', <function sum at 0x105754620>, True)
-         250±3ms        215±0.6ms     0.86  rolling.Apply.time_rolling('Series', 10, 'float', <function sum at 0x105754620>, True)
-         255±1ms        215±0.8ms     0.84  rolling.Apply.time_rolling('Series', 10, 'int', <function sum at 0x105754620>, True)
-         308±5ms        246±0.9ms     0.80  rolling.Apply.time_rolling('Series', 10, 'float', <function Apply.<lambda> at 0x11b8c6730>, True)
-        310±20ms          241±4ms     0.78  rolling.Apply.time_rolling('DataFrame', 10, 'float', <function Apply.<lambda> at 0x11b8c6730>, True)
-        340±40ms        245±0.6ms     0.72  rolling.Apply.time_rolling('Series', 10, 'int', <function Apply.<lambda> at 0x11b8c6730>, True)
-      13.9±0.06s          380±1ms     0.03  rolling.Apply.time_rolling('Series', 1000, 'int', <function Apply.<lambda> at 0x11b8c6730>, False)
-       14.0±0.1s        380±0.8ms     0.03  rolling.Apply.time_rolling('Series', 1000, 'float', <function Apply.<lambda> at 0x11b8c6730>, False)
-         14.1±0s          380±1ms     0.03  rolling.Apply.time_rolling('DataFrame', 1000, 'float', <function Apply.<lambda> at 0x11b8c6730>, False)
-      13.7±0.01s        349±0.8ms     0.03  rolling.Apply.time_rolling('Series', 1000, 'int', <function sum at 0x105754620>, False)
-      13.9±0.06s        350±0.6ms     0.03  rolling.Apply.time_rolling('Series', 1000, 'float', <function sum at 0x105754620>, False)
-      13.7±0.07s        245±0.8ms     0.02  rolling.Apply.time_rolling('Series', 10, 'float', <function Apply.<lambda> at 0x11b8c6730>, False)
-       14.0±0.1s        246±0.5ms     0.02  rolling.Apply.time_rolling('Series', 10, 'int', <function Apply.<lambda> at 0x11b8c6730>, False)
-      13.5±0.01s        215±0.3ms     0.02  rolling.Apply.time_rolling('Series', 10, 'float', <function sum at 0x105754620>, False)

The raw=False benchmarks are partially misleading: on master, Cython passes pandas objects to the function, while Numba always operates on NumPy arrays (Numba cannot operate in nopython mode with pandas objects).
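To make the difference concrete, here is a small check against the released pandas (Cython) implementation that records what type the function receives under each setting. The helper `make_recorder` and the list names are hypothetical and exist only for this illustration.

```python
import numpy as np
import pandas as pd


def make_recorder(seen):
    """Hypothetical helper: return a function that logs its input type."""

    def f(x):
        seen.append(type(x).__name__)
        return np.sum(x) + 5

    return f


s = pd.Series(range(5), dtype="float64")

seen_not_raw = []
s.rolling(2).apply(make_recorder(seen_not_raw), raw=False)

seen_raw = []
s.rolling(2).apply(make_recorder(seen_raw), raw=True)
```

With raw=False each window is wrapped in a Series before the call, which is the overhead the master timings include and the Numba path never pays.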

mroeschke · 2019-09-25T07:43:22Z

Additional notes:

I had to xfail a couple of tests. To run rolling.apply in nopython mode for maximum performance, apply cannot accept arbitrary functions.
The function passed to apply cannot accept *args and **kwargs because that is unsupported in Numba.

WIP: Add Numba to rolling.apply

5f476d9

Matt Roeschke added 2 commits September 18, 2019 13:40

Merge branch 'feature/generalized_window_operations' into feature/rol…

9b9ea7a

…ling_apply_numba

Cache apply function

aa9644c

Matt Roeschke added 7 commits September 22, 2019 01:35

xfail tests due to numba limitations, modify internals

bd9f9c3

Add floor support and xfail more tests

2096c6b

Merge branch 'feature/generalized_window_operations' into feature/rol…

e38c730

…ling_apply_numba

Remove unused code

2cbbf22

Merge branch 'feature/generalized_window_operations' into feature/rol…

a616ecc

…ling_apply_numba

lint and new passed tests

ed1fbe5

Merge branch 'feature/generalized_window_operations' into feature/rol…

3c49d03

…ling_apply_numba

Add stricter check, remove sum benchmark since it's unsupported in numba

1bcc83e

Merge branch 'feature/generalized_window_operations' into feature/rol…

e955d47

…ling_apply_numba

mroeschke changed the title ~~WIP: Add Numba to rolling.apply~~ Add Numba to rolling.apply Sep 29, 2019

mroeschke merged commit d34b96c into feature/generalized_window_operations Sep 29, 2019

mroeschke deleted the feature/rolling_apply_numba branch September 29, 2019 02:51

This was referenced Sep 29, 2019

Address xfail tests in rolling.apply #30

Closed

Use parallel=True in rolling.apply #32

Merged

mroeschke mentioned this pull request Dec 11, 2019

ENH: Add numba engine for rolling apply pandas-dev/pandas#30151

Merged

5 tasks
