PERF: Improve performance in rolling.mean(engine="numba") #43612

mroeschke · 2021-09-16T21:50:14Z

tests added / passed
Ensure all linting tests pass, see here for how to run them
whatsnew entry

This also starts to add a shared aggregation function (mean) that can be shared between rolling/groupby/DataFrame when using the numba engine.

import numpy as np
import pandas as pd

df = pd.DataFrame(np.ones((10000, 1000)))
roll = df.rolling(10)
# Warm-up call: compiles the numba kernel so %timeit measures only execution
roll.mean(engine="numba", engine_kwargs={"nopython": True, "nogil": True, "parallel": True})
%timeit roll.mean(engine="numba", engine_kwargs={"nopython": True, "nogil": True, "parallel": True})

260 ms ± 13.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) <- PR
431 ms ± 9.21 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) <- master

pandas/core/numba_/executor.py

jreback · 2021-09-16T22:36:19Z

how does this compare to the cython mean?

mroeschke · 2021-09-16T22:46:41Z

how does this compare to the cython mean?

In [3]: %timeit roll.mean()  # cython
371 ms ± 15 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

jbrockmendel · 2021-09-17T15:57:30Z

is there an issue somewhere for discussing making numba required and just using this instead of the cython versions?

mroeschke · 2021-09-17T16:13:12Z

is there an issue somewhere for discussing making numba required and just using this instead of the cython versions?

Not an ongoing issue, but the last time it was discussed was in #28987. Looks like a lot of the discussion back then revolved around stability & maturity but those issues may not be as bad anymore.

jreback · 2021-09-17T16:49:01Z

is there an issue somewhere for discussing making numba required and just using this instead of the cython versions?

we ought to have this discussion as that would greatly simplify code generally. this is a good start though.

pandas/core/_numba/executor.py

jreback · 2021-09-17T19:27:31Z

pandas/core/_numba/kernels.py

+
+
+@numba.jit(nopython=True, nogil=True, parallel=False)
+def is_monotonic_increasing(bounds):


similar questions about typing (even if it doesn't actually help perf we should do it)

pandas/core/_numba/kernels.py

jreback

looks good

jreback · 2021-09-18T00:12:04Z

pandas/core/_numba/kernels/__init__.py

@@ -0,0 +1 @@
+from pandas.core._numba.kernels.mean_ import sliding_mean  # noqa:F401


alt can use __all__

pandas/core/window/rolling.py

pandas/core/_numba/executor.py

jreback

cc @pandas-dev/pandas-core if any comments

bashtage · 2021-09-20T15:21:00Z

pandas/core/_numba/kernels/mean_.py

+@numba.jit(nopython=True, nogil=True, parallel=False)
+def is_monotonic_increasing(bounds: np.ndarray) -> bool:
+    n = len(bounds)
+    if n == 1:


I don't understand this block. n==1 and n < 2 -> n==0?

Good point, I was able to simplify this block.
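
A minimal sketch of what a simplified check could look like, assuming the bounds are int64 with no NaNs (as noted elsewhere in this review). The function name is illustrative, and the `@numba.jit` decorator from the diff is left off so the snippet runs without numba; this is not necessarily the code that was merged.

```python
import numpy as np


def is_monotonic_increasing_sketch(bounds: np.ndarray) -> bool:
    """Return True if the int64 bounds never decrease."""
    n = len(bounds)
    # Zero or one element is trivially monotonic, so a single `n < 2`
    # check replaces separate `n == 1` and `n < 2` branches.
    if n < 2:
        return True
    prev = bounds[0]
    for i in range(1, n):
        cur = bounds[i]
        if cur < prev:
            return False
        prev = cur
    return True
```

Equal neighbours are allowed, which matters for window bounds because consecutive `start` values are often identical at the beginning of the data.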

bashtage · 2021-09-20T15:23:03Z

pandas/core/_numba/kernels/mean_.py

+    min_periods: int,
+) -> np.ndarray:
+    N = len(start)
+    nobs = 0.0


Could nobs ever overflow int64 or uint64?

I suppose with sufficient observations (nobs) this could overflow, but this value should be less than or equal to the window size, so the user would also have to provide a window size that overflows u/int64.

There is an actual bound for the maximum size of a NumPy array, np.intp. This is int64 on Windows64 and Linux, and I suspect on OSX. It seems that nobs should be an integer, which should be slightly faster than a float.

Agreed, changed nobs to an int.
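
To make the role of `nobs` concrete, here is a rough pure-Python sketch of a sliding mean that keeps an integer observation count. It assumes `start` and `end` are both monotonically increasing and describe half-open windows `values[start[i]:end[i]]`. The name `sliding_mean_sketch` is illustrative, and the sketch leaves out the numba decorator and any handling of non-monotonic bounds or floating-point compensation that a real kernel may need.

```python
import numpy as np


def sliding_mean_sketch(values, start, end, min_periods):
    """Rolling mean over the half-open windows values[start[i]:end[i]]."""
    n = len(start)
    nobs = 0     # integer count of non-NaN values currently in the window
    sum_x = 0.0  # running sum of those values
    output = np.empty(n, dtype=np.float64)

    # Before the first window nothing has been added yet.
    prev_s = prev_e = int(start[0]) if n else 0
    for i in range(n):
        s, e = int(start[i]), int(end[i])
        # Drop values that fell out of the left edge of the window.
        for j in range(prev_s, s):
            if not np.isnan(values[j]):
                nobs -= 1
                sum_x -= values[j]
        # Add values that entered at the right edge of the window.
        for j in range(prev_e, e):
            if not np.isnan(values[j]):
                nobs += 1
                sum_x += values[j]
        if nobs > 0 and nobs >= min_periods:
            output[i] = sum_x / nobs
        else:
            output[i] = np.nan
        prev_s, prev_e = s, e
    return output
```

Because `nobs` only ever counts elements inside one window, it is bounded by the window length, which in turn is bounded by the array length.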

jbrockmendel · 2021-09-20T16:00:10Z

pandas/core/_numba/executor.py

+        end: np.ndarray,
+        min_periods: int,
+    ):
+        result = np.empty((len(start), values.shape[1]))


can dtype be specified?

Good point. Added dtype.
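
As an illustration of the point, a sketch of a column looper that allocates its result with an explicit dtype. The names `column_looper_sketch` and `naive_mean_kernel` are hypothetical, the kernel is a deliberately simple stand-in, and the numba compilation used in the PR is omitted so the snippet runs with NumPy alone.

```python
import numpy as np


def naive_mean_kernel(column, start, end, min_periods):
    """Stand-in kernel: mean of column[start[i]:end[i]] for each window."""
    out = np.empty(len(start), dtype=np.float64)
    for i in range(len(start)):
        window = column[start[i]:end[i]]
        if len(window) >= max(min_periods, 1):
            out[i] = window.mean()
        else:
            out[i] = np.nan
    return out


def column_looper_sketch(values, start, end, min_periods, kernel):
    """Apply a 1D window kernel to each column of a 2D array."""
    # np.empty already defaults to float64; spelling the dtype out makes
    # the result type explicit instead of relying on that default.
    result = np.empty((len(start), values.shape[1]), dtype=np.float64)
    for i in range(values.shape[1]):
        result[:, i] = kernel(values[:, i], start, end, min_periods)
    return result
```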

bashtage · 2021-09-20T16:19:02Z

pandas/core/_numba/kernels/mean_.py

+def is_monotonic_increasing(bounds: np.ndarray) -> bool:
+    n = len(bounds)
+    if n == 1:
+        return bounds[0] == bounds[0]


Is this to stop single element NaN sequences from being monotonic increasing?

I believe so, yes. This snippet was taken from translating this function, but I was able to remove this condition since we know the inputs should be int64s with no NaNs.

pandas/pandas/_libs/algos.pyx

Line 792 in ae049ae

if n == 1:
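
A short illustration of why `bounds[0] == bounds[0]` acts as a NaN check for floats and is redundant for int64 bounds. The variable names are made up for this example.

```python
import numpy as np

nan_bounds = np.array([np.nan])             # float array holding a single NaN
int_bounds = np.array([7], dtype=np.int64)  # int64 arrays cannot hold NaN

# NaN compares unequal to itself, so the self-comparison is False here.
nan_is_monotonic = bool(nan_bounds[0] == nan_bounds[0])

# Any integer equals itself, so for int64 bounds the check is always True.
int_is_monotonic = bool(int_bounds[0] == int_bounds[0])
```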

jreback · 2021-09-23T14:49:39Z

thanks @mroeschke

mroeschke added 8 commits September 12, 2021 14:05

Add mean kernel

5e0b2cc

Add a shared executer function

4f2d298

Add stub of a numba apply function

8ada0be

Hook in numba apply to mean

f201dbd

Merge remote-tracking branch 'upstream/master' into kernels/mean_kernel

bf22d88

Fix caching tests, don't parallelize when ineffective

9ec1ef0

Add whatsnew and fix caching

8132622

add PR number

d9b39bd

mroeschke added the numba (numba-accelerated operations) and Performance (memory or execution speed performance) labels Sep 16, 2021

jreback reviewed Sep 16, 2021

jreback requested changes Sep 16, 2021

Make _numba private

9ddf423

Switch args

68524fd

Merge remote-tracking branch 'upstream/master' into kernels/mean_kernel

cc786dd

mroeschke added 2 commits September 17, 2021 11:28

Fix typing

2d19aa0

Tighten docstring

705bb8c

jreback requested changes Sep 17, 2021

jreback reviewed Sep 17, 2021

mroeschke added 5 commits September 17, 2021 14:46

Keep kernels in their own directory

3622700

Add some typing

0fa551a

Add Series test cases

675b5a1

Type column looper

a169423

Add name

2842199

mroeschke added this to the 1.4 milestone Sep 18, 2021

jreback requested changes Sep 18, 2021

mzeitlin11 reviewed Sep 18, 2021

mroeschke added 2 commits September 19, 2021 11:27

Add __all__

46f1b6b

Merge remote-tracking branch 'upstream/master' into kernels/mean_kernel

05341e3

jreback approved these changes Sep 20, 2021

bashtage reviewed Sep 20, 2021

jbrockmendel reviewed Sep 20, 2021

bashtage reviewed Sep 20, 2021

mroeschke added 8 commits September 20, 2021 10:04

Merge remote-tracking branch 'upstream/master' into kernels/mean_kernel

f4f59c8

Add dtype to empty result

c4bd78c

Change nobs to int

8801cf9

Simplify monitonically increasing for bounds

ce6aafb

Merge remote-tracking branch 'upstream/master' into kernels/mean_kernel

f0b38fd

Merge remote-tracking branch 'upstream/master' into kernels/mean_kernel

ca0653b

Merge remote-tracking branch 'upstream/master' into kernels/mean_kernel

09e1eb2

Remove unused kwargs and args

fd595c5

jreback merged commit ffbeda7 into pandas-dev:master Sep 23, 2021

mroeschke deleted the kernels/mean_kernel branch September 23, 2021 16:11
