Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Add numba engine for rolling apply #30151

Merged
merged 56 commits into from
Dec 27, 2019

Conversation

mroeschke
Copy link
Member

@mroeschke mroeschke commented Dec 9, 2019

@mroeschke mroeschke added Dependencies Required and optional dependencies Enhancement Window rolling, ewma, expanding labels Dec 9, 2019
@mroeschke mroeschke added this to the 1.0 milestone Dec 9, 2019
pandas/core/window/rolling.py Outdated Show resolved Hide resolved
@TomAugspurger
Copy link
Contributor

Random question: do you have benchmarks on this anywhere? It would be good to give guidance on when this is helpful (data sizes, type of user-defined function, ratio of time spent in windowing vs. the UDF, etc.).


*args, **kwargs
Arguments and keyword arguments to be passed into func.
args : tuple, default None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

args,kwargs should be at the end

raw=False,
args=None,
kwargs=None,
engine="cython",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

type new arguments

pandas/core/window/rolling.py Show resolved Hide resolved
@mroeschke
Copy link
Member Author

@TomAugspurger performance testing is lightly described in twosigma#29, but good point to have a dedicated section in the documentation describing when the numba engine would be advantageous



def _generate_numba_apply_func(
args: Tuple, kwargs: Dict, func: Callable, engine_kwargs: Optional[Dict]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add some comments here

pandas/tests/window/test_api.py Outdated Show resolved Hide resolved
@alimcmaster1
Copy link
Member

alimcmaster1 commented Dec 24, 2019

Code check failures are related to https://github.com/pandas-dev/pandas/pull/30370/files#r361222880
I can submit a PR to fix this

Check for non-standard imports
##[error]pandas/io/stata.py:12:from collections.abc import Iterator
##[error]pandas/io/sas/sas7bdat.py:16:from collections.abc import Iterator
##[error]pandas/io/sas/sas_xport.py:10:from collections.abc import Iterator
##[error]pandas/io/common.py:5:from collections.abc import Iterator
##[error]pandas/io/parsers.py:6:from collections.abc import Iterator
##[error]pandas/io/json/_json.py:2:from collections.abc import Iterator

PR here: #30455

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few more comments. I don't want to go overboard on annotations so those in particular aren't blockers, but generally if subtypes can be added for anything they would be preferred

from pandas.compat._optional import import_optional_dependency


def make_rolling_apply(func, args, nogil, parallel, nopython):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a blocker here since this is large enough, but would be nice to annotate this in a follow up

@@ -92,6 +93,7 @@ def __init__(
self.win_freq = None
self.axis = obj._get_axis_number(axis) if axis is not None else None
self.validate()
self._numba_func_cache: Dict = dict()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I missed this in previous review but similar comment as rest; not a block but subtypes make annotations much more insightful to the reader

@@ -442,6 +444,7 @@ def _apply(
floor: int = 1,
is_weighted: bool = False,
name: Optional[str] = None,
use_numba_cache: Optional[bool] = False,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can just be annotated as bool no? Or do we need to explicitly handle None?

@@ -342,3 +342,32 @@ def test_multiple_agg_funcs(self, func, window_size, expected_vals):
)

tm.assert_frame_equal(result, expected)


class TestEngine:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this class would be more logically placed in test_apply



def test_all_apply(engine, raw):
if engine == "numba":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just skip or xfail instead? I think more indicative to reader that it is an invalid combination rather than re-assigning

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could make engine_and_raw fixture using the pattern in #30456 if this is going to come up repeatedly

import pandas.util.testing as tm


@td.skip_if_no("numba", "0.46.0")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can just keep as skip_if_no("numba") since the minimum is already enforced by requirements files

We've added an ``engine`` keyword to :meth:`~Rolling.apply` that allows the user to execute the
routine using `Numba <https://numba.pydata.org/>`__ instead of Cython. Using the Numba engine
can yield significant performance gains if the apply function can operate on numpy arrays and
the data set is larger. For more details, see :ref:`rolling apply documentation <stats.rolling_apply>`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"the data set is larger" here is pretty vague. is the perf gain a function of the array size or more about the user-defined function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somewhat both, but more obvious with the data size. I can make this more specific.

@@ -428,6 +439,8 @@ Optional libraries below the lowest tested version may still work, but are not c
+-----------------+-----------------+---------+
| matplotlib | 2.2.2 | |
+-----------------+-----------------+---------+
| numba | 0.46.0 | |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add an X for this

@jbrockmendel
Copy link
Member

This looks really nice. way to go

@mroeschke
Copy link
Member Author

Thanks for the review everyone. Think I addressed all the comments.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. small questions.

cc @jorisvandenbossche @TomAugspurger @jbrockmendel

if any comments

pandas/core/window/common.py Outdated Show resolved Hide resolved
parallel: bool,
nopython: bool,
):
numba = import_optional_dependency("numba")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a doc-string that says what this function does (the parameters are already documented elsewhere, maybe just mention that)

pandas/core/window/numba_.py Outdated Show resolved Hide resolved
pandas/core/window/numba_.py Show resolved Hide resolved
pandas/core/window/rolling.py Show resolved Hide resolved
Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

pandas/core/window/common.py Outdated Show resolved Hide resolved
@mroeschke
Copy link
Member Author

Thanks for another round of review. Addressed all the comments.

@TomAugspurger
Copy link
Contributor

Looks good.

@jreback jreback merged commit a9fcdc5 into pandas-dev:master Dec 27, 2019
@jreback
Copy link
Contributor

jreback commented Dec 27, 2019

thanks @mroeschke very nice!

@mroeschke mroeschke deleted the numba_rolling_apply branch December 27, 2019 19:28
@jbrockmendel
Copy link
Member

@mroeschke heads up that im now seeing this in test output locally:

pandas/tests/window/test_numba.py::TestApply::test_cache[False-True-False-False]
  /Users/bmendel/Desktop/pd/dtmean/pandas/core/window/numba_.py:75: NumbaPerformanceWarning: 
  The keyword argument 'parallel=True' was specified but no transformation for parallel execution was possible.
  
  To find out why, try turning on parallel diagnostics, see http://numba.pydata.org/numba-doc/latest/user/parallel.html#diagnostics for help.
  
  File "pandas/core/window/numba_.py", line 59:
  
              def impl(window, *_args):
              ^
  
    result[i] = numba_func(window, *args)

@mroeschke
Copy link
Member Author

I'll make a followup to suppress this warning (no baring on what is being test).

AlexKirko pushed a commit to AlexKirko/pandas that referenced this pull request Dec 29, 2019
keechongtan added a commit to keechongtan/pandas that referenced this pull request Dec 29, 2019
…ndexing-1row-df

* upstream/master: (333 commits)
  CI: troubleshoot Web_and_Docs failing (pandas-dev#30534)
  WARN: Ignore NumbaPerformanceWarning in test suite (pandas-dev#30525)
  DEPR: camelCase in offsets, get_offset (pandas-dev#30340)
  PERF: implement scalar ops blockwise (pandas-dev#29853)
  DEPR: Remove Series.compress (pandas-dev#30514)
  ENH: Add numba engine for rolling apply (pandas-dev#30151)
  [ENH] Add to_markdown method (pandas-dev#30350)
  DEPR: Deprecate pandas.np module (pandas-dev#30386)
  ENH: Add ignore_index for df.drop_duplicates (pandas-dev#30405)
  BUG: The setting xrot=0 in DataFrame.hist() doesn't work with by and subplots pandas-dev#30288 (pandas-dev#30491)
  CI: Fix GBQ Tests (pandas-dev#30478)
  Bug groupby quantile listlike q and int columns (pandas-dev#30485)
  ENH: Add ignore_index for df.sort_values and series.sort_values (pandas-dev#30402)
  TYP: Typing hints in pandas/io/formats/{css,csvs}.py (pandas-dev#30398)
  BUG: raise on non-hashable Index name, closes pandas-dev#29069 (pandas-dev#30335)
  Replace "foo!r" to "repr(foo)" syntax pandas-dev#29886 (pandas-dev#30502)
  BUG: preserve EA dtype in transpose (pandas-dev#30091)
  BLD: add check to prevent tempita name error, clsoes pandas-dev#28836 (pandas-dev#30498)
  REF/TST: method-specific files for test_append (pandas-dev#30503)
  marked unused parameters (pandas-dev#30504)
  ...
numba_func = func
else:

@numba.generated_jit(nopython=nopython, nogil=nogil, parallel=parallel)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stuartarchibald sorry for the ping, but I see that generated_jit has been deprecated in numba 0.57. IIRC you helped me add this a while back and am lost on how to write this in terms of overload

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mroeschke no problem, I can try and help with this. I think it needs to look a bit like this (for reference, this is untested, I am just guessing from the context! Also, the pandas variant is obviously wrapped to close over some configuration which I've omitted, so consider this as the function body of make_rolling_apply. I've left comments inline to try and explain what's going on):

import types
import numpy as np
from numba.extending import overload, is_jitted
from numba import njit
import numba


# this provides a local definition to overload
def overload_target(window, *_args):
    # If JIT is disabled, this function will run, so write the implementation here!
    pass


nopython = True
nogil = True
parallel = False

# pretend this is an arg to `make_rolling_apply`
def func(window, *args):
    return window * 2 + args[0]


@overload(overload_target, jit_options={'nopython':nopython, 'nogil':nogil,
                                        'parallel':parallel})
def ol_overload_target(window, *_args):
    # This function "overloads" `overload_target`, whenever the Numba compiler
    # "sees" `overload_target` it will use this function.

    # Using `is_jitted` to avoid `isinstance` on
    # `numba.targets.registry.CPUDispatcher` as that may be considered an
    # internal Numba detail.
    if is_jitted(func):
        # it's already JIT compiled so just reference it
        overload_target_impl = func
    elif getattr(np, func.__name__, False) is func or isinstance(
        func, types.BuiltinFunctionType
    ):
        # it's a NumPy function or builtin so just reference it
        overload_target_impl = func
    else:
        # it's a Python function, so register it as JIT compilable and reference
        # that
        overload_target_impl = numba.jit(func, nopython=nopython, nogil=nogil)

    # This is the Numba implementation of the overload, it will just be JIT
    # compiled whenever the compiler "sees" a reference to "overload_target" in
    # code it is compiling.
    def impl(window, *_args):
        return overload_target_impl(window, *_args)

    return impl


# demo

@njit
def roll_apply(window, *_args):
    return overload_target(window, *_args)


print(roll_apply(np.arange(10.), 1.23))

@overload is basically saying to Numba "when you see this specific python function (the one in the first argument in the @overload decorator) use this implementation". The concept about there being a "typing" part that can be used to dispatch different variants based on type is exactly the same as in @generated_jit. The largest difference is what happens if the JIT compiler is turned off. In the case of @overload the python function being overloaded will run, i.e., the code just executes as would be expected in the interpreter. Whereas in the case of @generated_jit, because the pure python implementation and the Numba implementation are the same function, if you turn the JIT compiler off it will just break (the value returned when calling a @generated_jit function is a function implementing the Numba specialisation). Essentially, @generated_jit is like doing @overload but the function being decorated is also the function being overloaded.

Hope this helps?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reply! We had a PR recently that refactored this to use extending.register_jittable. Would that be a sufficient alternative? https://github.com/pandas-dev/pandas/pull/53455/files

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem! I just took a look at the patch above, I think it'd work but think it might lose some of the dispatch ability offered by generated_jit/overload. As I understand it, the original code would have let a NumPy function or a built-in be passed in as the "user function", whereas I think the register_jittable version requires a user defined Python function. It may be that the register_jittable version is a sufficient alternative for the need/use cases in practice, in which case, it seems appropriate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great thanks for the context! Yeah this function should expect a custom UDF so thanks for the confirmation

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glad to get this resolved, thanks for confirming too! It sounds like the replacement above is appropriate. If there are any more issues/queries feel free to open issues on the Numba issue tracker (or ping here!).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stuartarchibald I'm running into a rolling apply issue with pandas 2.1.1 and numba 0.58 that might be related. Discussion is here:
https://numba.discourse.group/t/pandas-source-of-old-style-error-capturing-warning/2169/8

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Dependencies Required and optional dependencies Enhancement Window rolling, ewma, expanding
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Numba as an optional dependency for rolling.apply for pandas 1.0
10 participants