PERF: numexpr doesn't support floordiv, so don't try #40727

Merged: jreback merged 1 commit into pandas-dev:master on Apr 1, 2021

jorisvandenbossche
Member

@jorisvandenbossche jorisvandenbossche commented Apr 1, 2021

See the overview of supported operators: https://numexpr.readthedocs.io/projects/NumExpr3/en/latest/user_guide.html#supported-operators. By trying to use numexpr for floordiv anyway, we were falling back to the (slower) _masked_arith_op.

Discovered by investigating the arithmetic benchmarks at #39146 (comment)

For example, using the arithmetic.IntFrameWithScalar.time_frame_op_with_scalar(<class 'numpy.float64'>, 3.0, <built-in function floordiv>) benchmark:

import operator

import numpy as np
import pandas as pd

# 20000 x 100 float64 frame, floor-divided by a float scalar
dtype = np.float64
arr = np.random.randn(20000, 100)
df = pd.DataFrame(arr.astype(dtype))
scalar = 3.0
op = operator.floordiv
In [2]: %timeit op(df, scalar)
56.4 ms ± 723 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)  <-- master
24.3 ms ± 236 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)  <-- PR
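
The idea of "don't try" can be sketched in plain Python as an operator table in which an entry of None means the accelerated engine should not be attempted at all. This is only an illustration of the dispatch pattern: `OP_STR` and `choose_path` are hypothetical names invented for this sketch, not the pandas implementation, and the set of operators listed is an assumption.

```python
import operator

# Hypothetical table mapping Python operators to the operator string an
# expression engine would understand. None marks "unsupported, do not try".
OP_STR = {
    operator.add: "+",
    operator.mul: "*",
    operator.truediv: "/",
    operator.floordiv: None,  # assumed unsupported, as described above
}


def choose_path(op):
    """Return 'accelerated' if the engine can take the op, else 'plain'."""
    return "accelerated" if OP_STR.get(op) is not None else "plain"


if __name__ == "__main__":
    for op in (operator.add, operator.floordiv):
        print(op.__name__, "->", choose_path(op))
```

Deciding up front from the table avoids attempting an evaluation that is known to be unsupported and only then falling back.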

@jorisvandenbossche jorisvandenbossche added the Performance (memory or execution speed performance) and Numeric Operations (arithmetic, comparison, and logical operations) labels Apr 1, 2021
Member

@jbrockmendel jbrockmendel left a comment

LGTM (I'm assuming the CI failures are unrelated MPL stuff)

@jorisvandenbossche
Member Author

There is still a failure in test_expressions.py to fix

@jreback jreback added this to the 1.3 milestone Apr 1, 2021
@jreback jreback merged commit e6f7e7b into pandas-dev:master Apr 1, 2021
@lithomas1
Member

lithomas1 commented Apr 1, 2021

@jreback Looks like this was red when merged and should be reverted.

e.g.

=================================== FAILURES ===================================
________ TestExpressions.test_bool_ops_raise_on_arithmetic[//-floordiv] ________
[gw1] linux -- Python 3.7.1 /usr/share/miniconda/envs/pandas-dev/bin/python

self = <pandas.tests.test_expressions.TestExpressions object at 0x7fc5e381b630>
op_str = '//', opname = 'floordiv'

    @pytest.mark.parametrize(
        "op_str,opname", [("/", "truediv"), ("//", "floordiv"), ("**", "pow")]
    )
    def test_bool_ops_raise_on_arithmetic(self, op_str, opname):
        df = DataFrame({"a": np.random.rand(10) > 0.5, "b": np.random.rand(10) > 0.5})
    
        msg = f"operator {repr(op_str)} not implemented for bool dtypes"
        f = getattr(operator, opname)
        err_msg = re.escape(msg)
    
        with pytest.raises(NotImplementedError, match=err_msg):
>           f(df, df)
E           Failed: DID NOT RAISE <class 'NotImplementedError'>

from https://github.com/pandas-dev/pandas/runs/2245866400
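
One way a "DID NOT RAISE" failure like this can arise is sketched below. The thread does not confirm the cause, so this is an assumption: if the check that rejects bool arithmetic is only reached when an operator string is available, then routing floordiv away from the accelerated path also skips the check. All names here (`UNSUPPORTED_FOR_BOOL`, `bool_check`, `evaluate`) are hypothetical, and plain lists stand in for DataFrames.

```python
import operator

# Hypothetical stand-ins; none of these names are taken from pandas.
UNSUPPORTED_FOR_BOOL = {"/", "//", "**"}


def bool_check(op_str, a, b):
    """Raise for arithmetic on all-bool inputs, as the test above expects."""
    if op_str in UNSUPPORTED_FOR_BOOL and all(isinstance(x, bool) for x in a + b):
        raise NotImplementedError(
            f"operator {op_str!r} not implemented for bool dtypes"
        )


def evaluate(op, op_str, a, b):
    # Assumption: the check is only reached when an operator string exists.
    if op_str is not None:
        bool_check(op_str, a, b)
    # Plain element-wise fallback.
    return [op(x, y) for x, y in zip(a, b)]


if __name__ == "__main__":
    a, b = [True, False], [True, True]
    try:
        evaluate(operator.floordiv, "//", a, b)
    except NotImplementedError as exc:
        print("raised:", exc)
    # With no operator string, the check is skipped and nothing raises.
    print(evaluate(operator.floordiv, None, a, b))
```

Under that assumption, a fix would need either to keep the check reachable for floordiv or to update the test's expectation.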

@jreback
Contributor

jreback commented Apr 1, 2021

whoops

hard to tell sometimes

ok

jreback added a commit to jreback/pandas that referenced this pull request Apr 1, 2021
jreback added a commit that referenced this pull request Apr 1, 2021
@jreback
Contributor

jreback commented Apr 1, 2021

reverted #40742

@jorisvandenbossche
Member Author

There was even a comment that I still needed to fix the test .. ;)

@jorisvandenbossche jorisvandenbossche deleted the ops-numexpr-floordiv branch April 2, 2021 06:29
vladu pushed a commit to vladu/pandas that referenced this pull request Apr 5, 2021
vladu pushed a commit to vladu/pandas that referenced this pull request Apr 5, 2021
LarWong pushed a commit to LarWong/pandas that referenced this pull request Apr 11, 2021
MarcoGorelli added a commit that referenced this pull request Apr 15, 2021
* TYP: Added overloads for fillna() in frame.py and series.py

* TYP: Added overloads for fillna() in frame.py and series.py #40737

* TYP: Added fillna() overloads to generic.py #40727

* TYP: removed generic overloads #40737

* fixed redundant cast error

* reverting prior changes

* remove cast again

* removed unnecessary overloads in frame.py and series.py

* fixed overloads

* reverted value typing

* remove extra types (lets keep this to overloads)

Co-authored-by: Marco Gorelli <[email protected]>
yeshsurya pushed a commit to yeshsurya/pandas that referenced this pull request Apr 21, 2021
yeshsurya pushed a commit to yeshsurya/pandas that referenced this pull request May 6, 2021
JulianWgs pushed a commit to JulianWgs/pandas that referenced this pull request Jul 3, 2021
JulianWgs pushed a commit to JulianWgs/pandas that referenced this pull request Jul 3, 2021
JulianWgs pushed a commit to JulianWgs/pandas that referenced this pull request Jul 3, 2021