Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

REF: move check for disallowed bool arithmetic ops out of numexpr-related expressions.py #41161

Conversation

jorisvandenbossche
Copy link
Member

@jorisvandenbossche jorisvandenbossche commented Apr 26, 2021

The current _bool_arith_check helper function in expressions.py does 2 things: 1) check for operations we do not allow in pandas for boolean data (division, power) and 2) check for ops that are not supported by numexpr but have a pure python fallback.

That first aspect is not related to numexpr, though: we always check this (and raise an error if appropriate), whether numexpr is used or not.

So this PR does:

  • Move the check for disallowed bool ops out of the numexpr code to array_ops.py. This is a cleaner separation of concerns, and ensures that the check is done consistently (eg if not calling floordiv through numexpr (PERF: numexpr doesn't support floordiv, so don't try  #40727), or if in the future not calling expressions.py if we know we can't use numexpr, such as in PERF/REF: Check use of numexpr earlier in the DataFrame operation #41122)
  • In expressions.py, move the remaining check for unsupported ops (by numexpr) inside an except block: this way, we don't check up-front (for all ops, for all types), and only do the types check if there was actually a NotImplementedError. That avoids paying the overhead costs for all other ops.

@jorisvandenbossche jorisvandenbossche added Numeric Operations Arithmetic, Comparison, and Logical operations Clean labels Apr 26, 2021
@jorisvandenbossche jorisvandenbossche added this to the 1.3 milestone Apr 26, 2021
with tm.assert_produces_warning(UserWarning):
# "evaluating in Python space because ..."
op(s, e.value)
op(s, e.value)
Copy link
Member Author

@jorisvandenbossche jorisvandenbossche Apr 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a wrong test, but because putting assert_produces_warning inside pytest.raises doesn't fail if it doesn't warn, this wasn't noticed (encountered this in #40325 (comment) as well).
The ops for which we raise a warning vs raise NotImplementedError are not overlapping.

# numexpr is used or not
# with tm.assert_produces_warning(UserWarning):
# # "evaluating in Python space because ..."
op(s, e.value)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that it is no longer warning is I think correct, since we are not using numexpr for such a small test case (so the warning about "not supported by numexpr" is not relevant for the user).
But, we should still test that we do raise the warning when numexpr is used. This can probably be done in a similar was as #40463

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will take care of this in #41178

@@ -205,6 +205,10 @@ def arithmetic_op(left: ArrayLike, right: Any, op):
# Timedelta is included because numexpr will fail on it, see GH#31457
res_values = op(left, right)
else:
# TODO we should handle EAs consistently and move this check before the if/else
# (https://github.com/pandas-dev/pandas/issues/41165)
_bool_arith_check(op, left, right)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened an issue for this: #41165

@jreback
Copy link
Contributor

jreback commented Apr 26, 2021

looks ok, can you rebase as merged the numexpr test updates.

@jorisvandenbossche jorisvandenbossche merged commit d7b5a10 into pandas-dev:master Apr 28, 2021
@jorisvandenbossche jorisvandenbossche deleted the ops-refactor-bool-arithm-check branch April 28, 2021 10:22
yeshsurya pushed a commit to yeshsurya/pandas that referenced this pull request May 6, 2021
JulianWgs pushed a commit to JulianWgs/pandas that referenced this pull request Jul 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Clean Numeric Operations Arithmetic, Comparison, and Logical operations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants