BUG: Fix __truediv__ numexpr error #3764
didn't realize ne could do truediv...good to know fyi....easy to mod panel to use ne, series should be straightforward as well.... going to add an issue about that |
see #3765 |
FYI, I just pushed this: c278ca6, because of the original failing test. numexpr is only installed on 1 travis build (the 3.2), BUT it is under numpy 1.6.1, which causes weirdness....hmmm |
@jreback so why did it fail on 2.7? that test only runs if numexpr is installed. Also, probably a good idea to enable numexpr for one of the 2.x builds, right? |
it failed on 2.7, weird (failed for 3.2 for me)..... ...just testing enabling numexpr for all builds |
@jreback check out the travis ci link: https://travis-ci.org/jtratner/pandas/builds/7822069 . |
(that link is for a different build, obviously). Just for clarity - what needs to happen to incorporate this fix? Or are you saying this still fails in 3.2? |
that's your test though (numexpr IS enabled on that build)...look at the print_versions at the very bottom (you have to click it) |
I can't believe this never failed before...really weird https://www.travis-ci.org/jreback/pandas/jobs/7824130 |
@jreback fixed the test to not use div. obviously |
yep.....numexpr was only recently turned on....since we are now using it practically everywhere, going to turn it on |
@jreback so yeah, I'm happy to help out. probably can't get to putting this into |
I usually write the test first, make sure it breaks, then put the fix in. If I get too many commits or I am 'trying' to fix something...then I just squash them at the end... e.g. the HDFStore py3k, this was like 20 commits...squashed back into 5....had to try a bunch of things! |
@jreback is there any way to confirm that numexpr was triggered? I added some print statements for my own testing, but I'm unclear whether |
put in some temporary print statements around the actual call to ne.evaluate (I think I may have had them at one time, but took them out) |
FYI...you might be interested in the long discussion about #3393, ne going to get even heavier use going forward |
@jreback just to be clear, what I mean is that if you set

    def _evaluate_numexpr(op, op_str, a, b, raise_on_error=False, **eval_kwargs):
        result = None
        if _can_use_numexpr(op, op_str, a, b, 'evaluate'):
            try:
                a_value, b_value = a, b
                if hasattr(a_value, 'values'):
                    a_value = a_value.values
                if hasattr(b_value, 'values'):
                    b_value = b_value.values
                # This causes fallback to _evaluate_standard
                raise ValueError("unknown type object")
                result = ne.evaluate('a_value %s b_value' % op_str,
                                     local_dict={'a_value': a_value,
                                                 'b_value': b_value},
                                     casting='safe', **eval_kwargs)
            except ValueError as detail:
                if 'unknown type object' in str(detail):
                    pass
            except Exception as detail:
                if raise_on_error:
                    raise TypeError(str(detail))
        if result is None:
            result = _evaluate_standard(op, op_str, a, b, raise_on_error)
        return result

|
2 different issues here
answer your question? |
Thanks for digging into this with me - I appreciate you taking the time to I get what you're saying and it makes sense. However, test_expressions
|
numexpr rarely raises itself (mainly on a type issue, which we catch), or the catchall (which Exception), I think if some other kind of error, don't remember. We only bail on the catchall; otherwise it is computed as normal (which can itself raise an exception, but only in a weird case I think), so it's not tested per se that calling from a data frame works (and uses numexpr when it is installed). I suppose you could add some functions to that way you could do something like this:
and and
whatever |
Sounds like it's not really worth the time/overhead to check though. |
what do u mean? |
Well, as in: would it slow things down much to check for a global variable each time? do you think it's actually worth checking for? |
this was just for testing. you would have a single check inside evaluate that would set the test result. you could actually set the status each time, but you don't want to store the result at all because of the memory refs. easy enough to store the passed string and and the status |
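A test-only status check along the lines described above could look roughly like this. It is a minimal sketch and not the code in pandas: `set_test_mode`, `get_test_result`, `_store_test_result` and the toy `evaluate` (with its `min_elements` cutoff) are hypothetical stand-ins. Only the operator string and a boolean status are recorded, never the operands or the result, so no array references are kept alive.

```python
import operator

_TEST_MODE = False   # hypothetical flag, switched on only by tests
_TEST_RESULT = []    # (op_str, used_accelerated) pairs; never the arrays themselves


def set_test_mode(enabled=True):
    """Turn recording on or off and clear anything recorded so far."""
    global _TEST_MODE, _TEST_RESULT
    _TEST_MODE = enabled
    _TEST_RESULT = []


def _store_test_result(op_str, used_accelerated):
    """The single check inside evaluate: a no-op unless test mode is on."""
    if _TEST_MODE:
        _TEST_RESULT.append((op_str, used_accelerated))


def get_test_result():
    """Return the recorded statuses and reset the record."""
    global _TEST_RESULT
    result, _TEST_RESULT = _TEST_RESULT, []
    return result


def evaluate(op, op_str, a, b, min_elements=10000):
    """Toy dispatcher: pretend large inputs take the accelerated path."""
    used_accelerated = len(a) >= min_elements
    _store_test_result(op_str, used_accelerated)
    return [op(x, y) for x, y in zip(a, b)]
```

A test would call `set_test_mode(True)`, run the operation on a large frame, and then assert on `get_test_result()`. Outside of tests the cost is one boolean check per call.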
I want to leave the test to check that numexpr is actually used to be included with the #3765 , since that will probably turn into a more general check that has to happen for all expressions tests. Anything else I need to do before this can be merged? (Hopefully this gets into 0.11.1, since it reflects a numerical bug, rather than performance bug). |
Update on this: everything passes now and I added all the standard arithmetic operators to the test cases. This exposed a regression where Accelerating In summary, this pull request adds test cases for |
@jtratner see here: http://pandas.pydata.org/pandas-docs/dev/whatsnew.html (the module/div section). I believe numpy is wrong in returning 0.0000, but numexpr is correct in return nice job on this... |
Thanks! On mod/floordiv - can you show me where you implemented that fix? is that |
exactly
|
I think I figured out the issue...didn't include the explicit requirement to fill zeros on all...will fix soon. |
Also, any reason why __mod__ = _arith_method(operator.mod, '__mod__', default_axis=None, fill_zeros=np.nan) |
I was consistent with how numpy treats floats
|
Okay, this should resolve everything. I have a separate raft of commits that adds truediv and floordiv to all the containers as well as fixes up fill_zeros (which is still not consistent here). If it's okay with you, I'd rather handle all of those separately, because they make a logical group (and also because the multiple changes makes it harder to cherry-pick frame fixes to here). The key thing was to move the |
ok by me |
@jtratner this pr doesn't actually change the default to truediv right? just makes ne do it if its already enabled otherwise will need a big API warning |
No, From Python 2 --> Python 3 the behavior of |
This is failing because of issue in #3814 (I think). But I do think it should be incorporated for 0.11.1, given that it fixes a bug ( |
Only add 'div' if not python 3 Also checks dtype in test cases for series
by passing truediv=True to evaluate for __truediv__ and truediv=False for __div__.
`%` (modulus) intermittently causes floating point exceptions when used with numexpr. `//` (floordiv) is inconsistent between unaccelerated and accelerated when dividing by zero (unaccelerated produces 0.0000, accelerated produces `inf`)
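A minimal sketch of the explicit-`truediv` approach from the commit message above: the flag is passed straight to `numexpr.evaluate`, so the result does not depend on any caller's division setting. The helper `true_divide` is hypothetical (not pandas API), and it assumes a pure-Python fallback is acceptable when numexpr or numpy is not installed.

```python
import operator

try:
    import numpy as np
    import numexpr as ne
except ImportError:  # either package missing: use the pure-Python path
    np = ne = None


def true_divide(a, b):
    """Element-wise true division of two equal-length sequences of numbers."""
    if ne is not None:
        result = ne.evaluate(
            "a_value / b_value",
            local_dict={"a_value": np.asarray(a), "b_value": np.asarray(b)},
            casting="safe",
            truediv=True,  # never let the caller's division setting decide
        )
        return result.tolist()
    return [operator.truediv(x, y) for x, y in zip(a, b)]


print(true_divide([1, 2, 3], [2, 2, 2]))
```

With integer inputs, both paths should return floats rather than floored integers, which is the behaviour `__truediv__` is expected to have.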
@jreback okay, this is definitely ready to go. It doesn't change anything API-wise, except for fixing a bug with using true division with integer arrays of greater than 10,000 cells with numexpr installed. (And we know it's a bug because it means that we'd end up with different output than numpy arrays, etc.) |
BUG: Fix __truediv__ numexpr error
thanks @jtratner your contribs are great! keep em comin! |
thanks :)
|
Fixes a number of items relating to accelerated arithmetic in frame.

`__truediv__` had been set up to use numexpr, but this happened by passing the division operator as the string. numexpr's evaluate only checked 2 frames up, which meant that it picked up the division setting from `frame.py` and would do floor/integer division when both inputs were integers. You'd only see that issue with a dataframe large enough to trigger numexpr evaluation (>10000 cells).

This adds test cases to `test_expression.py` that exhibit this failure under the Python 2.7 full deps test. The test cases only test `Series` and `DataFrame` (though it looks like neither `Series` nor `Panel` use `numexpr`). It doesn't fail under Python 3 because integer division is totally gone.

Now `evaluate`, `_evaluate_standard` and `_evaluate_numexpr` all accept extra keyword arguments, which are passed to `numexpr.evaluate`.

The test case is currently a separate commit that fails. I wasn't sure whether I should have combined it with the bugfix commit or not. Happy to change it if that's more appropriate.
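To illustrate the frame-depth issue described above, here is a rough sketch of how a library function can look up a caller's `from __future__ import division` setting. It is not numexpr's actual source; `caller_has_true_division` and its `depth` parameter are hypothetical names. Whichever module sits `depth` frames up is the one whose setting gets picked up, which is how an intermediate module such as `frame.py` can end up deciding instead of the user's code.

```python
import sys
import __future__


def caller_has_true_division(depth=1):
    """Report whether the module `depth` frames up imported true division."""
    caller_globals = sys._getframe(depth).f_globals
    # `from __future__ import division` binds the name `division` in that module.
    return caller_globals.get("division") is __future__.division


# A caller that opted in to true division...
opted_in = {"check": caller_has_true_division}
exec("from __future__ import division\nanswer = check()", opted_in)

# ...and one that did not.
not_opted_in = {"check": caller_has_true_division}
exec("answer = check()", not_opted_in)

print(opted_in["answer"], not_opted_in["answer"])
```

Passing the flag explicitly as a keyword argument avoids depending on this kind of lookup at all.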