-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
abs behavior incorrect/not documented correctly #13549
Comments
Duplicate of #5796 - we decided not to special case typemin. Agreed that |
I have tried contributing to Julia in the past, and the pain (build failures, configuration issues, buggy packages, etc) has been too much of a barrier to me. I could write more on this topic, but I think http://danluu.com/julialang/ summarizes the state very well. I may do this within a few months, but since the fix is trivial, it may be easier for a regular contributor to do it before then. |
Sorry that you've experienced trouble with such things. The incredibly large scope of Julia, and all the platforms that it tries to support (multiple OSes, 32/64 bit/ARM, lots of dependencies), coupled with the fact that it is under active development mean that things are bound to break occasionally (though hopefully only on the master branch). But please do open issues if you experience such problems. |
Build failures have been pretty rare for quite a long time here. Anyway, you can make a PR to change the docs without even building Julia (actually, you can even make it from GitHub, though somebody will have to run a script afterwards). |
Probably the best place to document this behavior is in the Overflow behavior subsection of the manual. I had planned to draft something up and show it here, giving @gajjanag a fair opportunity to make the pull request prior to doing so myself. However, I found a wrinkle that may merit some further discussion and a decision before proceeding with this change. EDIT--I've withdrawn this thought, feel free to ignore the following: The wrinkle is documented in the next subsection, Division errors. Namely that integer division of
It strikes me as inconsistent to have I suggest that I've made a first attempt to do this myself but didn't quite make it. And I thought it wise to bring this point up for consideration before putting too much work into a change that may not be well received. Here is what I've done so far. It looked to me as if the
After building with this change, I had hoped to see:
However, I still get the |
I noticed another issue: overflow errors with abs, etc can propagate. For instance, the gcd function can trigger the behavior as it calls abs. This is not documented, and documenting just abs is insufficient: a client of gcd can't (and should not) expect to read abs to learn about gcd's overflow behavior. Thus, this issue is not at all a simple one, and a clean solution goes well beyond my level of expertise or energy for this. I hence defer it to others interested in this. Maybe as a start, the label on this can be changed: I doubt this issue is an "intro issue" anymore, since it requires significant work to fully and cleanly address. |
@gajjanag so far I've seen nothing that can't be grasped by understanding the policy that Julia's integer arithmetic is modular and that certain operations promote their types. I'm satisfied that this policy represents an appropriate balance between performance and safety. Prompted by your comment on
Can you provide a specific example of problematic I'd rather see |
@MichaeLeroy sorry for not being clear enough. Per the docs, gcd always returns "Greatest common (positive) divisor (or zero if x and y are both zero)." This is clearly being violated by your third example. It is fundamentally due to the fact that abs(typemin(Int)) == typemin(Int). |
@gajjanag I'd put that third example into the category of proper modular arithmetic, so it is not a violation of the "integer arithmetic is modular arithmetic" policy, and it has exactly the right value. I think that we both agree that |
@MichaeLeroy I don't agree, simply because of your documentation claiming it is positive, which is clearly incorrect above. If you claim it is abs(x), it needs to be documented. For me it is very clear and simple: a function's behavior needs to be documented correctly and concisely with no "edge cases" missing out. I am thus not comfortable with this situation, but I won't fix it myself. Personally, I don't mind a gcd that gave a value that is either positive or negative. But whatever solution gets adopted, it must be documented correctly. |
I withdraw my suggestion that
So there may be good reason to keep the current behavior of |
@gajjanag I'm sorry if I gave the wrong impression. I'm not arguing that the documentation can't or shouldn't be improved on this point. I'm merely saying that I think that it is perfectly reasonable that Please do have a go at modifying the manual and generating a pull request. I'll refrain from any such action for a decent interval to give you the first crack at closing this issue. My opinion is that it is sufficient to clarify and amplify the Overflow behavior subsection and that we need not add notes to each of the math functions that like |
@gajjanag Have you read the manual's FAQ on Integer Arithmetic? There is a great deal of helpful information there on the rationale for and consequences of making integer arithmetic modular arithmetic. From my perspective, referencing this FAQ entry in the Overflow behavior is a good way to resolve this issue. |
@MichaeLeroy: I read it, and although it is a great entry, it is by no means sufficient in terms of documentation. For instance, it is not at all clear why gcd needs to behave absurdly exactly on typemin(T) as one of the inputs, other 0, or both typemin(T) etc (I may even have missed some even after a single look at the code). How can a client of gcd expect to get that information from the FAQ entry? The only way he/she or other code can get such information is by digging into the source code currently. This is why I say the problem is not a simple one. All the FAQ does is a "general caveat". It tells nothing about specific functions and where they can trigger weird, unexpected behavior. |
Matlab:
No documentation explaining what, mathematically speaking, is clearly a confusing situation. @gajjanag, I think this is the kind of situation where it's either incumbent upon you to contribute a resolution, or to learn to live with the fact that numbers are represented with bits that are only approximations of mathematical objects and that these corner cases are just going to be weird one way or another. |
Yikes, that Matlab behavior is bananas. Having |
@gajjanag I certainly appreciate your position and see the value of having documentation that is exacting in calling out dangerous edge case behavior. However, exhaustively calling out edge cases on all functions may be impractical (perhaps impossible). Also a manual where each function's documentation is loaded down with caveats or tabulations of failure call-outs would arguably be quite user hostile, more confusing, and difficult to maintain than the current situation. Perhaps, I'm being unfair, making a straw man of your point. If so, could you set me right in my understanding? |
I think this is worth mentioning in the documentation of |
Or perhaps linking to a longer explanation about overflow and corner cases. Explaining this properly isn't exactly short, and I'd hate for the important stuff to get overwhelmed by material needed to explain what is literally one case out of 2^64. |
I like the idea of addressing some of the general documentation weaknesses that @gajjanag has called out. But I'd like to do so in a manner that is readable, practical and maintainable. It seems to me that the best place to do so is in the code, using the new Documentation facilities. If so, perhaps the thing to do is to close this issue with general improvements to the manual's documentation of integer overflow issues and to generate as a separate issue or strategy for the improvement of Julia function documentation. (Perhaps such a policy exists, I'm very new in my Julia development efforts and haven't yet read myself in on all of the development manuals.) The FAQ Entry does a great job of explaining why Julia uses modular integer arithmetic. As @timholy suggests, a more detailed discussion of the consequences of this behavior may be in order. Does it make sense to make this discussion a separate FAQ entry rather than messing around with an already well written section on the rationale? @gajjanag I imagine that you'd not be completely satisfied with such a solution, but are you interested writing/collaborating on such a FAQ entry. I may be too comfortable with the integer arithmetic is modular situation to do a great job of explaining its pitfalls. |
I think it may help if I explain where I am coming from, and how I found this out. I mostly work with C, where such integer overflow issues are often undefined behavior. There, these things can be security vulnerabilities, e.g if the abs() return is then used in some arithmetic to index a buffer. As such, I am quite paranoid about these things, and highly appreciate the fact that man abs on my system gives me this caveat, which is easily missed: after all it is a classic off by one and is an artifact of the standard two's complement behavior. I have been involved with FFmpeg lately on fixing and improving their gcd function, auditing abs usage (and replacing judiciously with a "negative abs" which is always safe), etc. I currently do not actively use Julia, but have done so in the past and may do so again in the future. I highly appreciate the Julia project, and am interested in helping out by pointing out these things as possible improvements. As for the exact solution, due to my current lack of investment in Julia, and my general lack of time for these things, I am afraid I can't be of much use. I see good ideas on this thread, and @MichaeLeroy above in particular has a solution that although not ideal in my view, is a very solid first step. |
Since we check that array accesses are within bounds, this is not nearly such a problem in Julia. |
@gajjanag thanks for your input and also for providing the background that you bring to this issue. BTW, I imagine that any improvements that you might be able to bring to Julia's This issue and #12989 point to seemingly contradictory missions for Julia's documentation. On the one hand to layout all of the gory details, caveats and limitations of an object and on the other to provide a very compact hint as to what this object is. Ideally the documentation can be structured to make it suitable for either purpose (and others besides). That's the point that I was getting at when I suggested that there ought to be a strategy for Julia's documentation. I'm not claiming that there is no such strategy, I've only just begun to learn the documentation system in the hope of seeing this strategy or of contributing to its development if the Julia community agrees that having such documentation is a good idea. Based on what's been said here, I'll take a crack at writing a FAQ section on integer overflow to supplement the existing one on why integer arithmetic is modular. I'll provide links to these FAQ sections in basic numbers section of the manual. I'll also suggest an update to the in-code documentation for |
Sounds great @MichaeLeroy, thanks in advance. |
There are |
+1 to checked_abs On Thursday, October 29, 2015, Scott P. Jones [email protected]
|
+1 to checked_abs as well. Thanks @ScottPJones for pointing out the availability of these overflow trapped functions. I'll mention them in the integer overflow FAQ entry that I've agreed to write. It is nice to know that such functions are available, but I hope that their use is quite rare and carefully considered. |
I made an attempt at a
I'm concerned that this test is dominated by data generation rather than the calls to If I hear no objections or advice to the contrary, I'll add this implementation of |
for benchmarking, I think it works better to put all the looping and the function into a function, and you may need to take care to pull out a value, so that it isn't all optimized away. |
@ScottPJones thanks for your advice on benchmarking, it certainly helped me to get to more reasonable tests. I did have to play around to get something that worked, examining
So there is a significant if not brutal penalty to using |
Thanks! Personally, if you need checked arithmetic (which I may), that penalty does not look so bad, and I'm sure can be improved on later - the important thing is you can submit a PR with that function (and we'll pray TPTB will bless it 😄 ) Good Job! 👍 |
@ScottPJones I appreciate your help and encouragement. Real life intrudes, so I can't get the PR done right now, but I'll give it a shot the next time I have a bit of time. |
It's very hard to make something that might raise an exception as fast as a few integer ops, so I wouldn't hold out much hope for making this that much faster. |
Is #13841 sufficient to close this? |
In addition to #13841, I've committed to writing a FAQ entry on Integer overflow to complement the already existing entry on native arithmetic. The focus on the new entry will be on how to cope with integer arithmetic. I hope to get this done by the end of the week. I'm more than happy to hear suggestions about what should be included in this entry, and this thread may be a good place for that to happen. But if it makes sense instead to close the issue and open a fresh one focused on this task, that would work for me as well. I'll try to seed/provoke some discussion around this FAQ entry by commenting with a simple outline in the next day or so. I'm hardly an expert on the topic, which doesn't deter me but does make me somewhat cautious, interested in advice, and open to correction. |
Sounds like a great contribution. I look forward to reading it! |
My work to provide an additional overflow FAQ entry is dragging on a bit, mostly because I had to put in some study to avoid too many ignorant pronouncements. I've gotten a basic grasp of the issue (I hope) and created an overflow_research repository to facilitate work on a draft (which is the README there). The draft is getting somewhat beyond the outline stage. Feel free to have a look and comment here regarding suggestions, complaints, criticisms, or exposes of my ignorance. In broad outline, the questions are addressed are:
I'll be off the grid for much of this weekend and otherwise engaged early next week. So don't be shocked if I fail to respond promptly to any comments or if development on this stalls until the middle of next week. |
This issue could be closed now I think. |
Currently (at least on 0.3.11, have checked master docs but not implementation), abs does not behave in a mathematically correct manner: abs(typemin(Int)) == typemin(Int) on my Linux x86-64 machine.
This problem plagues the C standard library abs(), llabs(), etc as well due to subtle asymmetry in 2's complement arithmetic. The C standard leaves behavior on INT_MIN, etc undefined, and this is fortunately documented, though easily missed by many programs.
I thus see 2 possible approaches:
The text was updated successfully, but these errors were encountered: