Change abs() to return unsigned integers #1017

pitdicker · 2015-03-26T06:26:24Z

Change the function abs() to return unsigned integers, instead of signed integers as it does today.
See http://internals.rust-lang.org/t/pre-rfc-change-abs-to-return-unsigned-integers/1747/5

…d-integers/1747/5

oli-obk · 2015-03-27T09:24:42Z

Another alternative: return an Option<T> that only contains the value if a positive value could be obtained (e.g. fails for float NaN, or int min value). Note: I'm not advocating this alternative, just listing it for completeness

Gankra · 2015-03-28T13:54:34Z

Do any languages with a signed-unsigned dichotomy actually do this? Java and C++ both do Self -> Self (Java only recently got support for dealing with unsigned values, though).

oli-obk · 2015-03-28T15:04:12Z

Ada's standard library complex numbers return a real value from the abs function.

   -- the interface definition
   function "abs"   (Right : Complex) return Real'Base;
   function "abs"   (Right : Imaginary) return Real'Base;

Then there are the vectors and matrices (http://www.adaic.org/resources/add_content/standards/05rat/html/Rat-7-6.html)

There are two operations "abs" applying to a Real_Vector thus 
function "abs"(Right: Real_Vector) return Real_Vector;
function "abs"(Right: Real_Vector) return Real'Base;

One returns a vector each of whose elements is the absolute value of the corresponding element of the parameter (rather boring) and the other returns a scalar which is the so-called L2-norm of the vector. This is the square root of the inner product of the vector with itself or √(Σxixi) – or just √(xixi) using the summation convention (which will be familiar to those who dabble in the relative world of tensors). This is provided as a distinct operation in order to avoid any intermediate overflow that might occur if the user were to compute it directly using the inner product "*".

Note: the Integer type's abs function does not yield a Natural type result. (Natural is defined as a subtype of Integer, which is like restricting an i32 to only zero or positive values by automatically inserting runtime-checks). This is most likely an oversight and still there for backwards compatibility, but I could not find any sources proving that.

oli-obk · 2015-03-30T15:50:56Z

Another alternative: insert debug checks similar to overflow checks to prevent Int::min_value() to be used in abs. Note: you get this for free if abs returns a unsigned type, since you then have to cast, which again checks for overflow.

pitdicker · 2015-03-30T17:25:53Z

Interestingly, Firefox does not use the abs function of the C standard library, but a custom one that returns unsigned integers:
https://bugzilla.mozilla.org/show_bug.cgi?id=847480
http://whereswalden.com/2013/04/30/introducing-mozillaabs-to-mfbt/

These links also explain why C returns signed integers by default (because when the result is used in a formula, other values are automatically promoted to unsigned). This is a problem that Rust does not have.

llogiq · 2015-04-01T06:59:02Z

@gankro: Java not only has incomplete support for working with unsigned integers. As of Java 8 there is no notion of (un)signedness in the type system, and the defined operations on java.lang.Long/java.lang.Integer don't even fully support the usual arithmetics.

C/C++ has the problem of implicit coercions, as pitdicker already pointed out.

This change would make the result strictly more correct, also the type would give a strong indication of the possible return values. Of course this would require explicit casts in those cases where calculations should continue with signed values, but this serves to make the problematic case (min_value) more obvious. I can also imagine it would help debugging if the failure comes from the cast operation instead of the abs function.

rkjnsn · 2015-04-02T17:59:03Z

👍 I have seen some of the problems abs can cause in C. It can even lead to subtle security vulnerabilities when an algorithm assumes a positive result, and an attacker is able to construct an input that results in INT_MIN being passed. I think the proposal here is definitely the way to go. It makes the default behavior correct, while still leaving all of the other options open to the programmer.

aturon · 2015-04-02T23:42:17Z

@pnkfelix @nikomatsakis Thoughts?

nikomatsakis · 2015-04-03T00:17:23Z

👍 I've been bitten by abs(INT_MIN) == INT_MIN causing real-world bugs before as well, and an unsigned result seems fairly logical. The analysis of actual uses suggests that this wouldn't really break much code "in the wild" -- and it's pretty easy to workaround.

pnkfelix · 2015-04-07T16:44:37Z

Regarding the RFC itself, this seems ... incomplete. In particular, support for an fn abs method like this seems like it would require adding an associated type to the trait, something like this:

trait SignedInt {
    type UnsignedVariant;
    fn abs(&self) -> Self::UnsignedVariant;
    ...
}

right?

pnkfelix · 2015-04-07T16:45:07Z

@aturon my gut reaction is that I usually expect operations like abs to be transformations that stay within their domain: abs: N -> N.

But it seems like a fair number of people here are in favor of this change, and it sounds like the real world bugs here may be relevant.

@nikomatsakis when you say "pretty easy to workaround", I wonder if you are expecting people to use as, or a checked-variant ... (where using as just reinjects the bug here ...)

Anyway, I do not object to this RFC, assuming the oversight noted above is addressed in some way.

barosl · 2015-04-08T01:39:55Z

I think that we should make sure whether we want the unsigned result or not. I don't quite remember, but in most cases the results have been more likely to be casted to signed. In that case, the occurrence of min_value() will be catched only in a debug build, as the overflow behavior is only enabled in it.

I'm more fond of doing a runtime check and panicking, or even making abs() return Option<T>. These two methods are a bit more expensive, but it certainly ensures that you will never meet such an invalid value even in a debug build.

oli-obk · 2015-04-08T08:08:52Z

@barosl if you look at http://internals.rust-lang.org/t/pre-rfc-change-abs-to-return-unsigned-integers/1747/5?u=ker you can see that there's not a single case in rustc that casts it back to signed (except for the abs unit tests, but that's not a use case).

I could check out the most popular crates and make the same analysis on them...

I'm more fond of doing a runtime check and panicking, or even making abs() return Option. These two methods are a bit more expensive, but it certainly ensures that you will never meet such an invalid value even in a debug build.

could this be combined? by defining a trait for the abs function that is implemented twice. once returning a signed version with all the necessary checks in place, and once returning the unsigned version without any checks?

ftxqxd · 2015-04-08T10:17:41Z

I've always thought that abs was the most obvious use of unsigned integers in any language, so it's consistently surprised me that basically every language seems to have abs return a signed integer instead. In fact, I remember trying to use abs as a way of converting a signed integer to an unsigned one, where the exact means wasn't important. (I don't quite remember the exact reason, but it was probably for testing purposes where I was, for example, trying to draw a sine wave onto a canvas that expected positive coordinates.)

@pnkfelix SignedInt and the other numeric traits are deprecated in favour of inherent methods on the built-in numeric types, so I don't think the RFC really needs any more detail in that regard.

pnkfelix · 2015-04-08T11:08:26Z

The team discussed this in the weekly meeting last night: https://github.com/rust-lang/meeting-minutes/blob/master/weekly-meetings/2015-04-07.md#change-abs-to-return-unsigned

There was opposition to changing the type signature of abs.

(I did suggest the more conservative option of treating INT_MIN.abs() as an overflow condition -- i.e. it would be allowed to panic when debug-assertions are enabled. The response to this was that it would not accomplish the actual goal here, which I can accept.)

It seems like this is not going to be accepted, and instead one should write abs(a) as u32, which I believe will always work going forward. (Note that as does not check for overflow; see the updated RFC 560 for discussion.)

nikomatsakis · 2015-04-08T17:22:20Z

(I did suggest the more conservative option of treating INT_MIN.abs() as an overflow condition -- i.e. it would be allowed to panic when debug-assertions are enabled. The response to this was that it would not accomplish the actual goal here, which I can accept.)

It's true that we talked about this in the meeting, but I think we didn't give it the consideration it deserves. It feels to me like abs should employ a debug-assert that the input is not INT_MIN.

pnkfelix · 2015-04-09T06:37:32Z

@nikomatsakis but then at that point we should probably offer both variants (one that returns a signed int that may panic when given the minimum value, and another that returns an unsigned int that is guaranteed not to panic), right?

bombless · 2015-04-09T12:48:08Z

more like

trait SignedInt {
    fn abs(&self) -> <Self as ToUnsigned>::Target where Self: ToUnsigned {
        let x = if self < 0 { -self } else { self };
        ToUnsigned::to_unsigned(x)
    }
    ...
}

rkjnsn · 2015-04-09T17:20:00Z

I can understand not wanting to change the return type of abs now that beta has been released, though it's unfortunate. I still think the best solution is to change the return type of abs. It brings the edge case to the user's attention, but the user can easily cast back to signed if that's what he/she needs.

If that is not possible at this stage, I think it is very important that abs does a debug assert, and that we provide a uabs or similar variant that returns an unsigned integer. Furthermore, the documentation for abs should make it clear that the user needs to check for INT_MIN before calling it, and strongly recommend that he/she uses uabs instead if at all possible.

pitdicker · 2015-04-09T19:32:38Z

Just added the alternatives discussed above to the rfc rendered.

@pnkfelix You are right that the rfc seems incomplete. But I did not know what to put in the detailed design section, as the implementation is only a couple of lines :).

iopq · 2015-04-10T00:00:04Z

I think this is a very good idea. This statically guarantees no overflow vs. the debug assert solution that is SOMETIMES triggered by tests, but sometimes not covered.

theemathas · 2015-04-10T02:50:08Z

I believe that uabs is very important, since you can easily implement the old abs in terms of uabs (just add as), but not vice versa.

Gankra · 2015-04-10T03:36:09Z

@theemathas this is simply incorrect. uabs is easily implemented by abs due to the fact that two's complement INTEGER_TYPE::MIN has the same bitwise representation as the positive unsigned value. That is -128i8 and 128u8 are the same bitwise representation: 1000...000.

This runs successfully:

use std::i8;

fn main() {
    for (i, u) in (i8::MIN..1).rev().zip(0u8..) {
        assert_eq!(u, i.abs() as u8);
    }
}

liigo · 2015-04-10T05:54:01Z

+1 to this change (and fix bug/oversight of `INT_MIN.abs() == INT_MIN`). It's never late to fix bug, we're still pre-1.0 final. (Maybe the core team made a wrong decision here.)

pnkfelix · 2015-04-10T09:47:49Z

@gankro if we adopt the potential-panic-on-int-min semantics for abs, then one will indeed not be able to implement uabs in terms of abs. (I admit that the text of @theemathas was ambiguous about whether this was what was meant.)

theemathas · 2015-04-10T09:57:44Z

@gankro @pnkfelix I thought that the plan is that as will panic on debug builds in the case of overflow.

In any case, you can always have a branch checking for the minimum integer, but if you are going that far, you might as well implement abs() yourself

pnkfelix · 2015-04-10T10:09:17Z

@theemathas No, plans changed. as does not and will not check for overflow; see the updated RFC 560 for discussion.

pnkfelix · 2015-04-17T12:51:19Z

So, while mistakenly looking for this thread on the rust repo, I did find these old links:

which is mostly a historical curiosity.

My immediate question is this: Should be mark the fn abs method as unstable now on the 1.0 beta channel, so that we do not rush into a decision on this matter now?

Or should we just make a decision, period?

(Update: Or should we add stable fn uabs method, mark fn abs as unstable, and plan to change fn abs to potentially panic?)

pitdicker · 2015-04-17T16:17:04Z

@pnkfelix
What would be the effects of marking abs as unstable? It would be unavailable on beta and stable. Probably libraries that use abs would just implement it themselves (I have already seen 2 that provide their own wrapper).
And it may break more crates than implementing this RFC. I would vote for a decision :)

I have my doubts about adding a seperate fn uabs. Why would integer types need two functions, but float and big integers only one? How would this work together with a possible trait for abs?
And, altough I started the RFC and would like to see it 'right', I think this function is not important enough to have two versions of.

nikomatsakis · 2015-04-17T16:50:08Z

@pitdicker well, fixed-size integers are the only ones vulnerable to this discontinuity. One can make the same argument but about types: why should integer types change to "unsigned" variances but not floating-point or bigint?

My current feeling is that abs should be implemented as:

if value < 0 { -abs } else { abs }

this implies that it may be panic on overflow.

I have no objection to adding uabs. I do agree this a lot of fns for such a simple thing, but I like that the name is different and the type signature is different (uabs is not T -> T, it's T -> U).

pitdicker · 2015-04-17T16:50:22Z

Previous weekend brson was kind enough to test what impact this change would have on the crates on crates.io.
Of the 989 crates tested, 492 build succesfully before the change. I had to modify 7 crates to get all to build again:

glm-0.1.16
image-0.3.7
nalgebra-0.2.10
num-0.1.22
pitch_calc-0.9.3
quickcheck-0.2.12
utp-0.2.3

glm and nalgebra contain a wrapper for abs, so if the return type of abs changes, so must the return type of these functions (or not, depending on the goals of the authors).

num uses abs for the implementations of fn gcd. This function is plagued by the same corner case as abs, as gcd(MIN, MIN)==MAX+1. So this highlights a potential error.

image, pitch_calc and utp use abs when calculating the difference between two integers (e.g. (a-b).abs). The values are supposed to be very small compared to the integer size, so changing abs to unsigned does not have much benefit here. However, this way of calculating a difference can easily overflow if the values are larger, and I have a WIP crate just for cases like this.

For quickcheck, this code goes over my head... :)

Conclusion: I think the impact is small, but could be worth it especially when a crate fails to build.
And as a quick fix for a crate, a single cast is all that is needed to restore the old behaviour.

rkjnsn · 2015-04-17T16:54:48Z

I think there should definitely be an abs that returns unsigned. A version correct for all inputs should be easily accessible and usable. I think changing the current abs function makes the most sense. The default behavior will be correct, and any user who needs to can still easily cast back to a signed it, explicitly acknowledging the range limitations.

If, however, the current return type of abs isn't changed (due to proximity to our final release or any other reason), I think we absolutely should introduce a separate uabs function. Again, correct behavior should be easily accessible.

alexcrichton · 2015-06-02T23:18:34Z

This RFC unfortunately did not reach consensus before the 1.0 release, and these methods have now been stabilized. The abs method does indeed panic on overflow (following the panic semantics of the rest of the standard library), however. Due to this API being unable to change now I'm going to close this RFC.

Thanks regardless for the RFC @pitdicker!

rkjnsn · 2015-06-02T23:40:20Z

To propose the addition of a uabs method, would it be better to start with an RFC or a pull request? (I know there was some discussion about major vs. minor changes, etc.)

alexcrichton · 2015-06-02T23:44:06Z

I'd recommend an RFC for that API. The libraries subteam hasn't quite settled on what concretely needs an RFC and what doesn't, but I suspect that the guidelines would recommend an RFC for a method like uabs.

Gankra · 2015-06-03T02:21:11Z

(I would personally accept a PR for it; but that's just me)

http://internals.rust-lang.org/t/pre-rfc-change-abs-to-return-unsigne…

8be4d27

…d-integers/1747/5

pitdicker changed the title ~~http://internals.rust-lang.org/t/pre-rfc-change-abs-to-return-unsigned-i...~~ Change abs() to return unsigned integers Mar 26, 2015

pitdicker mentioned this pull request Mar 27, 2015

Stabilize std::num rust-lang/rust#23549

Merged

Added more alternatives

9eab0d8

Fix link

52f178d

nrc assigned brson Apr 9, 2015

pnkfelix mentioned this pull request Apr 17, 2015

Check for overflow in arithmetic negation rust-lang/rust#24500

Merged

pnkfelix mentioned this pull request May 13, 2015

std::i32::MIN.abs() results in panicked at 'arithmetic operation overflowed' rust-lang/rust#25378

Closed

alexcrichton added the T-libs-api Relevant to the library API team, which will review and decide on the RFC. label May 18, 2015

alexcrichton closed this Jun 2, 2015

glaebhoerl mentioned this pull request Jul 16, 2016

Add non-panicking abs() functions to all signed integer types. #1678

Closed

sollyucko mentioned this pull request Apr 25, 2020

uabs/unsigned_abs #2914

Closed

Change abs() to return unsigned integers #1017

Change abs() to return unsigned integers #1017

Conversation

pitdicker commented Mar 26, 2015

oli-obk commented Mar 27, 2015

Gankra commented Mar 28, 2015

oli-obk commented Mar 28, 2015

oli-obk commented Mar 30, 2015

pitdicker commented Mar 30, 2015

llogiq commented Apr 1, 2015

rkjnsn commented Apr 2, 2015

aturon commented Apr 2, 2015

nikomatsakis commented Apr 3, 2015

pnkfelix commented Apr 7, 2015

pnkfelix commented Apr 7, 2015

barosl commented Apr 8, 2015

oli-obk commented Apr 8, 2015

ftxqxd commented Apr 8, 2015

pnkfelix commented Apr 8, 2015

nikomatsakis commented Apr 8, 2015

pnkfelix commented Apr 9, 2015

bombless commented Apr 9, 2015

rkjnsn commented Apr 9, 2015

pitdicker commented Apr 9, 2015

iopq commented Apr 10, 2015

theemathas commented Apr 10, 2015

Gankra commented Apr 10, 2015

liigo commented Apr 10, 2015 via email

pnkfelix commented Apr 10, 2015

theemathas commented Apr 10, 2015

pnkfelix commented Apr 10, 2015

pnkfelix commented Apr 17, 2015

pitdicker commented Apr 17, 2015

nikomatsakis commented Apr 17, 2015

pitdicker commented Apr 17, 2015

rkjnsn commented Apr 17, 2015

alexcrichton commented Jun 2, 2015

rkjnsn commented Jun 2, 2015

alexcrichton commented Jun 2, 2015

Gankra commented Jun 3, 2015