Use checked arithmetic by default #9469

brson · 2013-09-24T18:21:43Z

There's a lot of concern that having +, -, and * overflow by default is incorrect - it's a source of many security vulnerabilities. Let's try changing the default to being checked and measure the impact on performance and code size.

There are some open questions:

what happens on overflow? Probably fail!. It could also raise a condition, but having codegen raise conditions is a big step to take, and we're not sure we like conditions.
how do you turn it off? Either with unchecked blocks, explicit methods, or additional types.

Nominating well defined.

The text was updated successfully, but these errors were encountered:

thestinger · 2013-09-24T18:34:42Z

You can try using it via the checked arithmetic traits I implemented, and there's a huge overhead in terms of both performance and code size. LLVM is almost never capable of removing the checks, just like it rarely actually removes array bounds checks. It makes whole passes like loop and scalar vectorization useless.

Failing on overflow doesn't make the incorrect code any more correct, since a failure is still going to be a bug in 99% of programs. I feel strongly that the correct solution is to make integer literals not default to a signed fixed-size integer, and making generic literals work for big integers.

The most common integer type for application logic could be the one proposed in #8635.

thestinger · 2013-09-24T19:42:14Z

Here's an optimized IR comparison, to give an idea of the code generation when it can't eliminate the check. In these examples, it can't eliminate it because there are inputs to a function (it must inline deep enough to see the values or range of values, to have a hope of eliminating them):

fn foo(x: u32, y: u32) -> u32 {
    x + y
}

; Function Attrs: nounwind readnone uwtable
define i32 @_ZN3foo17h73f0a05a6ca431af4v0.0E({ i64, %tydesc*, i8*, i8*, i8 }* nocapture readnone, i32, i32) #2 {
"function top level":
  %3 = add i32 %2, %1
  ret i32 %3
}

Checked operation, calling fail on overflow:

fn bar(x: u32, y: u32) -> u32 {
    x.checked_add(&y).unwrap()
}

; Function Attrs: uwtable
define i32 @_ZN3bar17h73f0a05a6ca431av4v0.0E({ i64, %tydesc*, i8*, i8*, i8 }* nocapture readnone, i32, i32) #3 {
"function top level":
  %3 = alloca %str_slice, align 8
  %4 = alloca %str_slice, align 8
  %5 = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 %1, i32 %2) #1
  %6 = extractvalue { i32, i1 } %5, 1
  br i1 %6, label %match_else.i, label %_ZN6option6Option6unwrap21h1283b974be2d27c1lBau4v0.0E.exit

match_else.i:                                     ; preds = %"function top level"
  %7 = bitcast %str_slice* %3 to i8*
  call void @llvm.lifetime.start(i64 16, i8* %7)
  %8 = bitcast %str_slice* %4 to i8*
  call void @llvm.lifetime.start(i64 16, i8* %8)
  %9 = getelementptr inbounds %str_slice* %3, i64 0, i32 0
  store i8* getelementptr inbounds ([44 x i8]* @str1262, i64 0, i64 0), i8** %9, align 8
  %10 = getelementptr inbounds %str_slice* %3, i64 0, i32 1
  store i64 43, i64* %10, align 8
  %11 = getelementptr inbounds %str_slice* %4, i64 0, i32 0
  store i8* getelementptr inbounds ([46 x i8]* @str1263, i64 0, i64 0), i8** %11, align 8
  %12 = getelementptr inbounds %str_slice* %4, i64 0, i32 1
  store i64 45, i64* %12, align 8
  call void @"_ZN3sys28FailWithCause$__extensions__9fail_with20hdb4c44d01ce41167qaF4v0.8E"({}* sret undef, { i64, %tydesc*, i8*, i8*, i8 }* undef, %str_slice* %3, %str_slice* %4, i64 323)
  unreachable

_ZN6option6Option6unwrap21h1283b974be2d27c1lBau4v0.0E.exit: ; preds = %"function top level"
  %13 = extractvalue { i32, i1 } %5, 0
  %14 = bitcast %str_slice* %3 to i8*
  call void @llvm.lifetime.start(i64 16, i8* %14)
  %15 = bitcast %str_slice* %4 to i8*
  call void @llvm.lifetime.start(i64 16, i8* %15)
  call void @llvm.lifetime.end(i64 16, i8* %14)
  call void @llvm.lifetime.end(i64 16, i8* %15)
  ret i32 %13
}

thestinger · 2013-09-24T19:53:41Z

Note that the function is also no longer marked readnone/nounwind so this bubbles up into the rest of the code and results in many more invoke instructions and landing pads being generated along with the call being viewed as impure.

jfager · 2013-09-24T23:18:10Z

I agree that if safety is the motivation using bigints by default makes more sense. Failing on overflow doesn't seem that much better than wrapping in day-to-day development. But it seems like using something like GMP is a better option than a rust impl - why fight that battle again?

I'd be disappointed if unchecked arithmetic required anything more than just specifying a fixed-width int type. I like Rust because it balances safety and control. Making unchecked arithmetic hard feels like it goes too far over to the safety side of the fence.

Fwiw this is the approach Haskell takes.

thestinger · 2013-09-24T23:24:31Z

I don't there should should be a default at all, just generic literals and the choice to use what's appropriate by not falling back to int. There are cases when you want a big integer, and others where you want a flexible type expanding to one only as necessary.

pnkfelix · 2013-09-26T17:12:28Z

cc me

catamorphism · 2013-09-26T17:12:59Z

Accepted for well-defined

UtherII · 2013-10-13T00:58:47Z

About how to turning checked arithmetic off, would it be possible to use unsafe block but allowing to make them more specific about unsafe operations.

For example something like:

    unsafe "int_overflow" {
        // integer overflows don't fail, others unsafe operations are still forbidden
    }
    unsafe "ffi", "raw_ptr"{
        // ffi and dereferencing raw pointers allowed, others unsafe operations are still forbidden
    }

Thiez · 2013-10-13T22:49:58Z

Overflow is not unsafe when you explicitly want it to occur, so I don't think it makes sense to wrap it in an 'unsafe' block.

Perhaps some new types that implement the arithmetic traits but perform checked arithmetic? Or a Checked<T> wrapper? We already have the Checked_<Op> traits defined on int.

thestinger · 2013-10-13T22:53:11Z

I think a type-based solution is best. Signed/unsigned integer types using the checked operations and expanding to a big integer on overflow would make sense. I don't think there's much benefit to a type throwing a failure on an overflow, because that's still going to be a bug in the application.

Something to note is that the checked operations aren't implemented in hardware on every architecture. We need to start using compiler-rt to provide cross-platform support. A notable example is the lack of 64-bit checked multiplication on i686.

pnkfelix · 2013-10-14T20:55:16Z

I am surprised by the statements that a fail-on-overflow is not much different from the semantic errors you get from modular-arithmetic wrapping; I think that a proper fail! on overflow is a very different development experience than an under- or overflowed value then propagating somewhere else in the application.

(But do not interpret the above as a vote against bigints; I do agree they make the most sense in a number of contexts, though I'm not yet sure how best to incorporate them.)

@thestinger what are your thoughts about how to handle underflow for an unsigned integer type? I assume you would want at least that to fail! ?

thestinger · 2013-10-14T21:05:42Z

@pnkfelix: Underflow would still expand to a big integer, I was thinking it might make sense performance-wise to have coverage over different ranges before expanding to a big integer. It's probably enough to just have a single type based on int using an enum.

I don't really see much point in having an unsigned big integer type like we have in extra and I can't find precedence for it in other languages/libraries.

pnkfelix · 2013-10-14T21:15:03Z

@thestinger Ah, I see. I had thought you had meant that an unsigned fixed-width uint would overflow to an unsigned big uint. But now I interpret your suggestion as saying that uint should over- and underflow to signed integer.

thestinger · 2013-10-14T23:33:44Z

@pnkfelix: I don't want to change the semantics of the built-in integers from what they are now, but rather introduce a new type expanding to a big integer when necessary. There isn't a need to have an unsigned version, it was a silly thought.

jfager · 2013-10-15T16:42:44Z

@pnkfelix It is a different developer experience (fail-fast vs. fail-mysteriously-at-some-point) and if those were the only two options with no other tradeoffs fail-fast would definitely be the preferable one. But they're both still runtime failures. For either approach, your code has to be aware of and prevent overflow conditions in order to not blow up.

@thestinger's suggestion seems to make way more sense. You write code blissfully ignoring overflow issues, and overflow issues never bite you. And if you want overflow or to not pay the cost of overflow checking, then you can just go back to using standard fixed-width integer types and accept the tradeoff.

thestinger · 2013-10-15T16:45:42Z

If we implemented #6023 and #4169, alternate types would be as first-class as the built-in integers (excluding constant expressions) and there wouldn't be a push from the language to use the fixed-size ones. There would be no default, as all integer types would be equal.

tiffany352 · 2013-10-19T02:19:03Z

It'd be nice if the type system could compute the values that are valid for any given function that don't cause overflow and limit it to those, but that would require dependant types.

As an example, you could have a function which adds two numbers, and the type system would figure out that A+B can never exceed UINT_MAX, which would be expressed through the type system and enforced at each invocation. If you took one number from user input, and added it to a constant foo, the integer read from user input could only be as large as UINT_MAX-foo without returning None

thestinger · 2014-01-12T21:49:22Z

Simple example:

extern mod extra;
use std::unstable::intrinsics::{abort, u32_mul_with_overflow};
use extra::test::BenchHarness;

#[inline(never)]
fn control(xs: &mut [u32]) {
    for x in xs.mut_iter() {
        *x *= 5;
    }
}

#[inline(never)]
fn check(xs: &mut [u32]) {
    for x in xs.mut_iter() {
        unsafe {
            let (y, o) = u32_mul_with_overflow(*x, 5);
            if o {
                abort()
            }
            *x = y;
        }
    }
}

#[inline(never)]
fn check_libstd(xs: &mut [u32]) {
    for x in xs.mut_iter() {
        *x = x.checked_mul(&5).unwrap();
    }
}

#[bench]
fn bench_control(b: &mut BenchHarness) {
    b.iter(|| {
        let mut xs = [0, ..1000];
        control(xs)
    });
}

#[bench]
fn bench_check(b: &mut BenchHarness) {
    b.iter(|| {
        let mut xs = [0, ..1000];
        check(xs)
    });
}

#[bench]
fn bench_check_libstd(b: &mut BenchHarness) {
    b.iter(|| {
        let mut xs = [0, ..1000];
        check_libstd(xs)
    });
}

--opt-level=2

test bench_check        ... bench:      1085 ns/iter (+/- 28)
test bench_check_libstd ... bench:      1082 ns/iter (+/- 38)
test bench_control      ... bench:       349 ns/iter (+/- 12)

--opt-level=3

test bench_check        ... bench:      1080 ns/iter (+/- 14)
test bench_check_libstd ... bench:      1177 ns/iter (+/- 16)
test bench_control      ... bench:       350 ns/iter (+/- 11)

Ouch. It becomes a larger slowdown multiplier when you add more operations to the loop too. Since it's increasing the code size a lot, it will bloat the instruction cache too.

thestinger · 2014-01-12T21:50:44Z

We will be slower than Java and Haskell on numeric loops. Benchmarks of Rust will look absolutely terrible unless you're already an expert with the language and know the default integers are slow. Ignoring scalar/loop optimizations on integers in a language claiming to be aimed at replacing C++ just isn't going to work.

brson · 2014-02-19T21:21:49Z

Nominating for removal from milestone.

hatahet · 2014-02-19T21:28:18Z

Would it be feasible to have some CheckedNum trait, with corresponding numeric classes implementing it, and overloading the numerical operators?

pnkfelix · 2014-02-27T21:25:14Z

there is a lot of open-research to do on this topic, so we cannot make it a high priority for 1.0, which means we cannot make it something we turn on my default.

If we want a separate opt-in trait for checked arithmetic, or a compiler flag, etc, then those are things to consider, but they should be in a separate bug.

Closing as wontfix.

R030t1 · 2020-08-24T16:33:37Z

A method to handle arithmetic correctness as a language feature would be extremely valuable. A program aborting due to an arithmetic error is usually better than the program silently continuing with data corruption.

It may be convenient to write this off as not worth it by virtue of being intractable at Rust's level, but finally pointing out it is a massive headache to have incomplete arithmetic units on processors may finally get somewhere. (Yes, I'm aware of the overflow flags present on basically every architecture, but that is different than generating an interrupt.)

"We Need Hardware Traps for Integer Overflows" https://blog.regehr.org/archives/1154

Fix FormatArgsExpn parsing of FormatSpec positions Woops, forgot visitors don't walk themselves Fixes rust-lang#9468 r? `@giraffate` changelog: none

pnkfelix mentioned this issue Oct 21, 2013

RFC: rename int and uint to intptr/uintptr #9940

Closed

brson added the I-nominated label Feb 19, 2014

pnkfelix closed this as completed Feb 27, 2014

EvanJP mentioned this issue Jun 18, 2020

Style: checked integer arithmetic? unicode-org/icu4x#126

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use checked arithmetic by default #9469

Use checked arithmetic by default #9469

brson commented Sep 24, 2013

thestinger commented Sep 24, 2013

thestinger commented Sep 24, 2013

thestinger commented Sep 24, 2013

jfager commented Sep 24, 2013

thestinger commented Sep 24, 2013

pnkfelix commented Sep 26, 2013

catamorphism commented Sep 26, 2013

UtherII commented Oct 13, 2013

Thiez commented Oct 13, 2013

thestinger commented Oct 13, 2013

pnkfelix commented Oct 14, 2013

thestinger commented Oct 14, 2013

pnkfelix commented Oct 14, 2013

thestinger commented Oct 14, 2013

jfager commented Oct 15, 2013

thestinger commented Oct 15, 2013

tiffany352 commented Oct 19, 2013

thestinger commented Jan 12, 2014

thestinger commented Jan 12, 2014

brson commented Feb 19, 2014

hatahet commented Feb 19, 2014

pnkfelix commented Feb 27, 2014

R030t1 commented Aug 24, 2020

Use checked arithmetic by default #9469

Use checked arithmetic by default #9469

Comments

brson commented Sep 24, 2013

thestinger commented Sep 24, 2013

thestinger commented Sep 24, 2013

thestinger commented Sep 24, 2013

jfager commented Sep 24, 2013

thestinger commented Sep 24, 2013

pnkfelix commented Sep 26, 2013

catamorphism commented Sep 26, 2013

UtherII commented Oct 13, 2013

Thiez commented Oct 13, 2013

thestinger commented Oct 13, 2013

pnkfelix commented Oct 14, 2013

thestinger commented Oct 14, 2013

pnkfelix commented Oct 14, 2013

thestinger commented Oct 14, 2013

jfager commented Oct 15, 2013

thestinger commented Oct 15, 2013

tiffany352 commented Oct 19, 2013

thestinger commented Jan 12, 2014

thestinger commented Jan 12, 2014

brson commented Feb 19, 2014

hatahet commented Feb 19, 2014

pnkfelix commented Feb 27, 2014

R030t1 commented Aug 24, 2020