Use checked arithmetic by default #9469
You can try using it via the checked arithmetic traits I implemented, and there's a huge overhead in terms of both performance and code size. LLVM is almost never capable of removing the checks, just as it rarely manages to remove array bounds checks. It makes whole passes like loop and scalar vectorization useless. Failing on overflow doesn't make the incorrect code any more correct, since a failure is still going to be a bug in 99% of programs. I feel strongly that the correct solution is to make integer literals not default to a signed fixed-size integer, and to make generic literals work for big integers. The most common integer type for application logic could be the one proposed in #8635. |
Here's an optimized IR comparison, to give an idea of the code generation when it can't eliminate the check. In these examples it can't eliminate the check because the operands are inputs to the function (it would have to inline deeply enough to see the values, or the range of values, to have any hope of eliminating it):

fn foo(x: u32, y: u32) -> u32 {
    x + y
}

; Function Attrs: nounwind readnone uwtable
define i32 @_ZN3foo17h73f0a05a6ca431af4v0.0E({ i64, %tydesc*, i8*, i8*, i8 }* nocapture readnone, i32, i32) #2 {
"function top level":
%3 = add i32 %2, %1
ret i32 %3
}

Checked operation, calling fail on overflow:

fn bar(x: u32, y: u32) -> u32 {
    x.checked_add(&y).unwrap()
}

; Function Attrs: uwtable
define i32 @_ZN3bar17h73f0a05a6ca431av4v0.0E({ i64, %tydesc*, i8*, i8*, i8 }* nocapture readnone, i32, i32) #3 {
"function top level":
%3 = alloca %str_slice, align 8
%4 = alloca %str_slice, align 8
%5 = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 %1, i32 %2) #1
%6 = extractvalue { i32, i1 } %5, 1
br i1 %6, label %match_else.i, label %_ZN6option6Option6unwrap21h1283b974be2d27c1lBau4v0.0E.exit
match_else.i: ; preds = %"function top level"
%7 = bitcast %str_slice* %3 to i8*
call void @llvm.lifetime.start(i64 16, i8* %7)
%8 = bitcast %str_slice* %4 to i8*
call void @llvm.lifetime.start(i64 16, i8* %8)
%9 = getelementptr inbounds %str_slice* %3, i64 0, i32 0
store i8* getelementptr inbounds ([44 x i8]* @str1262, i64 0, i64 0), i8** %9, align 8
%10 = getelementptr inbounds %str_slice* %3, i64 0, i32 1
store i64 43, i64* %10, align 8
%11 = getelementptr inbounds %str_slice* %4, i64 0, i32 0
store i8* getelementptr inbounds ([46 x i8]* @str1263, i64 0, i64 0), i8** %11, align 8
%12 = getelementptr inbounds %str_slice* %4, i64 0, i32 1
store i64 45, i64* %12, align 8
call void @"_ZN3sys28FailWithCause$__extensions__9fail_with20hdb4c44d01ce41167qaF4v0.8E"({}* sret undef, { i64, %tydesc*, i8*, i8*, i8 }* undef, %str_slice* %3, %str_slice* %4, i64 323)
unreachable
_ZN6option6Option6unwrap21h1283b974be2d27c1lBau4v0.0E.exit: ; preds = %"function top level"
%13 = extractvalue { i32, i1 } %5, 0
%14 = bitcast %str_slice* %3 to i8*
call void @llvm.lifetime.start(i64 16, i8* %14)
%15 = bitcast %str_slice* %4 to i8*
call void @llvm.lifetime.start(i64 16, i8* %15)
call void @llvm.lifetime.end(i64 16, i8* %14)
call void @llvm.lifetime.end(i64 16, i8* %15)
ret i32 %13
} |
Note that the function is also no longer marked nounwind readnone. |
I agree that if safety is the motivation, using bigints by default makes more sense. Failing on overflow doesn't seem that much better than wrapping in day-to-day development. But it seems like using something like GMP is a better option than a Rust implementation - why fight that battle again? I'd be disappointed if unchecked arithmetic required anything more than just specifying a fixed-width int type. I like Rust because it balances safety and control. Making unchecked arithmetic hard feels like it goes too far over to the safety side of the fence. FWIW, this is the approach Haskell takes. |
I don't think there should be a default at all, just generic literals and the choice to use what's appropriate by not falling back to int. |
cc me |
Accepted for well-defined |
Regarding how to turn checked arithmetic off: would it be possible to use an unsafe block, but allow making it more specific about which unsafe operations it permits? For example, something like:
|
Overflow is not unsafe when you explicitly want it to occur, so I don't think it makes sense to wrap it in an 'unsafe' block. Perhaps some new types that implement the arithmetic traits but perform checked arithmetic? Or a |
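To make the "new types that implement the arithmetic traits but perform checked arithmetic" idea concrete, here is a minimal sketch in present-day Rust syntax; the Checked name and the panic-on-overflow policy are assumptions for illustration, not something proposed in the thread.

use std::ops::Add;

#[derive(Copy, Clone, Debug, PartialEq)]
struct Checked(u32);

impl Add for Checked {
    type Output = Checked;
    fn add(self, rhs: Checked) -> Checked {
        // checked_add returns None on overflow; turn that into a loud failure.
        Checked(self.0.checked_add(rhs.0).expect("u32 addition overflowed"))
    }
}

fn main() {
    let a = Checked(u32::MAX - 1) + Checked(1); // fine
    println!("{:?}", a);
    // Checked(u32::MAX) + Checked(1) would panic instead of silently wrapping.
}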
I think a type-based solution is best. Signed/unsigned integer types using the checked operations and expanding to a big integer on overflow would make sense. I don't think there's much benefit to a type throwing a failure on an overflow, because that's still going to be a bug in the application. Something to note is that the checked operations aren't implemented in hardware on every architecture. We need to start using compiler-rt to provide cross-platform support. A notable example is the lack of 64-bit checked multiplication on i686. |
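As a rough sketch of what the software fallback for a missing hardware primitive can look like, here is 64-bit checked multiplication done by widening to 128 bits (present-day Rust; this is an assumption about the approach, roughly what a compiler-rt helper does, not the actual runtime code being discussed):

fn checked_mul_u64(a: u64, b: u64) -> Option<u64> {
    // The product of two 64-bit values always fits in 128 bits, so this cannot overflow.
    let wide = (a as u128) * (b as u128);
    if wide > u64::MAX as u128 {
        None // the true product does not fit in 64 bits
    } else {
        Some(wide as u64)
    }
}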
I am surprised by the statements that a fail-on-overflow is not much different from the semantic errors you get from modular-arithmetic wrapping; I think a prompt failure gives a quite different developer experience. (But do not interpret the above as a vote against bigints; I do agree they make the most sense in a number of contexts, though I'm not yet sure how best to incorporate them.) @thestinger what are your thoughts about how to handle underflow for an unsigned integer type? I assume you would want at least that to fail. |
@pnkfelix: Underflow would still expand to a big integer. I was thinking it might make sense performance-wise to have coverage over different ranges before expanding to a big integer, but it's probably enough to just have a single type. I don't really see much point in having an unsigned big integer type like we have in extra. |
@thestinger Ah, I see. I had thought you had meant that an unsigned fixed-width uint would overflow to an unsigned big uint. But now I interpret your suggestion as saying that uint should over- and underflow to a signed big integer. |
@pnkfelix: I don't want to change the semantics of the built-in integers from what they are now, but rather introduce a new type expanding to a big integer when necessary. There isn't a need to have an unsigned version, it was a silly thought. |
@pnkfelix It is a different developer experience (fail-fast vs. fail-mysteriously-at-some-point) and if those were the only two options with no other tradeoffs fail-fast would definitely be the preferable one. But they're both still runtime failures. For either approach, your code has to be aware of and prevent overflow conditions in order to not blow up. @thestinger's suggestion seems to make way more sense. You write code blissfully ignoring overflow issues, and overflow issues never bite you. And if you want overflow or to not pay the cost of overflow checking, then you can just go back to using standard fixed-width integer types and accept the tradeoff. |
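A rough sketch of the "expand to a big integer on overflow" type being discussed, in present-day Rust, with i128 standing in for a true arbitrary-precision integer; the AutoInt name and all details are illustrative assumptions rather than a concrete proposal from the thread:

#[derive(Debug, Clone, Copy)]
enum AutoInt {
    Small(i64),
    Big(i128), // placeholder: a real version would hold an arbitrary-precision integer
}

impl AutoInt {
    fn widen(self) -> i128 {
        match self {
            AutoInt::Small(x) => x as i128,
            AutoInt::Big(x) => x,
        }
    }

    fn add(self, other: AutoInt) -> AutoInt {
        match (self, other) {
            // Fast path: stay in the fixed-width representation while it fits.
            (AutoInt::Small(a), AutoInt::Small(b)) => match a.checked_add(b) {
                Some(sum) => AutoInt::Small(sum),
                // On overflow, silently promote instead of wrapping or failing.
                None => AutoInt::Big(a as i128 + b as i128),
            },
            // Slow path: at least one side is already big.
            (a, b) => AutoInt::Big(a.widen() + b.widen()),
        }
    }
}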
It'd be nice if the type system could compute the values that are valid for any given function without causing overflow and limit it to those, but that would require dependent types. As an example, you could have a function which adds two numbers, and the type system would figure out that A+B can never exceed UINT_MAX, which would be expressed through the type system and enforced at each invocation. If you took one number from user input and added it to a constant foo, the integer read from user input could only be as large as UINT_MAX-foo without returning None. |
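Rust's type system can't express that bound statically, but the same arithmetic shows up at runtime with checked_add; a small sketch in present-day Rust, where the constant FOO and the function are invented purely for illustration:

const FOO: u32 = 1_000;

// Returns Some only when x <= u32::MAX - FOO, i.e. exactly the input range a
// dependently-typed signature would have enforced at compile time.
fn add_foo(x: u32) -> Option<u32> {
    x.checked_add(FOO)
}

fn main() {
    assert_eq!(add_foo(u32::MAX - FOO), Some(u32::MAX));
    assert_eq!(add_foo(u32::MAX - FOO + 1), None);
}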
Simple example:

extern mod extra;

use std::unstable::intrinsics::{abort, u32_mul_with_overflow};
use extra::test::BenchHarness;

#[inline(never)]
fn control(xs: &mut [u32]) {
    for x in xs.mut_iter() {
        *x *= 5;
    }
}

#[inline(never)]
fn check(xs: &mut [u32]) {
    for x in xs.mut_iter() {
        unsafe {
            let (y, o) = u32_mul_with_overflow(*x, 5);
            if o {
                abort()
            }
            *x = y;
        }
    }
}

#[inline(never)]
fn check_libstd(xs: &mut [u32]) {
    for x in xs.mut_iter() {
        *x = x.checked_mul(&5).unwrap();
    }
}

#[bench]
fn bench_control(b: &mut BenchHarness) {
    b.iter(|| {
        let mut xs = [0, ..1000];
        control(xs)
    });
}

#[bench]
fn bench_check(b: &mut BenchHarness) {
    b.iter(|| {
        let mut xs = [0, ..1000];
        check(xs)
    });
}

#[bench]
fn bench_check_libstd(b: &mut BenchHarness) {
    b.iter(|| {
        let mut xs = [0, ..1000];
        check_libstd(xs)
    });
}

[Benchmark results for --opt-level=2 and --opt-level=3 omitted.]
Ouch. The slowdown multiplier gets even larger when you add more operations to the loop. And since it increases code size a lot, it will bloat the instruction cache too. |
We will be slower than Java and Haskell on numeric loops. Benchmarks of Rust will look absolutely terrible unless you're already an expert with the language and know the default integers are slow. Ignoring scalar/loop optimizations on integers in a language claiming to be aimed at replacing C++ just isn't going to work. |
Nominating for removal from milestone. |
Would it be feasible to have some |
There is a lot of open research to do on this topic, so we cannot make it a high priority for 1.0, which means we cannot make it something we turn on by default. If we want a separate opt-in trait for checked arithmetic, or a compiler flag, etc., then those are things to consider, but they should be in a separate bug. Closing as wontfix. |
A method to handle arithmetic correctness as a language feature would be extremely valuable. A program aborting due to an arithmetic error is usually better than the program silently continuing with data corruption. It may be convenient to write this off as not worth it by virtue of being intractable at Rust's level, but pointing out that it is a massive headache to have incomplete arithmetic support in processors may finally get somewhere. (Yes, I'm aware of the overflow flags present on basically every architecture, but that is different from generating an interrupt.) "We Need Hardware Traps for Integer Overflows" https://blog.regehr.org/archives/1154 |
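For comparison, this is roughly what the flag-checking pattern looks like when spelled out in source, in present-day Rust (a sketch only): every operation needs an explicit branch, which is exactly the cost a hardware trap would remove from the hot path.

fn add_or_abort(a: u32, b: u32) -> u32 {
    // overflowing_add compiles down to an add plus a test of the carry/overflow flag.
    let (sum, overflowed) = a.overflowing_add(b);
    if overflowed {
        // With a hardware trap, this branch would not need to exist in the instruction stream.
        std::process::abort();
    }
    sum
}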
There's a lot of concern that having +, -, and * overflow by default is incorrect - it's a source of many security vulnerabilities. Let's try changing the default to being checked and measure the impact on performance and code size.

There are some open questions:

- What to do on overflow. The obvious choice is to fail!. It could also raise a condition, but having codegen raise conditions is a big step to take, and we're not sure we like conditions.
- How to turn checking off: unchecked blocks, explicit methods, or additional types.

Nominating well defined.