This repository has been archived by the owner on Oct 9, 2018. It is now read-only.

integer type style guidelines #24

Open
1fish2 opened this issue Jul 17, 2014 · 10 comments
Comments

@1fish2

1fish2 commented Jul 17, 2014

RFC PR #161 proposes some style guidelines, but the details depend on the acceptance and implementation of that RFC and/or RFC PR #146: Scoped attributes for checked arithmetic.

If RFC PR #161 is not accepted, I'd suggest these style guidelines:

  • Use the int and uint types only for array indexing and similar purposes, that is, when you need number ranges that scale up with memory capacity. Despite the familiar names, these are not the "default," "native," or fastest integer types. They might not be the same size as C int. They could be 16 bits in small embedded devices.
  • As a "default" integer type (see RFC PR #161), use i32.
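To make the size difference concrete, here is a minimal sketch in current Rust, where the pointer-sized types this thread calls int and uint are named isize and usize. The helper name `index_bits` is hypothetical and exists only for illustration.

```rust
use std::mem::size_of;

/// Width in bits of the pointer-sized unsigned type on the current target.
/// (`index_bits` is a hypothetical helper, not a standard library function.)
fn index_bits() -> usize {
    size_of::<usize>() * 8
}

fn main() {
    // i32 is 32 bits on every target.
    println!("i32:   {} bits", size_of::<i32>() * 8);
    // usize follows the target's pointer width, so this line varies by platform.
    println!("usize: {} bits", index_bits());
}
```

The first line printed is the same everywhere; the second depends on the target, which is the portability concern raised above.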

A style guideline on using unsigned types for numbers that should never be negative:

In particular, do not use unsigned types to say a number will never be negative. Instead, use assertions for this. ...

Some people, including some textbook authors, recommend using unsigned types to represent numbers that are never negative. This is intended as a form of self-documentation. However, in C, the advantages of such documentation are outweighed by the real bugs it can introduce.

That's because C lacks overflow detection. To quote Gábor Lehel:

This suggestion [signed integers with assertions] makes a lot of sense in a context where overflow/underflow silently wraps around. However, if something like RFC PR #146 were to be implemented, then it would once again make sense to use types which more accurately express the range of legal values (i.e., which are self-documenting), because compiler-added checks can be enabled to catch errors where the value would go out of range. Accurate types with compiler-added assertions beats inaccurate types with programmer-added assertions.

Depending on the RFC PR #161 choices, the types int and uint might be renamed to, say, index and uindex, or specified to always be at least 32 bits. In the latter case, it's less bad to use int and uint for arithmetic.

@lilyball
Contributor

Why i32 and not i64?

I also disagree with the claim that you should use signed integers. The stated reasons to do so from that C++ Style Guide do not apply to Rust. The first reason is that it's easy to write a bad decrementing loop, but in Rust we don't use that type of for loop; we use iterators. In the rare case where you need a decrementing integral iterator, range(0u, max).rev() works correctly with uint. The second reason was comparing signed and unsigned, but Rust doesn't have automatic integral promotion, so there are no issues with silent conversions.
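As a rough illustration of the decrementing-loop point, the sketch below uses current Rust range syntax, `(0..max).rev()`, in place of the 2014-era `range(0u, max).rev()`. The function name `descending` is hypothetical.

```rust
/// Collects max-1, max-2, ..., 0 without ever computing a value below zero.
/// (`descending` is a hypothetical helper for illustration.)
fn descending(max: usize) -> Vec<usize> {
    // The reversed range never decrements past 0, so there is no
    // `i >= 0` loop condition that would always be true for unsigned `i`.
    (0..max).rev().collect()
}

fn main() {
    for i in descending(3) {
        println!("{}", i);
    }
}
```

An empty range (`max == 0`) simply yields nothing, which is the case that trips up hand-written unsigned countdown loops in C.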

@1fish2
Author

1fish2 commented Jul 17, 2014

The experience with i32 as the default is quite good in Java. Programmers should still analyze their specific needs. A default of i32 means not prematurely optimizing to i16 or i8, which avoids lots of conversions; use those narrower types only where the tradeoff is worthwhile. Deciding on a default of i64 or BigInt is a fine alternative if people are OK with the space & time costs in the standard libraries.

Since Rust avoids some of the problems with unsigned integers, that reduces the risks with unsigned integers. Still, underflow can happen when subtracting unsigned integers. Typing the values unsigned makes underflow harder to detect. If subtraction quietly wraps around, will you catch that with an assertion?
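The question about quietly wrapping subtraction can be sketched as follows, assuming wraparound semantics are requested explicitly through `wrapping_sub`. The function names and the capacity/used scenario are hypothetical.

```rust
/// Unsigned version: on underflow the result wraps to a huge positive value.
/// (Hypothetical example function.)
fn remaining_unsigned(capacity: u32, used: u32) -> u32 {
    capacity.wrapping_sub(used)
}

/// Signed version: the same mistake shows up as a negative number.
/// (Hypothetical example function.)
fn remaining_signed(capacity: i32, used: i32) -> i32 {
    capacity - used
}

fn main() {
    // 3 - 5 wraps around to 2^32 - 2; no range assertion on a u32 can tell
    // this apart from a legitimately large value.
    println!("{}", remaining_unsigned(3, 5));
    // The signed result is -2, which `assert!(x >= 0)` would catch.
    println!("{}", remaining_signed(3, 5));
}
```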

@lilyball
Contributor

The only compelling argument I see against using int is the potential for int to be an i16 on some embedded 16-bit machine. However, I'm not convinced that's ever really a problem, because any such machine is likely to have restrictions preventing the use of most Rust libraries (similar to the restrictions encountered when writing a kernel in Rust). We could try defining int as "at least 32 bits", but then it's not technically correct to use as the intptr_t type on said 16-bit machine. But since this 16-bit issue seems largely theoretical, I don't see a practical issue with just using int as it exists today.

Underflow can happen when subtracting unsigned integers, yes, but it can also happen when subtracting (or adding!) signed integers too. If you're worried about overflow/underflow, we have a set of "checked" math operations in std::num. And of course the main reason why this is an issue in C is writing for loops, which as I already said isn't an issue in Rust due to the use of iterators instead of integer math. Can you give an example of Rust code that would be silently problematic with unsigned integers, but would not have similar issues with signed integers?
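For reference, a minimal sketch of checked operations. In current Rust they are inherent methods such as `checked_sub` and `checked_add` on the integer types rather than something imported from std::num; the wrapper function names here are hypothetical.

```rust
/// Unsigned subtraction that reports underflow instead of wrapping.
/// (Hypothetical wrapper around `u32::checked_sub`.)
fn checked_gap(a: u32, b: u32) -> Option<u32> {
    a.checked_sub(b)
}

/// Signed increment that reports overflow instead of wrapping.
/// (Hypothetical wrapper around `i32::checked_add`.)
fn checked_next(n: i32) -> Option<i32> {
    n.checked_add(1)
}

fn main() {
    // Unsigned underflow is reported as None.
    println!("{:?}", checked_gap(3, 5));
    // Signed arithmetic can overflow too, at the other end of the range.
    println!("{:?}", checked_next(i32::MAX));
    // In-range results come back wrapped in Some.
    println!("{:?}", checked_gap(5, 3));
}
```

Both signed and unsigned types need the same treatment at their respective limits, which is the point being made above.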

@1fish2
Author

1fish2 commented Jul 19, 2014

RFC PR #161 proposes a 32-bit minimum size for int. There'll be billions of embedded devices, some with 16-bit CPUs/MCUs, and it'd be nice to program them in Rust without forgoing all the libraries.

If the style guide recommends checked math operations for cases like unsigned subtraction and programmers remember to do that, that would be a fine solution.

One advantage of signed integers is you can check an aggregate computation once. Also it's more obvious what happened when an array bounds check complains about a negative value rather than a huge value.
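One way to read "check an aggregate computation once" is sketched below, under the assumption that all intermediate values fit comfortably in i64. The function `apply_deltas` and its balance/deltas scenario are hypothetical.

```rust
use std::convert::TryFrom;

/// Applies signed deltas to a starting value and validates only the final
/// result. Assumes intermediate sums stay well inside the i64 range.
/// (`apply_deltas` is a hypothetical helper for illustration.)
fn apply_deltas(start: i64, deltas: &[i64]) -> Option<u64> {
    let total = start + deltas.iter().sum::<i64>();
    // A single check at the end: negative totals are rejected here.
    u64::try_from(total).ok()
}

fn main() {
    // 10 - 15 + 20 dips below zero midway but ends at a valid 15.
    println!("{:?}", apply_deltas(10, &[-15, 20]));
    // 10 - 15 ends negative and is rejected by the one final check.
    println!("{:?}", apply_deltas(10, &[-15]));
}
```

With unsigned arithmetic the first case would underflow at the intermediate step, so every subtraction would need its own check.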

@lilyball
Contributor

The fundamental problem with signed integers is that they add runtime failures to a whole slew of places where none would be needed with unsigned integers. Basically, for every use of an unsigned integer today, if you convert it to take a signed integer, you likely need to add an assert(intarg >= 0) and document that it can't be negative.
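The tradeoff described here might look like the following in current Rust. Both function names are hypothetical, and the string-repeat task is just a stand-in for any API taking a count.

```rust
/// Signed parameter: the non-negative requirement must be asserted at
/// runtime and documented. (Hypothetical example function.)
///
/// Panics if `count` is negative.
fn repeat_signed(s: &str, count: i32) -> String {
    assert!(count >= 0, "count must not be negative");
    s.repeat(count as usize)
}

/// Unsigned parameter: the type itself rules out negative counts, so no
/// assertion is needed. (Hypothetical example function.)
fn repeat_unsigned(s: &str, count: u32) -> String {
    s.repeat(count as usize)
}

fn main() {
    println!("{}", repeat_signed("ab", 2));
    println!("{}", repeat_unsigned("ab", 2));
    // repeat_signed("ab", -1) would panic at runtime;
    // repeat_unsigned("ab", -1) is rejected by the compiler.
}
```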

@1fish2
Author

1fish2 commented Jul 19, 2014

Converting a large, unsigned value to a too-narrow signed integer is an example of overflow/underflow that can happen during any arithmetic/conversion operation with integers that are too narrow. In theory we avoid that by picking wide enough integer types (or BigInt).
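A small sketch of that conversion hazard, contrasting the `as` cast with `TryFrom` as they exist in current Rust. The function names are hypothetical.

```rust
use std::convert::TryFrom;

/// Checked conversion: returns None when the value does not fit in i16.
/// (Hypothetical wrapper around `i16::try_from`.)
fn to_i16_checked(v: u32) -> Option<i16> {
    i16::try_from(v).ok()
}

/// `as` conversion: silently keeps only the low 16 bits.
/// (Hypothetical example function.)
fn to_i16_truncating(v: u32) -> i16 {
    v as i16
}

fn main() {
    // 40000 does not fit in i16 (max 32767), so the checked form reports it.
    println!("{:?}", to_i16_checked(40_000));
    // The cast reinterprets the low 16 bits as a negative number.
    println!("{}", to_i16_truncating(40_000));
}
```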

Better yet, we'd be able to turn on checked math to detect these cases without every programmer manually adding comprehensive assertions. Then, unsigned types provide the documentation and the assertions, so when a program decrements too far or subtracts offsets in the wrong order, they'll find out when it's easy to debug and before collateral damage. I vote for this.

Lacking checked math, we'd better pick integer types that can hold the range of intermediate values. When unexpected negative values are more likely than unexpectedly huge values, a sign bit is more useful than another magnitude bit. That's my thought.

@UtherII

UtherII commented Aug 5, 2014

The only compelling argument I see against using int is the potential for int to be an i16 on some embedded 16-bit machine. However, I'm not convinced that's ever really a problem, because any such machine is likely to have restrictions preventing the use of most Rust libraries

I see a lot of compelling arguments.

  • The behavior is inconsistent when porting to another architecture:
    • working code could start to overflow.
    • performance can decrease abnormally (it happened to me when I switched from 32-bit to 64-bit)
  • The int name is misleading since it seems like the natural choice for integer calculation, while it is not the optimal type for that (i32/u32 are noticeably faster on 64-bit machines). It is sized for indexing.

@pnkfelix
Member

pnkfelix commented Sep 1, 2014

What we decide here is going to have an impact on the interface to some of the libraries, such as std::container::Bitv.

Nominating, suggest 1.0 P-backcompat-libs.

(Oops, thought this was in the rust repository, but this is the rust-guidelines repo. I'll go nominate rust-lang/rust/issues/15526 instead.)

@l0kod

l0kod commented Sep 1, 2014

cc rust-lang/rfcs#161

@tbu-

tbu- commented Nov 16, 2014

The only compelling argument I see against using int [...]

What is an argument for using int as the default? I don't think there is any besides int being the default in other languages. If it were called intptr, nobody would suggest using it as the default.


6 participants