Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: enum discriminant ought to be smaller #1647

Closed
nikomatsakis opened this issue Jan 24, 2012 · 20 comments
Closed

RFC: enum discriminant ought to be smaller #1647

nikomatsakis opened this issue Jan 24, 2012 · 20 comments
Labels
A-FFI Area: Foreign function interface (FFI) A-runtime Area: std's runtime and "pre-main" init for handling backtraces, unwinds, stack overflows

Comments

@nikomatsakis
Copy link
Contributor

We use int not i32. This is both wasteful and not compatible with C, where enums are always C integers (i32). We ought to fix this but #1645 makes it hard to do in general.

However, we could make this work in the special case of C-like enums, by changing trans::type_of_tag() to represent such tags as ints.

@kevina
Copy link
Contributor

kevina commented Jan 24, 2012

Note, C integers are not always i32, the C standard allows them to be different sizes depending on the possible values they can take, but making them i32 will cover most of the cases.

Ideally, the size of the discriminant should be allowed to change based on the size of the values it can take, just like it is done in C, so that they can take values outside of the range of i32.

@nikomatsakis
Copy link
Contributor Author

@kevina thanks for the clarification. Yes, we should adapt C-like enum representation to match precisely what C compilers would do (or as close as we can get). I guess the key point is that C-like enums can be represented differently from enums with data with relative ease. To what extent is the size of integer used for an enum defined in the C spec? Is it all impl. dependent?

@kevina
Copy link
Contributor

kevina commented Jan 24, 2012

@graydon
Copy link
Contributor

graydon commented Feb 16, 2012

This has more to do with us not being LP64 I imagine. Which is to say: our ints are bigger than those in C, on a lot of 64 bit platforms. But we don't have long (and I don't want it), so we wound up effectively going ILP64, which is rarer.

Not really sure what to do about it. Comments welcome. Being LP64 -- that is, demoting our int down to the same size whatever our platform's C int is -- would make the C interop story a bit, certainly. So it might be wise. See http://www.viva64.com/media/images/content/a/size_t_and_ptrdiff_t/image2.png for a rough picture of the popularity of each.

Something about keeping 32-bit integers alive-by-accident longer than necessary just rubs me the wrong way, I guess. It's 2012. I imagine I can buy 64-bit-word phones by now.

@ghost ghost assigned nikomatsakis Apr 12, 2012
@pcwalton
Copy link
Contributor

pcwalton commented Aug 7, 2012

This sounds like an RFC. I guess I slightly favor the current system; "enum is always an int" seems like a simpler rule and if the enum has to contain anything it'll probably be 64 bit aligned anyhow.

@nikomatsakis
Copy link
Contributor Author

I could see an argument for representing C-like enums as C integers (for compatibility).

@graydon
Copy link
Contributor

graydon commented Mar 14, 2013

see also #1704 and #1271

@jld
Copy link
Contributor

jld commented Mar 15, 2013

I've been working on making enums smaller. I think what @nikomatsakis and I have converged on is: enum discriminants should be as small as (efficiently) possible, with the option to specify a specific size or C enum compatibility, and a lint warning (like what we already have for Rust's int) if enums without a C-friendly hint are used for FFI.

@thestinger
Copy link
Contributor

There isn't actually a defined ABI for C enums, the size can vary per declaration. There's a -fshort-enums flag for gcc but it's default on some platforms like AAPCS-based ARM.

C libraries should really be using static const (or #define) in headers if they expect to support bindings and be compiler independent but that's often not the case....

@jld
Copy link
Contributor

jld commented May 15, 2013

I think it's true (or at least sufficiently true for our purposes) that, given a C enum and given an ABI, there exists an integer type to be used for it. It might vary per enum, and it might not be the same on an AMD64-SysV-based platform and an APCS-based platform, but this is already true for data representations in general. So, a "represent this like C" flag could work.

The other thing going on here is Rust's existing knowledge of C representations, which is currently in core::libc as typedefs rather than in the compiler. This makes things more interesting, because any enum annotation that would allow referencing them (“this enum should be represented as a C int”) needs to either be real syntax so it can contain a type expression, or else (e.g., if it's done with attributes) call into name resolution in a way that I don't think anything else does, or maybe language items could work. It also suggests that the knowledge of C enum layout should be in core::libc, and that probably needs at least 12 typedefs ({8, 16, 32, 64} × {fits in unsigned, fits in signed, fits in either}), and I don't know what else would need it.

The other other thing is that I don't know how much of this is actually needed, at least in the short term.

@thestinger
Copy link
Contributor

@jld: there isn't an enum ABI shared between compilers on the platform though - there's no way to represent one like C, you can just represent it like a specific C compiler with specific compiler flags.

#include <stdio.h>

enum foo { A, B };
enum bar { C = 256, D, E };
enum baz { F = 8796093022208, G };

int main() {
  printf("%zu\n", sizeof(enum foo));
  printf("%zu\n", sizeof(enum bar));
  printf("%zu\n", sizeof(enum baz));
  return 0;
}

clang/gcc on x86_64 without flags:

4
4
8

with -fshort-enums (default on some platforms)

1
2
8

If Rust represented them with the smallest size possible, it would be compatible with -fshort-enums everywhere (assuming the compiler supports it and implements it like gcc).

@jld
Copy link
Contributor

jld commented May 15, 2013

Right, but you can't (in general) link C code compiled with -fshort-enums and -fno-short-enums, because they'll disagree on sizes. GCC has other ABI-breaking flags like that, and the documentation warns about it. Rust already has to decide what C ABI it's interoperating with; how are enums different?

@pcwalton
Copy link
Contributor

I don't believe this is backwards incompatible, renominating.

@graydon
Copy link
Contributor

graydon commented May 30, 2013

just a bug, removing milestone/nomination.

@graydon
Copy link
Contributor

graydon commented May 30, 2013

we are not going to commit to a stable enum representation

@bstrie
Copy link
Contributor

bstrie commented May 30, 2013

In which case, can this be closed?

@jld
Copy link
Contributor

jld commented May 31, 2013

Adjusted the issue title to match the current plans, such as they are. Or should this be about opt-in C-compatible enums? Should a separate issue be opened for whichever one this issue isn't?

@jld
Copy link
Contributor

jld commented Jun 2, 2013

I've found some more context on the ARM enum ABI issue: http://gcc.gnu.org/ml/gcc/2012-02/msg00315.html

So there are two variants of the ABI. A quick test for arm-linux-androideabi and arm--netbsdelf shows that they appear to both use the long-enum variant of the ABI.

The i386 and MIPS variants of the SysV ABI are handled annoyingly, in the versions of the spec I found: enums are listed as 4 bytes always, and the long long type doesn't appear to exist. GCC for i386, at least, will use an 8-byte type for enums that need it.

bors added a commit that referenced this issue Aug 26, 2013
This is in preparation for making discriminants not always be int (#1647), but it also makes compiles for a 64-bit target not behave differently — with respect to how many bits of discriminants are preserved — depending on the build host's word size, which is a nice property to have.

We may want to standardize how to abbreviate "discriminant" in a followup change.
@huonw
Copy link
Member

huonw commented Nov 17, 2013

@jld did #9613 fix this?

@thestinger
Copy link
Contributor

Yeah, this was fixed. There are obviously still potential size optimizations for certain sets of variants but the general case is as good as it will get.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-FFI Area: Foreign function interface (FFI) A-runtime Area: std's runtime and "pre-main" init for handling backtraces, unwinds, stack overflows
Projects
None yet
Development

No branches or pull requests

8 participants