Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Match on repr(C) enum returned from C library with unknown value leads to UB #36927

Closed
tcosprojects opened this issue Oct 3, 2016 · 20 comments
Closed
Labels
C-feature-request Category: A feature request, i.e: not implemented / a PR. T-lang Relevant to the language team, which will review and decide on the PR/issue.

Comments

@tcosprojects
Copy link

tcosprojects commented Oct 3, 2016

Summary

It is not uncommon for C libraries to add more values to an enum in their API while considering it a non-breaking change. However, with a rust library linked to the C library it causes undefined behavior when the change is made if the rust enum definition is not updated.

Examples

I've reduced the issue down to this sample code where get_bad_kind() represents a call into a C library that returns a new enum value unknown to rust which leads to a crash at runtime (signal: 11, SIGSEGV: invalid memory reference or stack overflow): https://play.rust-lang.org/?gist=f066d8e489a6e220866958065891a812&version=stable&backtrace=0

The C version of this code does not result in this issue, instead it hits the default case and returns -1: https://gist.github.com/tcosprojects/49d0c809d40b8f008e25dacf49c508bf

Tested on

rustc 1.12.0 (3191fba 2016-09-23)
binary: rustc
commit-hash: 3191fba
commit-date: 2016-09-23
host: x86_64-unknown-linux-gnu
release: 1.12.0

and

rustc 1.12.0 (3191fba 2016-09-23)
binary: rustc
commit-hash: 3191fba
commit-date: 2016-09-23
host: i686-pc-windows-msvc
release: 1.12.0

Discussion

Discussing the issue on the #rust channel led to the following:

A binding library would need to match the rust definition of the enum with the C definition exactly at all times when linked to a C library to prevent either this undefined behavior or passing unsupported flags to older versions of the C library. This could be done in the cargo build script that checks the C library version and enables feature flags for these enum values. It would still not not guarantee safety though when using dynamically linked libraries. Or it could be resolved by not using repr(C) enums and instead translating them at runtime with a function that handles invalid values. If the latter should be done, it would be helpful to have it mentioned in the docs somewhere. It also makes me uncertain what the purpose of repr(C) enum is, if not to be compatible with and used with C FFI calls.

In any case, as a C programmer this bug was surprising and took some significant debugging. It required going into the assembly output of the match statement to see it jump based on the constant of the highest value from the rust enum definition and then corrupt the stack to pin down the cause of the crash.

A warning about this compatibility issue regarding repr(C) enum in the FFI docs would be helpful. I suspect it is not uncommon for rust bindings libraries to use repr(C) enum this way while expecting it to work.

If the match expression could jump to unreachable!() in debug builds when encountering an unknown value on a repr(C) enum type it would help catch this issue without so much debugging.

Questions

  1. In the discussion on #rust there was some disagreement over whether this is undefined behavior for C. Is it?
  2. In the discussion there was also disagreement over whether this should be considered a breaking ABI change by the C library. In practice it does not seem uncommon for C libraries to do this while considering it non-breaking. Is it?
  3. Should this be a concern for rustc? Or handled elsewhere?
  4. Is it possible for rust to match the behavior of the C program's switch when an enum is repr(C)?
@nagisa
Copy link
Member

nagisa commented Oct 3, 2016

match in Rust has exhaustive semantics and it is very prominently documented pretty much everywhere where match is mentioned. repr(C) only changes the representation of the enum (i.e. how it appears in memory) to match C representation, but no representational change affects (or should affect) the behaviour of rust expressions.

TL;DR: match is match, not switch by design, no matter what representation the enum has.

In the discussion there was also disagreement over whether this should be considered a breaking ABI change by the C library. In practice it does not seem uncommon for C libraries to do this while considering it non-breaking. Is it?

It is not a ABI breaking change, as adding a variant to a C enum (as long as the size of the actual integer which represents the enum in memory does not change) has no influence on the registers/memory locations where this enum gets placed when returned or passed as an argument.

It is, however, an API-breaking change.

Should this be a concern for rustc? Or handled elsewhere?

Rustc (its match) is behaving as intended IMO.

Is it possible for rust to match the behavior of the C program's switch when an enum is repr(C)?

Matching on enum as i64 gets you switch-like semantics. Though, it is by no means convenient.

@Aatch
Copy link
Contributor

Aatch commented Oct 3, 2016

Ok, so some research suggests that 1) C and C++ have similar, but slightly different rules about the valid values of an enumeration type. However, it seems that neither are limited to the values defined by the declaration.

As far as I can tell, an enumeration type in C can take on any value of the underlying type. This means that adding an enum constant may-or-may-not be a breaking ABI change depending on the compiler and target (because of course it is). Fortunately, most platforms will default to i32 unless specifically fixed to another type.

With that in mind, we should probably do two things:

  1. Not supply range metadata to LLVM for repr(C) enums, that eliminates one source of UB and allows for manual checking of the values if necessary by casting to an integer type.
  2. Generate a default case if there isn't one already in the match when debug_assertions is enabled. I think that keeping exhaustiveness is valuable. It's also easier.

@retep998
Copy link
Member

retep998 commented Oct 3, 2016

The fact that Rust enums cause UB when they hold a value that isn't a valid variant is the primary reason I don't use Rust enums in winapi and instead use simple integer constants (The other reason being that integer constants don't require any trait impls or methods and significantly reduce compile times). C/C++ will often even use enums as bitflags which definitely don't align with Rust enums.

In the discussion on #rust there was some disagreement over whether this is undefined behavior for C. Is it?

Windows API uses enums as bitflags in both C and C++ which indicates that it is not undefined behavior in MSVC C/C++.

@arielb1
Copy link
Contributor

arielb1 commented Oct 4, 2016

Sure. We purged all Rust enums from rustllvm unless we control the other side. Maybe prohibit repr(C) on enums (we would have to find another repr tag)?

@arielb1
Copy link
Contributor

arielb1 commented Oct 4, 2016

Matching on enum as i64 gets you switch-like semantics. Though, it is by no means convenient.

That's wrong. Creating a Rust enum with an out-of-range value is instant UB - we generate !range metadata, which would cause your switch to be optimized out.

@DemiMarie
Copy link
Contributor

DemiMarie commented Oct 30, 2016

I think that there should be a lint against using an enum type in FFI, except for the case where the enum cannot have been created or modified by foreign code. It is way too easy to mess this up.

A newtype of an integer is a much better solution.

@retep998
Copy link
Member

Perhaps we should allow extern functions which take an enum by value, but lint on any which return an enum or take a mutable pointer to an enum. And by perhaps, I mean we really need this because this is a giant footgun waiting to catch someone out.

@zackw
Copy link
Contributor

zackw commented Jan 6, 2017

/subscribe

If a C API takes an enum by value, there's a decent chance it's actually bitflags, i.e. you're expected to be able to do E_THIS|E_THAT|E_THEOTHER in the call.

@steveklabnik steveklabnik added T-lang Relevant to the language team, which will review and decide on the PR/issue. and removed A-lang labels Mar 24, 2017
@mjbshaw
Copy link
Contributor

mjbshaw commented Apr 26, 2017

The usage of C enums as bitflags can be handled in Rust with things like the bitflags crate (or something similar). So that shouldn't be a big deal.

But this issue of undefined behavior for "out of range" enums is a real concern for me. bindgen even generates Rust enums from C enums. Describing this current state of affairs as a "foot gun" is a perfect description. Rust really needs an ergonomic solution to using C enums in a Rust FFI setting. Otherwise Rust's safety promises are moot.

@tonychain
Copy link

Wouldn't the simplest option be allowing the user to specify the default value to be selected in the event the enum is out-of-range? Something like:

#[repr(C, default = Invalid)]
enum Thing {
    Invalid,
    ...

At least for the enums I happen to be working with in this FFI, they all support a good choice of a default option.

@jeandudey
Copy link

@mjbshaw

bindgen even generates Rust enums from C enums

Yes, but bindgen's generated bindings are intended to be created at compile time, so this should only be a problem for users that generate bindings from the command line.

Rust really needs an ergonomic solution to using C enums in a Rust FFI setting.

I would first approach this with a proc-macro that generates constants at compile time before adding yet-another-feature to the language.

Otherwise Rust's safety promises are moot.

I do not agree with this, I do not think that C enumerations are part of Rust's safety measures, what happens is that this problem is unsolvable (in my humble opinion) within the Rust language itself, rust-bindgen does great job trying to force users to generate bindings at compile time, thus avoiding UB with C enums. The only problem I see with rust-bindgen is the dependency of libclang (and the unstable API it has).

@cuviper
Copy link
Member

cuviper commented Apr 27, 2017

rust-bindgen does great job trying to force users to generate bindings at compile time, thus avoiding UB with C enums.

I don't think this provides the guarantees that you want! C enums can still have values outside of the explicitly stated variants. AIUI, basically anything within the same bit range is still legal.

@emilio
Copy link
Contributor

emilio commented Apr 27, 2017

Note, also, that bindgen allows customizing the enums as needed (generating plain integers, for example).

The possibilities would be much larger with rust-lang/rfcs#1758, though.

@retep998
Copy link
Member

retep998 commented Apr 27, 2017

I would first approach this with a proc-macro that generates constants at compile time before adding yet-another-feature to the language.

Or you could use a regular declarative macro to define enums using normal syntax and have them turned into constants like winapi does.

@mjbshaw
Copy link
Contributor

mjbshaw commented Apr 27, 2017

@jeandudey

Yes, but bindgen's generated bindings are intended to be created at compile time, so this should only be a problem for users that generate bindings from the command line.

How does generating bindings at compile time help? As I mentioned in rust-lang/rust-bindgen#667 if the Rust program dynamically links to the C library then the C library could easily return an "out of range" enum (e.g. if the user's computer has a slightly different version of the C library; C programmers generally don't consider adding a value to an enum to be an API or ABI breaking change). Whether Rust's bindings are generated at compile time or not is irrelevant here.

@Cldfire
Copy link
Contributor

Cldfire commented May 21, 2017

Before I take the time to switch over to using numerical constants instead of enums in my own project, may I ask if this is still a concern? I just tried running the playground example that was linked in the original post on every rust version from 1.12.0 up to 1.17.0:

  • 1.12.0:
thread 'main' has overflowed its stack
fatal runtime error: stack overflow
error: Process didn't exit successfully: `target/debug/playground_application` (signal: 6, SIGABRT: process abort signal)
  • 1.13.0-1.16.0:
thread 'main' has overflowed its stack
fatal runtime error: stack overflow

On 1.17.0, however, everything seems to be fine:

Finished dev [unoptimized + debuginfo] target(s) in 0.22 secs
     Running `target/debug/playground_application`
-1

It exhibits the same behavior when run on play.rust-lang.org; not one of stable, beta, or nightly segfaults.

EDIT: It also doesn't segfault even if you don't have a _ => ... case: https://play.rust-lang.org/?gist=4f40cea0cb5f402019d0398341153119&version=stable&backtrace=0

@cuviper
Copy link
Member

cuviper commented May 21, 2017

The worst part of UB is that it might just do the "reasonable" thing you'd want...

@tcosprojects
Copy link
Author

Have you tried it with optimization? In my case whether the undefined behavior manifested depended on surrounding code and if it was optimized.

@Cldfire
Copy link
Contributor

Cldfire commented May 24, 2017

If by 'optimization' you mean release mode, then yeah. Other than that, no.

Regardless, what cuviper said made sense; I see now that it would be foolhardy to rely on this behavior :)

FauxFaux added a commit to FauxFaux/zstd-rs that referenced this issue Jul 4, 2018
e.g.

```
cargo build --features bindgen
cp target/**/bindings.rs src/bindings.rs
```

This might not be a good idea:

rust-lang/rust-bindgen#758
rust-lang/rust#36927
@steveklabnik
Copy link
Member

So, it's been two years. re-reading this issue, it seems that this is working as intended, generally. As such, I'm going to give it a close. Anything that changes here seems big enough to require an RFC, and probably should be pursued through those channels. Thanks!

toyboot4e added a commit to toyboot4e/rokol that referenced this issue Nov 15, 2020
bindgen option of rustified_enum or newtype_enum are NOT enabled:
* newtype_enum: They can't be created from u32, so we can't make our fancy enums with rustic case.
* rusitified_enum: They're dangerous if we try to match them: rust-lang/rust#36927
* default: Enum variants are defined as constants. To get strictly typed, we need to wrap them into an `enum`. FFI (fields and functions) will be weakly typed
toyboot4e added a commit to toyboot4e/rokol that referenced this issue Nov 15, 2020
That's nice!!

bindgen option of rustified_enum or newtype_enum are NOT enabled:
* newtype_enum: They can't be created from u32, so we can't make our fancy enums with rustic case.
* rusitified_enum: They're dangerous if we try to match them: rust-lang/rust#36927
* default: Enum variants are defined as constants and FFI (fields and parameters) will be weakly typed. To get strictly typed, we need to wrap them into our `enum`(s).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-feature-request Category: A feature request, i.e: not implemented / a PR. T-lang Relevant to the language team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests