Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specifying linkage on externs silently removes indirection #31508

Open
jethrogb opened this issue Feb 9, 2016 · 3 comments
Open

Specifying linkage on externs silently removes indirection #31508

jethrogb opened this issue Feb 9, 2016 · 3 comments
Labels
A-codegen Area: Code generation A-linkage Area: linking into static, shared libraries and binaries C-bug Category: This is a bug. requires-nightly This issue requires a nightly compiler in some way. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@jethrogb
Copy link
Contributor

jethrogb commented Feb 9, 2016

When compiling the following code:

// externs-c.c
unsigned char myarr[10]={1,2,3,4,5,6,7,8,9,10};
unsigned char (*implicitvar)[10]=&myarr;
unsigned char (*explicitvar)[10]=&myarr;
// externs-rust.rs
#![feature(linkage)]

extern {
    static implicitvar: *const [u8;10];
    // Should have no effect, external linkage is the default in an extern block
    #[linkage="external"]
    static explicitvar: *const [u8;10];
}

fn as_option(p: *const [u8;10]) -> Option<&'static [u8;10]> {
    unsafe{std::mem::transmute(p)}
}

fn main() {
    println!("implicitvar = {:?}",as_option(implicitvar));
    println!("explicitvar = {:?}",as_option(explicitvar));
}

using

clang -c externs-c.c && rustc externs-rust.rs -C link-args=./externs.o

running ./externs will output something like the following:

implicitvar = Some([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
explicitvar = Some([168, 4, 122, 85, 85, 85, 0, 0, 0, 0])

Wat.

Taking a look at the IR:

; externs-c.ll
@myarr = global [10 x i8] c"\01\02\03\04\05\06\07\08\09\0A", align 1
@implicitvar = global [10 x i8]* @myarr, align 8
@explicitvar = global [10 x i8]* @myarr, align 8
; externs-rust.ll
@implicitvar = external global [10 x i8]*
@explicitvar = external global [10 x i8]
@_rust_extern_with_linkage_explicitvar = internal global [10 x i8]* @explicitvar

So, Rust removes a layer of indirection defining static explicitvar: [u8;10] and adding a new variable static _rust_extern_with_linkage_explicitvar: *const [u8;10]=&explicitvar. All mentions of explicitvar in Rust source code get replaced with _rust_extern_with_linkage_explicitvar. This results in the C version and this new Rust version not having the same type! To get “correct” behavior in the example above, you would need to define static explicitvar: *const *const [u8;10] instead.

This weird assymmetry between the types associated with symbols in Rust and in C is a source of great confusion and can easily lead to bugs. In the example above, we just read 2 bytes past some pointer by interpreting it as a 10-byte array.

This weird behavior was introduced in #12556 (see also #11978), the rationale being weak linkage and the fact that some pointers can't be null in the Rust typesystem. While true, I don't think that's sufficient rationale to add this layer of indirection. I think the layer of indirection should be removed completely. For weak linkage, a restriction can be added to allow only zeroable types.

@Aatch
Copy link
Contributor

Aatch commented Feb 9, 2016

This seems like very strange behaviour to me. I'm not sure if "some pointers can't be null" is a sensible reason for this behaviour at all.

@Aatch Aatch added A-linkage Area: linking into static, shared libraries and binaries A-codegen Area: Code generation labels Feb 9, 2016
@alexcrichton
Copy link
Member

The supported added in #12556 (the indirection here) was only really intended for weak symbols. Weak symbols are currently required to be pointers as otherwise this is not memory safe:

extern {
    #[linkage = "extern_weak"]
    static SYMBOL: extern fn();
}

because fn types aren't nullable. I believe it was intended that this linkage restriction was detected at compile-time in some typechecking pass, although I'm not sure if that was ever implemented.

@jethrogb
Copy link
Contributor Author

I guess this bug in particular is about the non-weak pointer case. However, I think the weak pointer case also warrants more discussion, for which I've opened a topic on internals.r-l.o.

Here's the code in question btw: https://github.com/rust-lang/rust/blob/3e9589c/src/librustc_trans/trans/foreign.rs#L139-L145 . It does not distinguish at all between different types of linkage.

@Mark-Simulacrum Mark-Simulacrum added the C-bug Category: This is a bug. label Jul 24, 2017
@pnkfelix pnkfelix added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Sep 20, 2018
@Enselic Enselic added the requires-nightly This issue requires a nightly compiler in some way. label Oct 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-codegen Area: Code generation A-linkage Area: linking into static, shared libraries and binaries C-bug Category: This is a bug. requires-nightly This issue requires a nightly compiler in some way. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

6 participants