Zero-length WASM functions cause strange behaviour in SymbolMap #315

Closed
t-veor opened this issue Jan 27, 2021 · 3 comments

@t-veor

t-veor commented Jan 27, 2021

Sometimes rustc can produce a zero-length WASM function. When compiling something like this:

//! ```cargo
//! [package]
//! name = "zero-size-test"
//! version = "0.1.0"
//! edition = "2018"
//!
//! [lib]
//! crate-type = ["cdylib"]
//!
//! [profile.release]
//! debug = 2
//! opt-level = "s"
//! ```

#![no_std]

use core::panic::PanicInfo;

#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
    loop {}
}

#[no_mangle]
pub extern "C" fn main() {
    panic!();
}

with cargo build --target=wasm32-wasi --release, one of the functions it generates is this (output from wasm-objdump):

...
0000af func[2] <_ZN4core3ptr13drop_in_place17h0004e302d1c67d0eE>:
 0000b0: 0b                         | end
...

(I've no idea why this function doesn't get pruned by dead code elimination.)

This function seems to break WasmObject::symbol_map, which returns this output for this binary:

SymbolMap {
    symbols: [
        Symbol {
            name: "_ZN4core3ptr13drop_in_place17h0004e302d1c67d0eE",
            address: 0x0,
            size: 0xb5,
        },
        Symbol {
            name: "rust_begin_unwind",
            address: 0x90,
            size: 0x8,
        },
        Symbol {
            name: "main",
            address: 0x98,
            size: 0x1d,
        },
        Symbol {
            name: "_ZN4core9panicking5panic17h97b5c3a1a3625519E",
            address: 0xb5,
            size: 0x55,
        },
        Symbol {
            name: "_ZN4core9panicking9panic_fmt17hcdbc22275273f460E",
            address: 0x10a,
            size: 0x41,
        },
        Symbol {
            name: "_ZN36_$LT$T$u20$as$u20$core..any..Any$GT$7type_id17hf7e256a7acecf50fE",
            address: 0x14b,
            size: 0x0,
        },
    ],
}

As you can see, the symbol map thinks that _ZN4core3ptr13drop_in_place17h0004e302d1c67d0eE ranges from 0x0 to 0xb5, which completely covers the ranges of both rust_begin_unwind (0x90..0x98) and main (0x98..0xb5). This actually messes up name lookups later: DebugSession::functions will report that rust_begin_unwind is called _ZN4core3ptr13drop_in_place17h0004e302d1c67d0eE and that _ZN4core3ptr13drop_in_place17h0004e302d1c67d0eE is called main, because they're both fully included in the wonky range that symbol_map reports.
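To make the overlap concrete, here is a minimal sketch that checks the reported ranges for intersections. The `Sym` struct and both helper functions are hypothetical stand-ins written for this illustration, not symbolic's actual types, and the mangled names are shortened for readability.

```rust
/// Simplified stand-in for a symbol map entry (NOT symbolic's actual `Symbol` type).
#[derive(Debug, Clone, PartialEq)]
pub struct Sym {
    pub name: &'static str,
    pub address: u64,
    pub size: u64,
}

/// Returns every pair of names whose half-open ranges
/// `[address, address + size)` intersect.
pub fn overlapping_pairs(symbols: &[Sym]) -> Vec<(&'static str, &'static str)> {
    let mut pairs = Vec::new();
    for (i, a) in symbols.iter().enumerate() {
        for b in &symbols[i + 1..] {
            let a_end = a.address + a.size;
            let b_end = b.address + b.size;
            // Two non-empty half-open ranges intersect when each one
            // starts before the other one ends.
            if a.size > 0 && b.size > 0 && a.address < b_end && b.address < a_end {
                pairs.push((a.name, b.name));
            }
        }
    }
    pairs
}

/// The first four entries of the SymbolMap output above, with shortened names.
pub fn reported_symbols() -> Vec<Sym> {
    vec![
        Sym { name: "drop_in_place", address: 0x0, size: 0xb5 },
        Sym { name: "rust_begin_unwind", address: 0x90, size: 0x8 },
        Sym { name: "main", address: 0x98, size: 0x1d },
        Sym { name: "panic", address: 0xb5, size: 0x55 },
    ]
}
```

Calling `overlapping_pairs(&reported_symbols())` flags drop_in_place against both rust_begin_unwind and main, while the other neighbouring entries only touch at their boundaries.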

The culprit appears to be this function that WasmSymbolIterator uses (here https://github.com/getsentry/symbolic/blob/master/symbolic-debuginfo/src/wasm.rs#L315):

fn get_addr_of_function(func: &walrus::Function) -> u64 {
    if let walrus::FunctionKind::Local(ref loc) = func.kind {
        let entry_block = loc.entry_block();
        let seq = loc.block(entry_block);
        seq.instrs.get(0).map_or(0, |x| x.1.data() as u64)
    } else {
        0
    }
}

Note that this function returns 0 if there are no instructions in the function's entry block. Walrus appears not to include the end pseudo-instruction that terminates every function, so a function whose body consists only of end has no instructions at all and gets address 0. As a result we get a weird range for the zero-size function, one that fully includes every other function up to where it actually appears in the code section.
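One possible direction, sketched under assumptions: return "unknown" instead of 0 when there is no first instruction, so the caller can skip the symbol or fall back to another source. `EntryBlock` below is a hypothetical stand-in for the entry block (just a list of instruction offsets); it is not walrus's API, and both function names are made up for this sketch.

```rust
/// Hypothetical stand-in for a function's entry block: the byte offset of
/// each instruction, in order. This is NOT walrus's API.
pub struct EntryBlock {
    pub instr_offsets: Vec<u64>,
}

/// The behaviour described above: an empty block collapses to address 0,
/// which is indistinguishable from a real address of 0.
pub fn addr_or_zero(block: &EntryBlock) -> u64 {
    block.instr_offsets.first().copied().unwrap_or(0)
}

/// Alternative sketch: report `None` when the address is unknown, leaving
/// the decision (skip the symbol, use another source) to the caller.
pub fn addr_if_known(block: &EntryBlock) -> Option<u64> {
    block.instr_offsets.first().copied()
}
```

With this shape, a zero-instruction function can no longer be mistaken for one that starts at the beginning of the file. It still leaves open where the correct address should come from.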

As an aside, this function seems a little wonky to me anyway for getting the address of a function - it gets the address of the first instruction in the function, but this skips over the function declaration and local declarations. For example, compare the ranges of _ZN4core9panicking5panic17h97b5c3a1a3625519E and _ZN4core9panicking9panic_fmt17hcdbc22275273f460E with the output from wasm-objdump here:

...
0000b2 func[3] <_ZN4core9panicking5panic17h97b5c3a1a3625519E>:
 0000b3: 01 7f                      | local[0] type=i32
 0000b5: 23 80 80 80 80 00          | global.get 0
 ...
 000105: 0b                         | end
000107 func[4] <_ZN4core9panicking9panic_fmt17hcdbc22275273f460E>:
 000108: 01 7f                      | local[0] type=i32
 00010a: 23 80 80 80 80 00          | global.get 0
 ...
 000148: 0b                         | end
00014a func[5] <_ZN36_$LT$T$u20$as$u20$core..any..Any$GT$7type_id17hf7e256a7acecf50fE>:
 00014b: 42 d4 c9 d5 d6 d6 b3 d4 be | i64.const 5511651255515440340
 ...

_ZN4core9panicking5panic17h97b5c3a1a3625519E is actually 0xb2..0x107 but the symbol map reports it as 0xb5..0x10a. That overlaps a little with the next function, _ZN4core9panicking9panic_fmt17hcdbc22275273f460E, which really starts at 0x107: the reported range only ends at 0x10a because that is where get_addr_of_function thinks _ZN4core9panicking9panic_fmt17hcdbc22275273f460E begins.
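For comparison, here is a rough sketch of deriving function ranges directly from the binary layout instead of from instruction offsets. In the WebAssembly binary format, the code section holds a LEB128 count followed by one size-prefixed body per function, and each body contains the local declarations and then the instructions, including the final end. The function names below are hypothetical, and the input is assumed to be the code section's contents only (the bytes after the section id and section size), so the resulting offsets are relative to that slice and not to the file.

```rust
use std::ops::Range;

/// Decodes an unsigned LEB128 `u32` starting at `pos`.
/// Returns the value and the position just past it. Minimal validation only.
fn read_u32_leb(bytes: &[u8], mut pos: usize) -> Option<(u32, usize)> {
    let mut result: u32 = 0;
    let mut shift = 0;
    loop {
        let byte = *bytes.get(pos)?;
        pos += 1;
        if shift >= 32 {
            return None; // more than five bytes: not a valid u32
        }
        result |= u32::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some((result, pos));
        }
        shift += 7;
    }
}

/// Returns the byte range of every function body in `contents`, relative to
/// the start of `contents`. Each range starts right after the body's size
/// prefix and covers local declarations, instructions, and the final `end`.
pub fn function_body_ranges(contents: &[u8]) -> Option<Vec<Range<usize>>> {
    let (count, mut pos) = read_u32_leb(contents, 0)?;
    let mut ranges = Vec::new();
    for _ in 0..count {
        let (size, body_start) = read_u32_leb(contents, pos)?;
        let body_end = body_start.checked_add(size as usize)?;
        if body_end > contents.len() {
            return None; // truncated section
        }
        ranges.push(body_start..body_end);
        pos = body_end;
    }
    Some(ranges)
}
```

With ranges computed this way, a function whose body is only `00 0b` (no locals, then end) still gets a non-empty, non-overlapping two-byte range, and the local declarations are counted as part of the function they belong to.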

@mitsuhiko
Member

Two things. First: I think part of this might explain what I filed here: rust-lang/rust#79410

Second: using walrus here to get the address is indeed questionable. I already wanted to change it to wasmparser but haven't finished that yet. Walrus is kinda the wrong tool for this job.

@loewenheim
Contributor

Can you reproduce this with a current version of symbolic?

@ashwoods
Contributor

ashwoods commented Apr 3, 2023

Closing due to inactivity.

ashwoods closed this as completed Apr 3, 2023