Segfault in _Backtrace_Unwind #47551

Closed
bossmc opened this issue Jan 18, 2018 · 25 comments · Fixed by #85395
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example I-crash Issue: The compiler crashes (SIGSEGV, SIGABRT, etc). Use I-ICE instead when the compiler panics. O-musl Target: The musl libc T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@bossmc
Contributor

bossmc commented Jan 18, 2018

Upstream issue: https://bugs.llvm.org/show_bug.cgi?id=36005

The version of libunwind used in the x86_64-unknown-linux-musl (and possibly the i686-...-musl one too?) standard library has a bug where it will sometimes walk off the end of the segment containing the .eh_frame section and segfault.

I've struggled to reproduce this except in some proprietary code, but the backtrace looks like:

Program terminated with signal 11, Segmentation fault.
#0  0x0000000000fa733b in libunwind::LocalAddressSpace::get32(unsigned long) ()
(gdb) bt
#0  0x0000000000fa733b in libunwind::LocalAddressSpace::get32(unsigned long) ()
#1  0x0000000000faa0a2 in libunwind::CFI_Parser<libunwind::LocalAddressSpace>::findFDE(libunwind::LocalAddressSpace&, unsigned long, unsigned long, unsigned int, unsigned long, libunwind::CFI_Parser<libunwind::LocalAddressSpace>::FDE_Info*, libunwind::CFI_Parser<libunwind::LocalAddressSpace>::CIE_Info*) ()
#2  0x0000000000fa983d in libunwind::UnwindCursor<libunwind::LocalAddressSpace, libunwind::Registers_x86_64>::getInfoFromDwarfSection(unsigned long, libunwind::UnwindInfoSections const&, unsigned int) ()
#3  0x0000000000fa923d in libunwind::UnwindCursor<libunwind::LocalAddressSpace, libunwind::Registers_x86_64>::setInfoBasedOnIPRegister(bool) ()
#4  0x0000000000fa8fff in libunwind::UnwindCursor<libunwind::LocalAddressSpace, libunwind::Registers_x86_64>::step() ()
#5  0x0000000000fa820e in unw_step ()
#6  0x0000000000fa6ac0 in _Unwind_Backtrace ()

There's a proposed fix in the linked LLVM issue, which can possibly be patched into libunwind before building the musl target if taking an updated libunwind isn't possible.

@pietroalbini pietroalbini added I-crash Issue: The compiler crashes (SIGSEGV, SIGABRT, etc). Use I-ICE instead when the compiler panics. A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. O-musl Target: The musl libc C-bug Category: This is a bug. labels Jan 23, 2018
@sfackler
Member

sfackler commented Feb 1, 2018

I ran into this as well.

@bossmc
Contributor Author

bossmc commented Feb 1, 2018

There doesn't seem to be any movement on the LLVM issue since I raised it, not even an assignee. Do you know if this is normal behaviour for them? Or if I've mis-raised the issue?

@jwilm

jwilm commented May 7, 2018

Running into this as well at OneSignal

@aidanhs
Member

aidanhs commented Jun 21, 2018

I can reproduce this on latest stable on an open source project:

docker run -it ubuntu:18.04 bash
apt update
apt install curl git gcc make musl-tools file
curl https://sh.rustup.rs -sSf | sh
source $HOME/.cargo/env
rustup target add x86_64-unknown-linux-musl
git clone https://github.com/mozilla/sccache.git
cd sccache
export TARGET=x86_64-unknown-linux-musl && export OPENSSL_DIR=/openssl-musl
./scripts/travis-musl-openssl.sh
cargo build --target x86_64-unknown-linux-musl
RUST_LOG=sccache=debug RUST_BACKTRACE=1 SCCACHE_NO_DAEMON=1 SCCACHE_START_SERVER=1 $(pwd)/target/x86_64-unknown-linux-musl/debug/sccache &
RUST_LOG=sccache=debug $(pwd)/target/x86_64-unknown-linux-musl/debug/sccache gcc -c src/test/test.c -o /tmp/test.o
RUST_LOG=sccache=debug $(pwd)/target/x86_64-unknown-linux-musl/debug/sccache gcc -c src/test/test.c -o /tmp/test.o

On the second run of the final command, the background process will segfault.

@bossmc
Contributor Author

bossmc commented Jul 20, 2018

After much investigation, I think there are two bugs here: first, LLVM's unwinder treats unreadable memory as part of the .eh_frame section; second, the Rust compiler creates invalid (unusual?) unwind information.

The crash happens specifically when trying to find unwind information for the frame above main (not crate::main but the "C runtime" main). Since the frames above the C entry point are provided by the runtime (musl in this case), they live in crti.o/crt1.o, which are shipped with Rust's musl-targeting stdlib. The provided object files have no unwind information in them:

$ readelf -S ~/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/crti.o 
There are 19 section headers, starting at offset 0x4a0:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .text             PROGBITS         0000000000000000  00000040
       0000000000000000  0000000000000000  AX       0     0     1
  [ 2] .data             PROGBITS         0000000000000000  00000040
       0000000000000000  0000000000000000  WA       0     0     1
  [ 3] .bss              NOBITS           0000000000000000  00000040
       0000000000000000  0000000000000000  WA       0     0     1
  [ 4] .init             PROGBITS         0000000000000000  00000040
       0000000000000001  0000000000000000  AX       0     0     1
  [ 5] .fini             PROGBITS         0000000000000000  00000041
       0000000000000001  0000000000000000  AX       0     0     1
  [ 6] .note.GNU-stack   PROGBITS         0000000000000000  00000042
       0000000000000000  0000000000000000           0     0     1
  [ 7] .debug_line       PROGBITS         0000000000000000  00000042
       0000000000000056  0000000000000000           0     0     1
  [ 8] .rela.debug_line  RELA             0000000000000000  000002e0
       0000000000000030  0000000000000018   I      17     7     8
  [ 9] .debug_info       PROGBITS         0000000000000000  00000098
       0000000000000049  0000000000000000           0     0     1
  [10] .rela.debug_info  RELA             0000000000000000  00000310
       0000000000000048  0000000000000018   I      17     9     8
  [11] .debug_abbrev     PROGBITS         0000000000000000  000000e1
       0000000000000012  0000000000000000           0     0     1
  [12] .debug_aranges    PROGBITS         0000000000000000  00000100
       0000000000000040  0000000000000000           0     0     16
  [13] .rela.debug_arang RELA             0000000000000000  00000358
       0000000000000048  0000000000000018   I      17    12     8
  [14] .debug_ranges     PROGBITS         0000000000000000  00000140
       0000000000000040  0000000000000000           0     0     16
  [15] .rela.debug_range RELA             0000000000000000  000003a0
       0000000000000060  0000000000000018   I      17    14     8
  [16] .shstrtab         STRTAB           0000000000000000  00000400
       000000000000009f  0000000000000000           0     0     1
  [17] .symtab           SYMTAB           0000000000000000  00000180
       0000000000000150  0000000000000018          18    12     8
  [18] .strtab           STRTAB           0000000000000000  000002d0
       000000000000000d  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)

This in and of itself isn't a huge problem: the libunwind code stops trying to unwind when it can't find unwind information for the next frame (https://github.com/llvm-mirror/libunwind/blob/release_39/src/UnwindCursor.hpp#L1323-L1324), so it should stop after main, which would be fine so long as we can safely fail to find the unwind information.

The issue is that libunwind's logic for finding the FDE (frame description entry) for the parent frame is pretty all-encompassing and runs in many phases, and one of these phases (sometimes) crashes if the searched-for address is not present in the unwind information. At a high level the process looks like (https://github.com/llvm-mirror/libunwind/blob/release_39/src/UnwindCursor.hpp#L1196):

  • Look for a "compact unwinding table"
  • Look for "DWARF unwind information"
  • Look for ARM EHABI unwind info
  • Look for dynamically created DWARF unwind entries

For Rust x86_64 musl binaries, the compiler has provided DWARF unwind information, so we fall into the second bullet (https://github.com/llvm-mirror/libunwind/blob/release_39/src/UnwindCursor.hpp#L866):

  • No compact encoding hint, so the first section isn't interesting
  • Search the .eh_frame_hdr section for an index for the frame - not present
  • Check the local cache in case we've seen this frame before - we haven't
  • Scan the whole of the .eh_frame section for the address of interest - CRASH!

Digging further, we see that the logic for scanning the .eh_frame section looks like (https://github.com/llvm-mirror/libunwind/blob/release_39/src/DwarfParser.hpp#L175):

  • Read a length field
    • If it's 0 stop, we're done - rustc never seems to generate this "end-of-information" marker
  • Skip over CIE entries
  • For non-CIE entries work out if the address we're looking for is in the range of the entry, if so we've got our FDE
  • Jump forward by the length of the CFI entry

In each loop we check that the newly found CFI entry is in a reasonable place (between ehSectionStart and ehSectionEnd). If it's not, we bail out for safety and fail the lookup.
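A minimal Rust sketch of that loop, heavily simplified: every record is modelled as a 4-byte little-endian length followed by that many payload bytes, the CIE/FDE distinction and the address-range match are left out, and `claimed_end` stands in for ehSectionEnd. The names `walk_entries` and `Walk` are made up for illustration and are not libunwind's.

```rust
/// Outcome of walking a simplified `.eh_frame`-style buffer.
#[derive(Debug, PartialEq)]
enum Walk {
    /// A zero length field (the terminator) was found after `entries` records.
    Terminated { entries: usize },
    /// The cursor reached or passed `claimed_end`, so the lookup fails safely.
    OutOfRange { entries: usize },
    /// The cursor is still below `claimed_end` but past the real data.
    /// This is where the real unwinder would read an invalid address.
    InvalidRead { offset: usize },
}

fn walk_entries(data: &[u8], claimed_end: usize) -> Walk {
    let mut pos = 0usize;
    let mut entries = 0usize;
    loop {
        // The "reasonable place" check: only as good as `claimed_end` is.
        if pos >= claimed_end {
            return Walk::OutOfRange { entries };
        }
        // Read the length field; a slice lets us detect what would be a
        // wild read in the C++ code instead of actually performing it.
        let bytes = match data.get(pos..pos + 4) {
            Some(b) => b,
            None => return Walk::InvalidRead { offset: pos },
        };
        let len = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]) as usize;
        if len == 0 {
            return Walk::Terminated { entries };
        }
        entries += 1;
        // Jump forward by the length field plus the record's payload.
        pos += 4 + len;
    }
}

fn main() {
    // Two records (payload lengths 4 and 2) and no terminator.
    let mut data: Vec<u8> = vec![4, 0, 0, 0, 1, 2, 3, 4, 2, 0, 0, 0, 9, 9];

    // Correct end: the walk stops safely at the section boundary.
    assert_eq!(walk_entries(&data, 14), Walk::OutOfRange { entries: 2 });
    // Oversized end: the walk runs past the data.
    assert_eq!(walk_entries(&data, 1000), Walk::InvalidRead { offset: 14 });

    // With a terminator, even an oversized end is harmless.
    data.extend_from_slice(&[0, 0, 0, 0]);
    assert_eq!(walk_entries(&data, 1000), Walk::Terminated { entries: 2 });
    println!("ok");
}
```

The sketch shows why either a correct end bound or a terminator is enough on its own to keep the walk inside valid memory.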

Unfortunately, as described in the linked issue, ehSectionEnd is set too large (it's set to ehSectionStart + **segmentLength**, which is almost always wrong), so the unwind code will step right off the end of the .eh_frame section and into one of:

  • .gcc_except_table (LSDA) - This starts with a signature that is approximately 0x9c9bff, which is bigger than the error introduced by the ehSectionEnd miscalculation, so (after failing to parse the LSDA as a CFI entry) it jumps forward out of the range ehSectionStart..ehSectionEnd and bails out because of the "reasonable location" check I mentioned above.
  • The gap before .gcc_except_table (if the .eh_frame did not have a length that's a multiple of 32 and thus there's padding between the sections). In this case when it tries to read the next entry one of three things happens:
    • The padding bytes happen to have a large value (when interpreted as a u32), so we're in a similar position to the case where there's no padding: we fail to parse the CFI entry, try to jump to the next one, realise we're jumping out of the allowed range and bail out for safety.
    • The padding bytes are medium sized when interpreted as a u32. In this case, after the failed parse, the unwinder jumps far enough to be off the end of the loaded segment, but not far enough to be past ehSectionEnd, so the sanity check passes and the unwinder attempts to read the next CFI entry, reads a 32-bit number from an invalid address and... 💥
    • The padding bytes are small when interpreted as a u32, and the unwinder jumps into a random part of the LSDA and we're back in a situation that's functionally similar to when we're reading the padding bytes and the same three options are still applicable in the next lookup.
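The three outcomes can be sketched as a small classifier in Rust. This is illustrative only: `classify_jump`, `mapped_end` and `claimed_end` are hypothetical names, all values are offsets from the start of .eh_frame, and the numbers in `main` are invented to land in each case.

```rust
/// What happens after the unwinder interprets garbage as a length field.
#[derive(Debug, PartialEq)]
enum JumpOutcome {
    /// The next position is outside ehSectionStart..ehSectionEnd: safe failure.
    BailsOut,
    /// The next position passes the range check but is not mapped: crash.
    InvalidRead,
    /// The next position is mapped and in range: the walk continues and the
    /// same three outcomes apply to whatever bytes it lands on.
    KeepsWalking,
}

/// `pos` is where the bogus length was read, `mapped_end` is the end of the
/// readable memory, `claimed_end` is the (too large) end of the section as
/// computed by the unwinder.
fn classify_jump(pos: u64, bogus_len: u32, mapped_end: u64, claimed_end: u64) -> JumpOutcome {
    let next = pos + 4 + u64::from(bogus_len);
    if next >= claimed_end {
        JumpOutcome::BailsOut
    } else if next + 4 > mapped_end {
        JumpOutcome::InvalidRead
    } else {
        JumpOutcome::KeepsWalking
    }
}

fn main() {
    // Invented layout: real data ends at 0x1000, memory is readable up to
    // 0x3000, and the unwinder believes the section ends at 0x20_0000.
    let (pos, mapped_end, claimed_end) = (0x1000, 0x3000, 0x20_0000);

    // Large value (such as the LSDA signature): jumps out of range.
    assert_eq!(classify_jump(pos, 0x9c9bff, mapped_end, claimed_end), JumpOutcome::BailsOut);
    // Medium value: in range but unmapped.
    assert_eq!(classify_jump(pos, 0x1_0000, mapped_end, claimed_end), JumpOutcome::InvalidRead);
    // Small value: lands in mapped memory and the walk goes on.
    assert_eq!(classify_jump(pos, 0x20, mapped_end, claimed_end), JumpOutcome::KeepsWalking);

    // With a much larger claimed end, the "large" value stops being large enough.
    assert_eq!(classify_jump(pos, 0x9c9bff, mapped_end, 0x1000_0000), JumpOutcome::InvalidRead);
    println!("ok");
}
```

The last assertion shows that how "large" a value needs to be depends entirely on how far off `claimed_end` is.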

So, what's the bug(s) here?

  • LLVM is miscalculating the section length (the original issue)
  • Rustc (or the linker?) doesn't include the end-of-records marker at the end of .eh_frame

Fixing either of these would fix this issue; probably both should be done?

@bossmc
Contributor Author

bossmc commented Jul 20, 2018

Ah, there's another possible route for the crash:

  • If the LOAD segment is really big, the signature at the start of the LSDA might not be big enough to jump past ehSectionEnd, leading to an invalid read and a 💥

@cetra3

cetra3 commented Sep 3, 2018

I've run into this issue as well. What's the status of this?

@bossmc
Contributor Author

bossmc commented Sep 7, 2018

I've been looking at a proper fix for this and neither of the two paths I proposed above is yielding simple results:

  • LLVM miscalculating the section length:
    • The simple fix I proposed in the upstream issue allows the unwinder to read arbitrary parts of the LOAD segment before bailing, which might have unexpected effects
    • A more complex fix needs to calculate the section length at runtime, which is tricky as sections don't exist at runtime...
      • Neither the .eh_frame nor the .eh_frame_hdr section contains a length field, so we're not simply told the length
      • Maybe read the executable off disk (look in /proc/self/exe) and parse the section table? Might not be readable to the current user. Might not even exist (not sure if this is possible, can you create a linux executable from an in-memory image?)
  • Add a terminator entry to the end of the .eh_frame section
    • This relies on the static linker's behaviour, which is a shame as rustc allows you to bring your own linker
    • ld and gold both seem inconsistent about whether they add terminator entries, sometimes they do, sometimes they don't
    • This feels like something that rustc can't fix locally or rely on externally

Another option would be for LLVM to skip walking the .eh_frame section if the lookup in .eh_frame_hdr fails. The walk is unlikely to ever reveal anything new, and it seems that this section just can't be walked with any kind of reliability.

@bossmc
Contributor Author

bossmc commented Sep 21, 2018

Here are two tested workarounds, neither very pretty, but both work:

1 - Custom link script

  1. Add rustflags = ["-C", "link-args=-Wl,--verbose"] to your Cargo config

  2. Do a build, extract the link script from the linker verbose output

  3. Find the lines:

     .eh_frame       : ONLY_IF_RO { KEEP (*(.eh_frame)) *(.eh_frame.*) }
     ...
     .eh_frame       : ONLY_IF_RW { KEEP (*(.eh_frame)) *(.eh_frame.*) }
    
  4. Change them to:

     .eh_frame       : ONLY_IF_RO { KEEP (*(.eh_frame)) *(.eh_frame.*) LONG(0x0) }
     ...
     .eh_frame       : ONLY_IF_RW { KEEP (*(.eh_frame)) *(.eh_frame.*) LONG(0x0) }
    

    (0x00000000 is the CIE terminator)

  5. Save off this script somewhere

  6. Set rustflags = ["-C", "link-args=-Wl,-T<script>"] in Cargo config (probably under the Musl target)

  7. Rebuild

2 - Add a custom object

  1. Create an object containing just the .eh_frame terminator:

    ; Create a terminator entry for the `.eh_frame` section of rust binaries.
    ;
    ; See https://github.com/rust-lang/rust/issues/47551 for details
    ;
    ; You can build this with:
    ;
    ;    nasm -f elf64 eh_frame_terminator.asm
    ;
    section .eh_frame
      ; The terminator is a 0-length CIE, the first field of which is the length
      ; as a 32-bit number.
      dd 0x00000000
    

  2. Add rustflags = ["-C", "link-args=-Wl,<path/to/>eh_frame_terminator.o"] to your Cargo config (probably under the Musl target)

@jonas-schievink
Contributor

Hi! Is this still an issue? If so, is there perhaps a simpler way to reproduce this than the instructions linked above?

@jonas-schievink jonas-schievink added E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Nov 22, 2019
@bossmc
Contributor Author

bossmc commented Nov 22, 2019

The setup that's needed is:

  • A large LOAD segment (the one containing .text and .eh_frame)
    • Specifically the .eh_frame section needs to be deep into the segment
    • How deep it has to be is roughly determined by the first u32 of the .gcc_except_table section

So something like the following should trigger the fault:

#[derive(Clone, Copy)]
struct Foo {
    array: [u64; 10240],
}

impl Foo {
    const fn new() -> Self {
        Self {
            array: [0x1122_3344_5566_7788; 10240]
        }
    }
}

static BAR: [Foo; 10240] = [Foo::new(); 10240];

fn main() {
    let bt = backtrace::Backtrace::new();
    println!("Hello, world! {:?}", bt);
    println!("{:x}", BAR[0].array[0]);
}

This builds a huge .rodata section, which lives before the .eh_frame section so should lead to the crash.

I'm hedging because I can't reproduce the issue on my development machine (I run out of RAM and the build gets OOM-killed).

@jonas-schievink
Contributor

Hmm, that doesn't seem to be sufficient (tested on the current stable and nightly). I get this output when building and running this on Arch Linux:

Hello, world! stack backtrace:
   0: unwind_repro::main
             at src/main.rs:19
   1: std::rt::lang_start::{{closure}}
             at /rustc/5c5b8afd80e6fa1d24632153cb2257c686041d41/src/libstd/rt.rs:61
   2: std::rt::lang_start_internal::{{closure}}
             at src/libstd/rt.rs:48
      std::panicking::try::do_call
             at src/libstd/panicking.rs:287
   3: __rust_maybe_catch_panic
             at src/libpanic_unwind/lib.rs:86
   4: std::panicking::try
             at src/libstd/panicking.rs:265
      std::panic::catch_unwind
             at src/libstd/panic.rs:395
      std::rt::lang_start_internal
             at src/libstd/rt.rs:47
   5: std::rt::lang_start
             at /rustc/5c5b8afd80e6fa1d24632153cb2257c686041d41/src/libstd/rt.rs:61
   6: main

1122334455667788

I also get the same output when using std::backtrace::Backtrace::force_capture() to capture the backtrace.

@rprichard
Contributor

I noticed the wrong ehSectionEnd calculation independently and filed https://bugs.llvm.org/show_bug.cgi?id=46829.

@rprichard
Contributor

It looks like the gcc driver on Alpine adds the /usr/lib/gcc/x86_64-alpine-linux-musl/9.3.0/crtendS.o file to the link, which includes a terminator in .eh_frame. It seems that the rustc driver doesn't add something like this? (LLVM has a compiler-rt/lib/crt/crtend.c that has an .eh_frame terminator, so maybe that's relevant to rustc.)

FWIW, it looks like LLVM's LLD linker automatically adds a terminator to the end of .eh_frame, even if one isn't present in the linker inputs, so that could be a workaround:

$ rustc hello.rs && objdump -Wf hello | grep ZERO
$ rustc hello.rs -C link-args=-fuse-ld=lld && objdump -Wf hello | grep ZERO
00004d50 ZERO terminator

I think libunwind should stop scanning .eh_frame when it finds unwind info from .eh_frame_hdr (i.e. the GNU_EH_FRAME segment).

@bossmc
Contributor Author

bossmc commented Sep 22, 2020

Rustc's musl-targeting link includes:

crt1.o crti.o [rust/native objects...] crtn.o

I don't exactly know where the crtX.o files come from (they're shipped with the compiler toolchain, e.g. mine are in ~/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained/) but looking at the musl codebase, there's no terminator in crtn.o:

https://git.musl-libc.org/cgit/musl/tree/crt/x86_64/crtn.s

.section .init
	pop %rax
	ret

.section .fini
	pop %rax
	ret

(Intriguingly there's also no crt1.s in the musl codebase, suggesting Rustc gets these objects from somewhere else?)

I wouldn't expect to see a crtbeginS.o or crtend(S).o since the rustc musl target doesn't include a C++ runtime and musl libc doesn't define either of these objects for the C runtime either.

@rprichard
Contributor

Intriguingly there's also no crt1.s in the musl codebase...

It looks like musl's crt1 is a C file, not assembly. crt/{S,r,}crt1.c

When crtbegin and crtend were added to LLVM (https://reviews.llvm.org/D28791), there was some discussion about which project is responsible for which aspects of the CRT begin/end files. Apparently the status quo on Linux is that:

  • the compiler (e.g. libgcc) provides crtbegin/crtend
  • the libc provides crti/crtn (and maybe crt1, too?)

That is the case on my gLinux (i.e. Debian) system:

  • /usr/lib/x86_64-linux-gnu/crt[1in].o all come from the libc6-dev:amd64 package
  • /usr/lib/gcc/x86_64-linux-gnu/9/{crtbegin.o,crtend.o} come from libgcc-9-dev:amd64

The .eh_frame terminator is usually in crtend, which is part of libgcc (not glibc or musl). LLVM's compiler-rt also provides crtbegin/crtend files.

The gcc and clang drivers link a crtend object even for C programs. On Alpine, the gcc driver links a libgcc crtend object.

It looks like rustc typically invokes the cc driver, but when it's targeting musl, it enters a mode where it provides the start files explicitly, and in this mode, it doesn't include crtbegin/crtend. For Alpine, I see -nostartfiles, rcrt1.o, crti.o, and crtn.o. For rustc/glibc, I don't see any of those flags, so the cc driver uses the default CRT begin/end/i/n/1 files.

I see that rustc documents a lot of this at compiler/rustc_target/src/spec/crt_objects.rs

//! Unlike native toolchains, rustc only currently adds the libc's objects during linking,
//! but not gcc's. As a result rustc cannot link with C++ static libraries (#36710)
//! when linking in self-contained mode.

FWIW, it looks like the MinGW rustc target has rsbegin.rs and rsend.rs files. rsend.rs has the .eh_frame terminator.

The unwinder needs the .eh_frame terminator, because the .eh_frame_hdr search table is optional, and because .eh_frame_hdr has the start address of .eh_frame, but not its size. (https://reviews.llvm.org/D86256 / https://reviews.llvm.org/D87750). I don't think it's reasonable to open the ELF file indicated by dlpi_name -- in principle, that duplicates the work done by the loader, and I can think of many specific things that wouldn't work. (e.g. the vdso, the executable w/glibc, Android's android_dlopen_ext and ability to map DSOs from zip files, unlinked files, etc).
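To make the "start address but no size" point concrete, here is a rough Rust sketch of decoding the fixed part of .eh_frame_hdr. It assumes the commonly emitted encodings (pcrel|sdata4 for the eh_frame pointer, udata4 for the FDE count) and rejects anything else. `EhFrameHdr` and `parse_eh_frame_hdr` are hypothetical names, and the bytes in `main` are hand-built rather than taken from a real binary.

```rust
// Pointer-encoding bytes as commonly emitted by linkers (assumed here).
const DW_EH_PE_UDATA4: u8 = 0x03;
const DW_EH_PE_PCREL_SDATA4: u8 = 0x1b; // DW_EH_PE_pcrel | DW_EH_PE_sdata4

/// The information the fixed part of the header gives us. Note what is
/// missing: there is no field for the size of `.eh_frame`.
#[derive(Debug, PartialEq)]
struct EhFrameHdr {
    eh_frame_addr: u64,
    fde_count: u32,
}

/// `hdr` holds the bytes of `.eh_frame_hdr`; `hdr_addr` is the address the
/// header is loaded at. Returns `None` for encodings this sketch doesn't handle.
fn parse_eh_frame_hdr(hdr: &[u8], hdr_addr: u64) -> Option<EhFrameHdr> {
    // Layout: version, eh_frame_ptr_enc, fde_count_enc, table_enc,
    // then the encoded eh_frame_ptr and fde_count.
    if hdr.len() < 12 || hdr[0] != 1 {
        return None;
    }
    if hdr[1] != DW_EH_PE_PCREL_SDATA4 || hdr[2] != DW_EH_PE_UDATA4 {
        return None;
    }
    let rel = i32::from_le_bytes([hdr[4], hdr[5], hdr[6], hdr[7]]);
    let fde_count = u32::from_le_bytes([hdr[8], hdr[9], hdr[10], hdr[11]]);
    // pcrel means "relative to the address of the encoded field itself",
    // which sits 4 bytes into the header. The cast sign-extends `rel`.
    let eh_frame_addr = (hdr_addr + 4).wrapping_add(rel as i64 as u64);
    Some(EhFrameHdr { eh_frame_addr, fde_count })
}

fn main() {
    // version 1, encodings 0x1b / 0x03 / 0x3b, eh_frame_ptr = +0x100, 2 FDEs.
    let hdr = [1u8, 0x1b, 0x03, 0x3b, 0x00, 0x01, 0x00, 0x00, 2, 0, 0, 0];
    let parsed = parse_eh_frame_hdr(&hdr, 0x1000).expect("supported encodings");
    assert_eq!(parsed, EhFrameHdr { eh_frame_addr: 0x1104, fde_count: 2 });
    println!("{:x?}", parsed);
}
```

Everything recoverable from the header is a start address and a table of FDEs. An address that misses the table leaves the unwinder with no bound for a linear scan other than the terminator.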

@petrochenkov
Contributor

Using self-contained mode is not generally recommended for serious work.
If the compiler heuristics enable it by default, then it can be explicitly disabled with -C link-self-contained=no, then gcc will be used for linking without -nostartfiles and will find all the necessary CRT objects by itself.
Then it's important to make sure that gcc for the right target (musl in this case) is used by rustc.

@petrochenkov
Contributor

If the compiler heuristics enable it by default

Here's a FIXME for improving the heuristic for musl targets:

// FIXME: Find a better heuristic for "native musl toolchain is available",
// based on host and linker path, for example.
// (https://github.com/rust-lang/rust/pull/71769#issuecomment-626330237).
Some(CrtObjectsFallback::Musl) => sess.crt_static(Some(crate_type)),

@bossmc
Contributor Author

bossmc commented Sep 23, 2020

@petrochenkov Thanks for that, very interesting. I think there are two separate use cases for musl-targeting builds:

  • an environment where there's a full "native musl" toolchain (like on Alpine etc), where I think you're correct that we shouldn't use self-contained mode and should let the toolchain sort it out for us, though I can't see an easy way to tell if a real toolchain is available
  • a generic linux environment where the goal is fully static binaries (for ease of redistribution/embedding in thin container images/etc.), where there's likely not a native toolchain and rustc is forced to work out the answers for itself (as an aside, musl-gcc no longer works as the "linker" for musl targets, since it doesn't support -pie, which is enabled in rustc as of the last release)

I'm worrying about the latter case and I think from above the fix for that is:

  1. Update rsend.rs to add the terminator in linux-musl builds as well as windows-gnu ones
  2. Build rsend.o and include it in the x86_64-unknown-linux-musl target package alongside the crtX.o (so rustup etc fetch it)
  3. Update the link command rustc emits for the musl target to include rsend.o before crtn.o

Some side questions:

  • Does the above sound sensible to people?
  • Do we need to include any of rsbegin.rs too? (I think not, but I'm no expert 🤷)
  • Do other targets need similar treatment? i686 musl?

@bossmc
Contributor Author

bossmc commented Sep 23, 2020

In the latter case it would make sense to me for rustc to invoke an actual linker (ld/lld/gold/etc) since at the moment it goes via a compiler but has to tell the compiler to not do anything and just to pass the link args to the linker anyway. This is off topic for this discussion though.

@petrochenkov
Contributor

petrochenkov commented Sep 23, 2020

If we need to improve the self-contained mode specifically, then I think we can just ship the begin/end objects and link to them like gcc does, obsoleting the comment cited in #47551 (comment). That appears to be the simplest solution.

@bossmc
If that doesn't work out (e.g. due to licensing), then the necessary parts can be added to rsend.rs and linked as rsend.o, as you suggested above.

@bossmc
Contributor Author

bossmc commented Sep 29, 2020

Which crtbegin/crtend objects were you thinking of? gcc's? LLVM's? If you're right and it's the compiler that ships the begin/end objects then maybe it's right for rustc to have its own (though the link might contain code compiled by random compilers (from build.rs scripts) as well as code compiled by rustc, so which is "the compiler" for the sake of this discussion)?

@petrochenkov
Contributor

@bossmc

gcc's? LLVM's?

I don't know, but looks like they should be compatible.
At least on Ubuntu both gcc and clang link to gcc's objects.
The gcc ones can be just copied by rustbuild if the license allows, the LLVM ones can be used if we need to build them by ourselves (they are a part of compiler-rt).

nikic added a commit to nikic/rust that referenced this issue Feb 26, 2021
For some targets, rustc uses a "CRT fallback", where it links CRT
object files it ships instead of letting the host compiler link
them.

On musl, rustc currently links crt1, crti and crtn (provided by
libc), but does not link crtbegin and crtend (provided by libgcc).
In particular, crtend is responsible for terminating the .eh_frame
section. Lack of terminator may result in segfaults during
unwinding, as reported in rust-lang#47551 and encountered by the LLVM 12
update in rust-lang#81451.

This patch links crtbegin and crtend for musl as well, following
the table at the top of crt_objects.rs.
@petrochenkov
Contributor

#82534 adds crtbegin/crtend objects to Rust distribution and links to them in self-contained mode as was suggested in #47551 (comment), so it should fix this issue.

Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue Feb 27, 2021
Link crtbegin/crtend on musl to terminate .eh_frame

For some targets, rustc uses a "CRT fallback", where it links CRT
object files it ships instead of letting the host compiler link
them.

On musl, rustc currently links crt1, crti and crtn (provided by
libc), but does not link crtbegin and crtend (provided by libgcc).
In particular, crtend is responsible for terminating the .eh_frame
section. Lack of terminator may result in segfaults during
unwinding, as reported in rust-lang#47551 and encountered by the LLVM 12
update in rust-lang#81451.

This patch links crtbegin and crtend for musl as well, following
the table at the top of crt_objects.rs.

r? `@nagisa`
@bors bors closed this as completed in 7baa7af May 31, 2021
@SchrodingerZhu

I still have this problem with clang's compiler-rt: https://llvm.discourse.group/t/segfault-in-libunwind-during-cpu-profiling/5806/3
