With Zig 0.14-pre in Debug mode, ARM binary has ARM.extab sections placed in RAM memory region #22685

Open
marnix opened this issue Jan 30, 2025 · 10 comments
Labels
bug Observed behavior contradicts documented or intended behavior

Comments

@marnix

marnix commented Jan 30, 2025

Zig Version

0.14.0-dev.2989+bf6ee7cb3

Steps to Reproduce and Observed Behavior

Building an STM32 binary in Debug mode (for the STM32F3DISCOVERY board in this case) with 0.14.0-dev.2989+bf6ee7cb3 results in an image that is far too big to flash (128 MB), apparently because the .ARM.extab.t[...] sections are placed in RAM instead of in flash.

Reproduction scenario: With https://github.com/marnix/microzig/tree/zig-issue-22603-0.14.0-dev.2851%2Bb074fb7dd (and Zig 0.14.0-dev.2989+bf6ee7cb3), in examples/stmicro/stm32, run zig build and inspect the resulting zig-out/firmware/stm32f3discovery.elf file.

The readelf -S output is shown below, and after that the generated and used linker.ld file, which has the memory layout of the specific MCU that I compile for and use.

The resulting binary therefore covers 0x0800_0000...0x1000_06cf (slightly over 128 MB), with zeroes in the range 0x0800_4bb8...0x1000_069f (the first RAM region is at 0x1000_0000...0x1000_1fff). This goes far beyond the flash range of 0x0800_0000...0x0803_ffff.
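The size estimate can be checked with a few lines of arithmetic. This is a minimal Python sketch, not part of the build; the variable names are illustrative, and the addresses are the ones quoted in the paragraph above.

```python
# Inclusive address range covered by the image, as quoted above.
IMAGE_START = 0x0800_0000
IMAGE_END = 0x1000_06CF

# Inclusive flash range, as quoted above.
FLASH_START = 0x0800_0000
FLASH_END = 0x0803_FFFF

image_size = IMAGE_END - IMAGE_START + 1
flash_size = FLASH_END - FLASH_START + 1

print(f"image size: {image_size} bytes ({image_size / 2**20:.2f} MiB)")
print(f"flash size: {flash_size} bytes ({flash_size // 1024} KiB)")
print(f"fits in flash: {image_size <= flash_size}")
```

Almost all of the image is the gap between the end of the flash contents and the start of RAM, which a flat binary has to fill with padding.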

So the issue seems to be that the .ARM.extab sections (which seem to contain exception tables; each is 12 bytes) are put in RAM instead of in flash. But these sections are not even mentioned in linker.ld...

My question: What could be causing this?


Note that this only happens in Debug mode: with ReleaseSafe and ReleaseFast there are fewer sections, but they are all put in the flash region; and with ReleaseSmall these sections completely disappear; so in all those cases the resulting binary can be flashed.


For completeness, this is for -ODebug -target thumb-freestanding-eabihf -mcpu cortex_m4+vfp4d16sp.

readelf -S output:

There are 24 section headers, starting at offset 0xe6380:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        08000000 010000 00441e 00  AX  0   0  4
  [ 2] .ARM.exidx        ARM_EXIDX       08004420 014420 0000f8 00  AL  1   0  4
  [ 3] .data             PROGBITS        10000000 020000 0006a0 00 AMS  0   0  8
  [ 4] .ARM.extab.t[...] PROGBITS        100006a0 0206a0 00000c 00   A  0   0  4
  [ 5] .ARM.extab.t[...] PROGBITS        100006ac 0206ac 00000c 00   A  0   0  4
  [ 6] .ARM.extab.t[...] PROGBITS        100006b8 0206b8 00000c 00   A  0   0  4
  [ 7] .ARM.extab.t[...] PROGBITS        100006c4 0206c4 00000c 00   A  0   0  4
  [ 8] .bss              NOBITS          100006d0 0206d0 000000 00  WA  0   0  1
  [ 9] .flash_end        PROGBITS        08004bb8 0206d0 000000 00  WA  0   0  1
  [10] .debug_loc        PROGBITS        00000000 0206d0 05c5af 00      0   0  1
  [11] .debug_abbrev     PROGBITS        00000000 07cc7f 0006ad 00      0   0  1
  [12] .debug_info       PROGBITS        00000000 07d32c 02a3c6 00      0   0  1
  [13] .debug_ranges     PROGBITS        00000000 0a76f2 00b420 00      0   0  1
  [14] .debug_str        PROGBITS        00000000 0b2b12 00c532 01  MS  0   0  1
  [15] .debug_pubnames   PROGBITS        00000000 0bf044 0038bb 00      0   0  1
  [16] .debug_pubtypes   PROGBITS        00000000 0c28ff 001c0e 00      0   0  1
  [17] .ARM.attributes   ARM_ATTRIBUTES  00000000 0c450d 000045 00      0   0  1
  [18] .debug_frame      PROGBITS        00000000 0c4554 003ddc 00      0   0  4
  [19] .debug_line       PROGBITS        00000000 0c8330 01b1bb 00      0   0  1
  [20] .comment          PROGBITS        00000000 0e34eb 000067 01  MS  0   0  1
  [21] .symtab           SYMTAB          00000000 0e3554 001580 10     23 329  4
  [22] .shstrtab         STRTAB          00000000 0e4ad4 000175 00      0   0  1
  [23] .strtab           STRTAB          00000000 0e4c49 001736 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  D (mbind), y (purecode), p (processor specific)

linker.ld:

/*
 * This file was auto-generated by microzig
 *
 * Target CPU:  cortex_m4
 * Target Chip: STM32F303VC
 */

ENTRY(microzig_main);

MEMORY
{
  flash0    (rx!w) : ORIGIN = 0x08000000, LENGTH = 0x00040000
  ram0      (rw!x) : ORIGIN = 0x10000000, LENGTH = 0x00002000
  ram1      (rw!x) : ORIGIN = 0x20000000, LENGTH = 0x0000A000
}

SECTIONS
{
  .text :
  {
     KEEP(*(microzig_flash_start))
     *(.text*)
  } > flash0

  .ARM.exidx : {
      *(.ARM.exidx* .gnu.linkonce.armexidx.*)
  } >flash0

  .data :
  {
     microzig_data_start = .;
     *(.sdata*)
     *(.data*)
     *(.rodata*)
     microzig_data_end = .;
  } > ram0 AT> flash0

  .bss (NOLOAD) :
  {
      microzig_bss_start = .;
      *(.bss*)
      *(.sbss*)
      microzig_bss_end = .;
  } > ram0

  .flash_end :
  {
      microzig_flash_end = .;
  } > flash0

  microzig_data_load_start = LOADADDR(.data);
}
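To see at a glance which MEMORY region each allocated section landed in, the addresses from the readelf -S output can be checked against the regions in linker.ld. The following is a minimal Python sketch with the values copied by hand from the two listings above; `region_of` is a hypothetical helper, not something MicroZig or Zig provides.

```python
# MEMORY regions from the linker.ld above: name -> (origin, length)
MEMORY = {
    "flash0": (0x08000000, 0x00040000),
    "ram0": (0x10000000, 0x00002000),
    "ram1": (0x20000000, 0x0000A000),
}

# (name, address, size) of the allocated sections in the readelf -S output above.
# The .ARM.extab names are kept exactly as readelf truncated them.
SECTIONS = [
    (".text", 0x08000000, 0x00441E),
    (".ARM.exidx", 0x08004420, 0x0000F8),
    (".data", 0x10000000, 0x0006A0),
    (".ARM.extab.t[...]", 0x100006A0, 0x00000C),
    (".ARM.extab.t[...]", 0x100006AC, 0x00000C),
    (".ARM.extab.t[...]", 0x100006B8, 0x00000C),
    (".ARM.extab.t[...]", 0x100006C4, 0x00000C),
    (".bss", 0x100006D0, 0x000000),
    (".flash_end", 0x08004BB8, 0x000000),
]


def region_of(addr):
    """Return the name of the MEMORY region containing addr, or None."""
    for name, (origin, length) in MEMORY.items():
        if origin <= addr < origin + length:
            return name
    return None


placement = [(name, region_of(addr)) for name, addr, _size in SECTIONS]
for name, region in placement:
    print(f"{name:<18} -> {region}")
```

Note that the Addr column of readelf -S is the run-time (virtual) address. The load address that `AT> flash0` assigns to .data is only visible in the program headers (readelf -l), so this sketch says nothing about where .data is stored in the image.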

Expected Behavior

With Zig 0.13.0, no .ARM.extab.t[...] section is ever created.

Reproduction scenario: Do the same thing with https://github.com/marnix/microzig/tree/zig-issue-22603-0.13.0, with Zig 0.13.0, and compare the readelf -S output.

(That code is identical to the code above, minus changes to match Zig 0.13.0, most of which are in marnix/microzig@cebfb1b.)

@marnix marnix added the bug Observed behavior contradicts documented or intended behavior label Jan 30, 2025
@marnix
Author

marnix commented Jan 30, 2025

@mattnite and probably others on the MicroZig side will try to debug, and dig up some more info. Please let us know if there is any additional info that would be helpful. Thanks!

@alexrp
Member

alexrp commented Jan 30, 2025

This is probably happening because in 0.13.0, there was this:

pub fn supportsUnwinding(target: std.Target) bool {
    return switch (target.cpu.arch) {
        .x86 => switch (target.os.tag) {
            .linux, .netbsd, .solaris, .illumos => true,
            else => false,
        },
        .x86_64 => switch (target.os.tag) {
            .linux, .netbsd, .freebsd, .openbsd, .macos, .ios, .solaris, .illumos => true,
            else => false,
        },
        .arm => switch (target.os.tag) {
            .linux => true,
            else => false,
        },
        .aarch64 => switch (target.os.tag) {
            .linux, .netbsd, .freebsd, .macos, .ios => true,
            else => false,
        },
        else => false,
    };
}

However, in 0.14.0, the code is not as needlessly restrictive:

pub fn supportsUnwinding(target: std.Target) bool {
    return switch (target.cpu.arch) {
        .amdgcn,
        .nvptx,
        .nvptx64,
        .spirv,
        .spirv32,
        .spirv64,
        .spu_2,
        => false,
        // Enabling this causes relocation errors such as:
        // error: invalid relocation type R_RISCV_SUB32 at offset 0x20
        .riscv64, .riscv32 => false,
        // Conservative guess. Feel free to update this logic with any targets
        // that are known to not support Dwarf unwinding.
        else => true,
    };
}

This then means unwind tables are enabled:

zig/src/target.zig

Lines 410 to 420 in 53598e3

pub fn defaultUnwindTables(target: std.Target, libunwind: bool, libtsan: bool) std.builtin.UnwindTables {
    if (target.os.tag == .windows) {
        // The old 32-bit x86 variant of SEH doesn't use tables.
        return if (target.cpu.arch != .x86) .@"async" else .none;
    }
    if (target.os.tag.isDarwin()) return .@"async";
    if (libunwind) return .@"async";
    if (libtsan) return .@"async";
    if (std.debug.Dwarf.abi.supportsUnwinding(target)) return .@"async";
    return .none;
}

So basically, use -fno-unwind-tables (or .unwind_tables = .none in build.zig) and you shouldn't see any of this data. (Of course you'll get worse stack traces, but that's to be expected.)
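To make the effect on this issue's target concrete, here is a toy Python model of the two supportsUnwinding versions quoted above. It is not Zig code and does not call any Zig API: architectures and OS tags are plain strings, and the function names are made up for this sketch. It also assumes that the thumb-freestanding target from the report is seen as arch `thumb`, which neither quoted switch lists explicitly, so it would reach the `else` branch in both versions.

```python
# Toy model of the two supportsUnwinding() versions quoted above.
# Architectures and OS tags are plain strings here; this is not Zig's API.

# 0.13.0: only these (arch, os) combinations returned true.
SUPPORTED_0_13 = {
    "x86": {"linux", "netbsd", "solaris", "illumos"},
    "x86_64": {"linux", "netbsd", "freebsd", "openbsd", "macos", "ios",
               "solaris", "illumos"},
    "arm": {"linux"},
    "aarch64": {"linux", "netbsd", "freebsd", "macos", "ios"},
}

# 0.14.0: only these architectures return false; everything else is true.
UNSUPPORTED_0_14 = {
    "amdgcn", "nvptx", "nvptx64", "spirv", "spirv32", "spirv64", "spu_2",
    "riscv64", "riscv32",
}


def supports_unwinding_0_13(arch, os_tag):
    return os_tag in SUPPORTED_0_13.get(arch, set())


def supports_unwinding_0_14(arch, os_tag):
    # The OS tag is ignored in the 0.14 version of the switch.
    return arch not in UNSUPPORTED_0_14


# Assumption: the reported target maps to arch "thumb", os "freestanding".
target = ("thumb", "freestanding")
print("0.13:", supports_unwinding_0_13(*target))
print("0.14:", supports_unwinding_0_14(*target))
```

Under that assumption the result flips between the two versions, which matches the appearance of the exception tables only with 0.14.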

@marnix
Author

marnix commented Jan 31, 2025

@alexrp Thanks! That change https://github.com/ziglang/zig/pull/20908/files#diff-ea8523987dc356aebef7f9592db4fc8cb4eddf66236bca61ed8a20ec04bd9319R29 in #20908 indeed exactly explains why on ARM embedded we see ARM.extab sections now, since 0.14: The default of 'unwind tables' switched from false to true.

(I think that also explains why I see __aeabi_unwind_cpp_pr0 and __aeabi_unwind_cpp_pr1 symbols, in most compilation modes, as reported in #22603: Presumably those exception tables point to those symbols.)

I have tested with .unwind_tables = .none (in the addExecutable() call in MicroZig's build.zig), and that does indeed work (i.e., it makes the sections go away, so flashing works again).


However, if it can be made to work, I'd like to have unwind tables for my ARM embedded debugging.

Therefore my question still stands: why are these sections placed in a RAM region rather than in flash on ARM embedded, and why only in Debug mode, when flash is correctly used in all release modes?

Is that linker.ld script incorrect or incomplete? (E.g., should it mention ARM.extab?) Is there something else wrong on the MicroZig side? Or is there some issue in Zig's linker?

@alexrp
Member

alexrp commented Jan 31, 2025

There's a bit of nuance as to what the term "unwind tables" actually means.

There are generally two "levels": Synchronous and asynchronous, with the latter being precise at every instruction and thus more useful for stack traces or things like garbage collectors, but also larger as a result.

In addition, there's a distinction between tables emitted into the .eh_frame and .debug_frame sections. The former is intended for tables that have semantic importance for the program (e.g. for exception handling in a C++ program), while the latter is only expected to be used for debugging purposes and thus could be stripped with no adverse effects on execution.

(The section names and table format differ on Arm because Arm has its own EH ABI, whereas most other architectures use DWARF. But the same principles mostly apply.)

When you request unwind tables with -f(async-)unwind-tables, you're asking for .eh_frame, and this is what the code above does, and why you see this data in RAM. I'm not exactly sure why we do this, but if I had to guess, maybe LLVM doesn't have a way of requesting async-level tables in .debug_frame...?

You might be able to use linker scripts to force this data into flash. I don't think that should cause any harm.

@marnix
Author

marnix commented Jan 31, 2025

@alexrp You write that "that is why you see this data in RAM." But I see these sections in RAM only in Debug mode (even though there is ample room for those 48 bytes in the flash region). It is in flash, where I want it, in all Release... modes.

That seems a strange inconsistency?


The good news is that I think I've hit on the actual fix in MicroZig, which is to add

  .ARM.extab : {
      *(.ARM.extab* .gnu.linkonce.armextab.*)
  } >flash0

to the generated linker.ld file (see the initial description of this issue). I don't know what that means, but it works.


So perhaps the bug in Zig here is that if the linker script doesn't mention anything about .ARM.extab, then Zig's linker should fail, instead of dumping these sections in an arbitrary MEMORY region?
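The kind of check suggested here can be sketched outside the linker. The Python below is only an approximation under several assumptions: it uses `fnmatch` as a stand-in for linker-script wildcards, it treats the names printed by readelf as if they were input section names, and it keeps the truncated `.ARM.extab.t[...]` name exactly as printed. `orphans` is a hypothetical helper, not a Zig or LLD feature.

```python
from fnmatch import fnmatchcase

# Input-section patterns from the original linker.ld in the description.
ORIGINAL_PATTERNS = [
    "microzig_flash_start", ".text*",
    ".ARM.exidx*", ".gnu.linkonce.armexidx.*",
    ".sdata*", ".data*", ".rodata*",
    ".bss*", ".sbss*",
]

# The same list plus the two patterns from the fix quoted above.
FIXED_PATTERNS = ORIGINAL_PATTERNS + [".ARM.extab*", ".gnu.linkonce.armextab.*"]

# Section names as printed by readelf -S in Debug mode (assumption: used here
# in place of the real input section names, which the thread does not show).
SECTION_NAMES = [".text", ".ARM.exidx", ".data", ".ARM.extab.t[...]", ".bss"]


def orphans(names, patterns):
    """Return the names that no pattern in the script matches."""
    return [n for n in names if not any(fnmatchcase(n, p) for p in patterns)]


print("original script:", orphans(SECTION_NAMES, ORIGINAL_PATTERNS))
print("with the fix:   ", orphans(SECTION_NAMES, FIXED_PATTERNS))
```

The sketch only identifies sections that the script never mentions. Where a real linker then places them is the linker's decision, and that is the open question in this issue.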


Regardless of the answer to that question, as far as I can see this behavior is not a regression, and not something that should hold up the 0.14 release.

marnix added a commit to marnix/microzig that referenced this issue Feb 1, 2025
For ARM-based embedded firmware this requires
explicit declaration of `ARM.extab` sections in linker.d,
otherwise the exception tables can incorrectly end up in RAM.
(See ziglang/zig#22685.)
@alexrp
Member

alexrp commented Feb 1, 2025

Sorry, I should have been more precise with my statement above. I'm not actually sure why it only gets put in RAM in Debug mode; I would actually expect it to get put in RAM in all modes.

If possible, perhaps you can try building without MicroZig's linker script just to see what Zig/LLD do on their own in different modes?

mattnite pushed a commit to ZigEmbeddedGroup/microzig that referenced this issue Feb 2, 2025
For ARM-based embedded firmware this requires
explicit declaration of `ARM.extab` sections in linker.d,
otherwise the exception tables can incorrectly end up in RAM.
(See ziglang/zig#22685.)
@marnix
Author

marnix commented Feb 4, 2025

@alexrp "try building without MicroZig's linker script". I'm not sure what that would look like? Obviously having no linker.ld script would make things fail:

  • To create an ELF binary, the linker needs at least the MEMORY description, to know where to put the code (the address of flash, in this case).
  • MicroZig relies on various microzig_..._start/end symbols defined in linker.ld, and the build fails if those are not present.

I tried removing some things from linker.ld (quoted in the description above), but none of the variations I tried worked, that is, I couldn't even get Zig to produce an ELF binary...

As an ELF/linker noob (and embedded noob), I'm probably missing something here. Feel free to enlighten me! And let me know if there is some other experiment I can do.

@alexrp
Member

alexrp commented Feb 4, 2025

What I'm getting at is that the compiler is probably doing something subtly different about these sections depending on build mode. I figure we should be able to more easily spot what that something is if we remove MicroZig from the equation. You could try just building an empty program with a _start function for the same target and see what differences, if any, exist in the sections based on build mode.

@marnix
Author

marnix commented Feb 4, 2025

@alexrp I've asked others in the MicroZig community to look at that, because I don't know enough to do such an experiment, unfortunately...

Alternatively, reproducing the issue is just a git clone, a git checkout, and a zig build away. :-)

@Grazfather

Grazfather commented Feb 21, 2025

I have a similar issue, but I am seeing it on v0.13.0

I can reproduce sometimes even with release=fast.

When I uncomment a particular function call, my device fails to boot.

Turns out that this is due to the inclusion of the .ARM.extab.t section. Its inclusion is shifting the vector table forward by 12 bytes and so the device can't even leave reset correctly.

# readelf -S hdgood.elf

There are 21 section headers, starting at offset 0x10b3e8:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .boot2            PROGBITS        10000000 010000 000100 00   A  0   0  1
  [ 2] .text             PROGBITS        10000100 010100 0050c8 00 AXMS  0   0  4
  [ 3] .ARM.exidx        ARM_EXIDX       100051c8 0151c8 0000d8 00  AL  2   0  4
  [ 4] .data             PROGBITS        20000000 020000 000002 00  WA  0   0  1
  [ 5] .bss              NOBITS          20000002 020002 000002 00  WA  0   0  1
  [ 6] .flash_end        PROGBITS        100052a2 020002 000000 00  WA  0   0  1
  [ 7] .debug_loc        PROGBITS        00000000 020002 06fb50 00      0   0  1
  [ 8] .debug_abbrev     PROGBITS        00000000 08fb52 00085d 00      0   0  1
  [ 9] .debug_info       PROGBITS        00000000 0903af 02f8cc 00      0   0  1
  [10] .debug_ranges     PROGBITS        00000000 0bfc7b 009770 00      0   0  1
  [11] .debug_str        PROGBITS        00000000 0c93eb 013113 01  MS  0   0  1
  [12] .debug_pubnames   PROGBITS        00000000 0dc4fe 0041f1 00      0   0  1
  [13] .debug_pubtypes   PROGBITS        00000000 0e06ef 0085a0 00      0   0  1
  [14] .ARM.attributes   ARM_ATTRIBUTES  00000000 0e8c8f 000041 00      0   0  1
  [15] .debug_frame      PROGBITS        00000000 0e8cd0 003124 00      0   0  4
  [16] .debug_line       PROGBITS        00000000 0ebdf4 01e48e 00      0   0  1
  [17] .comment          PROGBITS        00000000 10a282 000013 01  MS  0   0  1
  [18] .symtab           SYMTAB          00000000 10a298 0008d0 10     20 117  4
  [19] .shstrtab         STRTAB          00000000 10ab68 0000d9 00      0   0  1
  [20] .strtab           STRTAB          00000000 10ac41 0007a4 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  D (mbind), y (purecode), p (processor specific)

# readelf -S hdbad.elf
There are 22 section headers, starting at offset 0x10f1ec:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .boot2            PROGBITS        10000000 010000 000100 00   A  0   0  1
  [ 2] .ARM.extab.t[...] PROGBITS        10000100 010100 00000c 00   A  0   0  4
  [ 3] .text             PROGBITS        1000010c 01010c 0058f4 00 AXMS  0   0  4
  [ 4] .ARM.exidx        ARM_EXIDX       10005a00 015a00 0000f0 00  AL  3   0  4
  [ 5] .data             PROGBITS        20000000 020000 000002 00  WA  0   0  1
  [ 6] .bss              NOBITS          20000002 020002 000002 00  WA  0   0  1
  [ 7] .flash_end        PROGBITS        10005af2 020002 000000 00  WA  0   0  1
  [ 8] .debug_loc        PROGBITS        00000000 020002 07101c 00      0   0  1
  [ 9] .debug_abbrev     PROGBITS        00000000 09101e 000892 00      0   0  1
  [10] .debug_info       PROGBITS        00000000 0918b0 030305 00      0   0  1
  [11] .debug_ranges     PROGBITS        00000000 0c1bb5 009b30 00      0   0  1
  [12] .debug_str        PROGBITS        00000000 0cb6e5 0136a7 01  MS  0   0  1
  [13] .debug_pubnames   PROGBITS        00000000 0ded8c 004313 00      0   0  1
  [14] .debug_pubtypes   PROGBITS        00000000 0e309f 0086c0 00      0   0  1
  [15] .ARM.attributes   ARM_ATTRIBUTES  00000000 0eb75f 000041 00      0   0  1
  [16] .debug_frame      PROGBITS        00000000 0eb7a0 003268 00      0   0  4
  [17] .debug_line       PROGBITS        00000000 0eea08 01ede7 00      0   0  1
  [18] .comment          PROGBITS        00000000 10d7ef 000013 01  MS  0   0  1
  [19] .symtab           SYMTAB          00000000 10d804 000b20 10     21 152  4
  [20] .shstrtab         STRTAB          00000000 10e324 0000f7 00      0   0  1
  [21] .strtab           STRTAB          00000000 10e41b 000dcf 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  D (mbind), y (purecode), p (processor specific)

The reset_vector is at the start of .text, which is at 0x10000100 in the good case but at 0x1000010c in the bad case, where including this function causes the .ARM.extab.t section to get linked in.
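The 12-byte shift can be read directly off the two listings. A minimal Python check, with the values copied by hand from the readelf -S output above and illustrative variable names:

```python
# Values from the hdgood.elf and hdbad.elf listings above.
TEXT_ADDR_GOOD = 0x10000100
TEXT_ADDR_BAD = 0x1000010C
EXTAB_ADDR_BAD = 0x10000100
EXTAB_SIZE = 0x0C

shift = TEXT_ADDR_BAD - TEXT_ADDR_GOOD
print(f".text moved by {shift} bytes")
print(f"equal to the .ARM.extab size: {shift == EXTAB_SIZE}")
print(f".ARM.extab sits at the old .text start: {EXTAB_ADDR_BAD == TEXT_ADDR_GOOD}")
```

So the extra section sits directly after .boot2, at the address where .text starts in the good build, and pushes everything in .text back by exactly its own size.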
