NativeAOT staticlib crashes with SIGSEGV inside RhpNewArray when linked with -dead_strip #96663

Closed
anatawa12 opened this issue Jan 9, 2024 · 13 comments · Fixed by #103039
Labels: area-NativeAOT-coreclr, in-pr, os-mac-os-x

@anatawa12
Contributor

anatawa12 commented Jan 9, 2024

Description

When I link a NativeAOT static library with -Wl,-dead_strip, calling the exported C# function crashes with SIGSEGV (segmentation fault) inside RhpNewArray.

Reproduction Steps

test.cs

using System.Runtime.InteropServices;

public class Class1
{
    [UnmanagedCallersOnly(EntryPoint = "add_dotnet")]
    public static int Add(int a, int b)
    {
        return a + b;
    }
}

test.csproj

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <TargetFramework>net8.0</TargetFramework>
    <RootNamespace>test</RootNamespace>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>

    <PublishAot>true</PublishAot>
    <PublishTrimmed>true</PublishTrimmed>
    <NativeLib>Static</NativeLib>
  </PropertyGroup>

  <ItemGroup>
  </ItemGroup>

</Project>

test.c

#include <stdio.h>

int add_dotnet(int, int);

int main(void) {
  printf("%d\n", add_dotnet(1, 2));
}

build script (bash)

dotnet publish -r osx-arm64 -c Release

cc \
  -o test \
  test.c \
  ./bin/release/net8.0/osx-arm64/publish/test.a \
  "$HOME"/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/8.0.0/framework/libSystem.Native.a \
  "$HOME"/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/8.0.0/sdk/libRuntime.ServerGC.a \
  "$HOME"/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/8.0.0/sdk/libstdc++compat.a \
  "$HOME"/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/8.0.0/framework/libSystem.Globalization.Native.a \
  "$HOME"/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/8.0.0/sdk/libeventpipe-disabled.a \
  "$HOME"/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/8.0.0/sdk/libbootstrapperdll.o \
  -framework Foundation \
  -Wl,-dead_strip

Expected behavior

The program prints 3 without crashing.

Actual behavior

The program crashes with a segmentation fault inside RhpNewArray.

Here is the lldb session showing the faulting frame and the registers:
$ lldb ./test 
(lldb) target create "./test"
Current executable set to '/Users/anatawa12/IdeaProjects/vrc-get/vrc-get-litedb/cs-test/test' (arm64).
(lldb) r
Process 42160 launched: '/Users/anatawa12/IdeaProjects/vrc-get/vrc-get-litedb/cs-test/test' (arm64)
Process 42160 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSEGV
    frame #0: 0x000000010004ed9c test`RhpNewArray at AllocFast.S:219
Target 0: (test) stopped.
warning: This version of LLDB has no plugin for the language "assembler". Inspection of frame variables will be limited.
(lldb) register read
General Purpose Registers:
        x0 = 0x00000001000e1110  test`vtable for __Array<S_P_CoreLib_Internal_Runtime_TypeManagerHandle>
        x1 = 0x0000000000000000
        x2 = 0x0000000000000000
        x3 = 0x00006000007a8008
        x4 = 0x000000000000000e
        x5 = 0x000000003f852438
        x6 = 0x00006000039a0060
        x7 = 0x0000000000000d70
        x8 = 0x00006000018ac0f0
        x9 = 0x0000000000000000
       x10 = 0x0000000000000000
       x11 = 0x0000000000000010
       x12 = 0x0000000000000000
       x13 = 0x00000000fffffc06
       x14 = 0x00000000000007fb
       x15 = 0x00000000815fcffb
       x16 = 0x0000000000000006
       x17 = 0x00000000000003f9
       x18 = 0x0000000000000000
       x19 = 0x0000000000000000
       x20 = 0x00000001000d62b0  test`_ZN18RedhawkGCInterface25tls_pLastAllocationEETypeE$tlv$init
       x21 = 0x0000000100000000  test`_mh_execute_header
       x22 = 0x00000001000cd090  test`c_classlibFunctions
       x23 = 0x000000000000000e
       x24 = 0x0000000000000000
       x25 = 0x0000000000000000
       x26 = 0x0000000000000000
       x27 = 0x0000000000000000
       x28 = 0x0000000000000000
        fp = 0x000000016fdfee00
        lr = 0x00000001000a8ff8  test`fram0_S_P_CoreLib_Internal_Runtime_CompilerHelpers_StartupCodeHelpers__CreateTypeManagers + 312
        sp = 0x000000016fdfed90
        pc = 0x000000010004ed9c  test`RhpNewArray + 100
      cpsr = 0x60001000

(lldb) 

Regression?

I have not tested any other version of .NET, so I don't know whether this is a regression.

It might be one: the Medium article on using CoreRT that is linked from the NativeLibrary sample uses Rust and does not mention dead_strip. However, if this problem is OS-specific, that article says nothing either way.

Known Workarounds

Do not pass -dead_strip to the linker.

Configuration

$ dotnet --version
8.0.100
$ neofetch --off
[email protected] 
---------------------- 
OS: macOS 14.2.1 23C71 arm64 
Host: MacBookPro18,4 
Kernel: 23.2.0 
Uptime: 13 days, 3 hours, 32 mins 
Packages: 265 (brew) 
Shell: zsh 5.9 
Resolution: 3360x1890, 1800x1169, 1280x800, 1920x1080 
DE: Aqua 
WM: Quartz Compositor 
WM Theme: Blue (Light) 
Terminal: rustrover 
CPU: Apple M1 Max 
GPU: Apple M1 Max 
Memory: 10511MiB / 65536MiB 

Cross-compiling for x64 (x86_64) and running under Rosetta reproduces the same crash, so the problem is not architecture-specific.
I have not tested any other operating system, so I don't know whether it is OS-specific.

Other information

I ran into this problem while trying to link C# into a Rust program.
Rust passes -dead_strip to the linker by default.
Rust users can avoid -dead_strip by setting rustflags = ["-C", "link-dead-code"], but this increases the size of the final binary.

According to the NativeLibrary sample, -Wl,-u,_NativeAOT_StaticInitialization has to be passed when building with .NET 7.
My project targets .NET 8, so I don't think this is related; when I passed that option to the compiler anyway, the linker reported that _NativeAOT_StaticInitialization is undefined.

@ghost added the untriaged label Jan 9, 2024
@ghost

ghost commented Jan 9, 2024

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

@MichalStrehovsky
Member

Another dead_strip issue here: #88032

@MichalStrehovsky added the os-mac-os-x label Jan 9, 2024
@anatawa12
Contributor Author

I found that the contents of __ZTV55__Array<S_P_CoreLib_Internal_Runtime_TypeManagerHandle> are all zeros in the linked binary.

(The following is the lldb console, stopped at the start of RhpNewArray.)

(lldb) memory read $x0
0x1028b1e00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
0x1028b1e10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................

@ivanpovazan self-assigned this Jan 9, 2024
@ivanpovazan
Member

@MichalStrehovsky I assigned this to myself as I will be working on the iOS issue #88032. Feel free to change this if needed.

@anatawa12
Contributor Author

I found another workaround(?) for this problem.

I can make it work by setting the S_ATTR_NO_DEAD_STRIP flag on the hydrated and __modules sections of the object file; with that flag set, the problem does not occur (at least for my small program).
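For readers who want to see what setting that flag involves, here is a minimal sketch in C. It is not the code used in this thread. It assumes the input is a single thin, little-endian, 64-bit Mach-O object already loaded into memory (a static archive such as test.a would first need its object member extracted), and for simplicity it marks every section rather than only the two named above. The function name `mark_sections_no_dead_strip` and the helpers are hypothetical; the constants and structure offsets mirror `<mach-o/loader.h>`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as defined in <mach-o/loader.h>, repeated so the sketch builds anywhere. */
#define MH_MAGIC_64          0xfeedfacfu
#define LC_SEGMENT_64        0x19u
#define S_ATTR_NO_DEAD_STRIP 0x10000000u

/* Sizes and field offsets of the 64-bit Mach-O structures. */
enum {
    MACH_HEADER_64_SIZE = 32, /* sizeof(struct mach_header_64) */
    SEGMENT_CMD_64_SIZE = 72, /* sizeof(struct segment_command_64) */
    SECTION_64_SIZE     = 80, /* sizeof(struct section_64) */
    HDR_NCMDS_OFF       = 16, /* mach_header_64.ncmds */
    SEG_NSECTS_OFF      = 64, /* segment_command_64.nsects */
    SECT_FLAGS_OFF      = 64  /* section_64.flags */
};

/* Read a little-endian 32-bit value. */
static uint32_t rd32(const uint8_t *p) {
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Write a little-endian 32-bit value. */
static void wr32(uint8_t *p, uint32_t v) {
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/*
 * Hypothetical helper: OR S_ATTR_NO_DEAD_STRIP into the flags of every
 * section of every LC_SEGMENT_64 load command in the buffer.
 * Returns the number of sections patched, or -1 if the buffer does not
 * look like a well-formed thin 64-bit little-endian Mach-O file.
 */
int mark_sections_no_dead_strip(uint8_t *buf, size_t len) {
    assert(buf != NULL);
    if (len < MACH_HEADER_64_SIZE || rd32(buf) != MH_MAGIC_64)
        return -1;

    uint32_t ncmds = rd32(buf + HDR_NCMDS_OFF);
    size_t off = MACH_HEADER_64_SIZE;
    int patched = 0;

    for (uint32_t i = 0; i < ncmds; i++) {
        if (len - off < 8)
            return -1; /* no room for cmd + cmdsize */
        uint32_t cmd = rd32(buf + off);
        uint32_t cmdsize = rd32(buf + off + 4);
        if (cmdsize < 8 || cmdsize > len - off)
            return -1; /* load command runs past the buffer */

        if (cmd == LC_SEGMENT_64) {
            if (cmdsize < SEGMENT_CMD_64_SIZE)
                return -1;
            uint32_t nsects = rd32(buf + off + SEG_NSECTS_OFF);
            if (nsects > (cmdsize - SEGMENT_CMD_64_SIZE) / SECTION_64_SIZE)
                return -1; /* section table does not fit in the command */
            for (uint32_t s = 0; s < nsects; s++) {
                uint8_t *flags = buf + off + SEGMENT_CMD_64_SIZE +
                                 (size_t)s * SECTION_64_SIZE + SECT_FLAGS_OFF;
                wr32(flags, rd32(flags) | S_ATTR_NO_DEAD_STRIP);
                patched++;
            }
        }
        off += cmdsize;
    }
    return patched;
}
```

A caller would read the object file into a buffer, call the function, and write the buffer back before handing the file to the linker. Restricting the patch to particular sections would mean comparing the 16-byte `sectname` field at the start of each section entry before setting the flag.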

@agocke added this to the Future milestone Jan 23, 2024
@ghost removed the untriaged label Jan 23, 2024
@JCash

JCash commented Jan 25, 2024

Hi!

Is there a workaround that I can use as an end user? (macOS, clang #97501)
I don't need it for production (yet) but it would be great to be able to work on other scaffolding for my task in the meantime.
Regards,
Mathias

@anatawa12
Contributor Author

In my project, disabling dehydration and adding the no-dead-strip flag (S_ATTR_NO_DEAD_STRIP) to all sections of the NativeAOT object file works around this problem.

https://github.com/anatawa12/vrc-get/blob/b92ff37ca31892a9bb4c0c5fc0364b0959930c6f/vrc-get-litedb/build.rs#L215

https://github.com/anatawa12/vrc-get/blob/b92ff37ca31892a9bb4c0c5fc0364b0959930c6f/vrc-get-litedb/build.rs#L276_L301

@JCash

JCash commented Feb 2, 2024

I've tried your workaround, and it works.
The size of my tiny exe went from 4.3 MB to 4.1 MB.
Hopefully this ticket will allow for even further optimizations.

@JCash

JCash commented May 30, 2024

Hi @ivanpovazan !
Is there any ETA for this fix?
I'd rather use an experimental .NET version than build on the workaround of patching the generated library files.

@ivanpovazan
Member

@JCash sorry I was a bit side-tracked recently.
I will take a look again at this and respond here as soon as I have an update.
Thank you for understanding.

@dotnet-policy-service bot added the in-pr label Jun 4, 2024
@JCash

JCash commented Jun 26, 2024

Q: Will this be backported to dotnet 8? And if so, in which version/timeframe would that happen?

Regards,
Mathias

@MichalStrehovsky
Member

Q: Will this be backported to dotnet 8? And if so, in which version/timeframe would that happen?

Correct me if I'm wrong but -dead_strip is an optional switch and adding it to the command line improves size. The possible workaround is to not pass the switch.

If the above is correct, I don't think it would pass the bar for servicing. The servicing bar is pretty high.

@JCash

JCash commented Jun 26, 2024

Sure. However as we provide a service to others, we can't remove the flag for them.

I needed to ask the question, as I'm new to the .NET ecosystem.
I currently have four outstanding issues I'm waiting on, one of which has been backported but is awaiting the next release.
That's why I asked.

I'll sync with the team in our private chat as well, to see what the recommended course of action is for us.

@github-actions bot locked and limited conversation to collaborators Jul 26, 2024