Reloc failures with NativeAOT on Apple Silicon #67232

Closed
am11 opened this issue Mar 28, 2022 · 103 comments · Fixed by #75264

Comments

@am11
Member

am11 commented Mar 28, 2022

I am trying to enable NativeAOT on OSX arm64. With this patch main...am11:feature/nativeaot/osx-arm64 (tested with both @GOTPAGE and @PAGE assembler directives), it builds the nupkg. Consuming that package results in the following errors during the ilc step:

# with `<add key="TestSource" value="/Users/am11/projects/runtime/artifacts/packages/Release/Shipping" />`
# in NuGet.config
$ dotnet nuget locals all --clear && rm -rf obj bin && dotnet publish --use-current-runtime -v:diag ...
... snip ...
21:06:05.007   1:7>Target "IlcCompile: (TargetId:181)" in file "/Users/am11/.nuget/packages/microsoft.dotnet.ilcompiler/7.0.0-dev/build/Microsoft.NETCore.Native.targets" from project "/Users/am11/projects/naot1/naot1.csproj" (target "LinkNative" depends on it):
                   Building target "IlcCompile" completely.
                   Output file "obj/release/net7.0/osx-arm64/native/naot1.o" does not exist.
                   Task "Message" skipped, due to false condition; ($(_BuildingInCompatibleMode) != 'true') was evaluated as (true != 'true').
                   Task "Message" (TaskId:126)
                     Task Parameter:Text=Generating compatible native code. To optimize for size or speed, visit https://aka.ms/OptimizeCoreRT (TaskId:126)
                     Task Parameter:Importance=high (TaskId:126)
                     Generating compatible native code. To optimize for size or speed, visit https://aka.ms/OptimizeCoreRT (TaskId:126)
                   Done executing task "Message". (TaskId:126)
                   Task "Exec" (TaskId:127)
                     Task Parameter:Command="/Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/tools/ilc" @"obj/release/net7.0/osx-arm64/native/naot1.ilc.rsp" (TaskId:127)
                     "/Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/tools/ilc" @"obj/release/net7.0/osx-arm64/native/naot1.ilc.rsp" (TaskId:127)
                     <unknown>:0: error: ADR/ADRP relocations must be GOT relative (TaskId:127)
                     <unknown>:0: error: unknown AArch64 fixup kind! (TaskId:127)
                     <unknown>:0: error: unknown AArch64 fixup kind! (TaskId:127)
                     <unknown>:0: error: fixup value out of range (TaskId:127)
                     <unknown>:0: error: ADR/ADRP relocations must be GOT relative (TaskId:127)
                     <unknown>:0: error: unknown AArch64 fixup kind! (TaskId:127)
                     <unknown>:0: error: unknown AArch64 fixup kind! (TaskId:127)
                     <unknown>:0: error: fixup value out of range (TaskId:127)
... repeats 1000s of times ...

The errors are emitted somewhere after the objwriter has succeeded and before the clang command is executed. While the ilc task does not fail, MSBuild fails on the clang step:

                 Set Property: _IgnoreLinkerWarnings=false
                   Set Property: _IgnoreLinkerWarnings=true
                   Task "Exec" (TaskId:129)
                     Task Parameter:IgnoreStandardErrorWarningFormat=True (TaskId:129)
                     Task Parameter:Command=clang "obj/release/net7.0/osx-arm64/native/naot1.o" -o "bin/release/net7.0/osx-arm64/native/naot1" /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/sdk/libbootstrapper.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/sdk/libRuntime.WorkstationGC.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.Native.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.Globalization.Native.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.IO.Compression.Native.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.Net.Security.Native.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.Security.Cryptography.Native.Apple.a -g -Wl,-rpath,'@executable_path' -lstdc++ -ldl -lm -lz -licucore -framework CoreFoundation -framework Foundation -framework Security -framework GSS (TaskId:129)
                     clang "obj/release/net7.0/osx-arm64/native/naot1.o" -o "bin/release/net7.0/osx-arm64/native/naot1" /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/sdk/libbootstrapper.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/sdk/libRuntime.WorkstationGC.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.Native.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.Globalization.Native.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.IO.Compression.Native.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.Net.Security.Native.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.Security.Cryptography.Native.Apple.a -g -Wl,-rpath,'@executable_path' -lstdc++ -ldl -lm -lz -licucore -framework CoreFoundation -framework Foundation -framework Security -framework GSS (TaskId:129)
                     ld: malformed __LD,__compact_unwind section, bad length file 'obj/release/net7.0/osx-arm64/native/naot1.o' (TaskId:129)
                     clang: error: linker command failed with exit code 1 (use -v to see invocation) (TaskId:129)
21:06:12.873   1:7>/Users/am11/.nuget/packages/microsoft.dotnet.ilcompiler/7.0.0-dev/build/Microsoft.NETCore.Native.targets(337,5): error MSB3073: The command "clang "obj/release/net7.0/osx-arm64/native/naot1.o" -o "bin/release/net7.0/osx-arm64/native/naot1" /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/sdk/libbootstrapper.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/sdk/libRuntime.WorkstationGC.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.Native.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.Globalization.Native.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.IO.Compression.Native.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.Net.Security.Native.a /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.Security.Cryptography.Native.Apple.a -g -Wl,-rpath,'@executable_path' -lstdc++ -ldl -lm -lz -licucore -framework CoreFoundation -framework Foundation -framework Security -framework GSS" exited with code 1. [/Users/am11/projects/naot1/naot1.csproj]
                   Done executing task "Exec" -- FAILED. (TaskId:129)
21:06:12.873   1:7>Done building target "LinkNative" in project "naot1.csproj" -- FAILED.: (TargetId:182)

With objdump, that __LD,__compact_unwind section looks like:

Disassembly of section __LD,__compact_unwind:

00000000003b2858 <ltmp8>:
  3b2858: 40 4b 00 00   udf     #19264
  3b285c: 00 00 00 00   udf     #0
  3b2860: 74 00 00 00   udf     #116
  3b2864: 00 00 00 03   <unknown>
                ...
  3b2874: c0 4b 00 00   udf     #19392
  3b2878: 00 00 00 00   udf     #0
  3b287c: 74 00 00 00   udf     #116
  3b2880: 00 00 00 03   <unknown>
                ...
  3b2890: 40 4c 00 00   udf     #19520
  3b2894: 00 00 00 00   udf     #0
  3b2898: 74 00 00 00   udf     #116
  3b289c: 00 00 00 03   <unknown>
                ...
  3b28ac: c0 4c 00 00   udf     #19648
  3b28b0: 00 00 00 00   udf     #0
  3b28b4: 74 00 00 00   udf     #116
  3b28b8: 00 00 00 03   <unknown>
... repeats ...
@MichalStrehovsky
Member

Ah, so those messages are still generated by the object writer in ILC.

For example here: https://github.com/dotnet/llvm-project/blob/f1120a92d05f1c57e75af7d16504012570ef3409/llvm/lib/Target/AArch64/MCTargetDesc/AArch64MachObjectWriter.cpp#L102-L103.

Looks like we need to decide what kind of relocation to generate when we're generating it.

@am11
Member Author

am11 commented Mar 29, 2022

@MichalStrehovsky, thanks for the pointers. The managed part of the objwriter API currently does not support VariantKind, so I updated objwriter: dotnet/llvm-project#185. With the current state of that PR (+ main...am11:feature/nativeaot/osx-arm64), these two ilc errors have vanished:

error: fixup value out of range
error: ADR/ADRP relocations must be GOT relative 

but this one remains:

error: unknown AArch64 fixup kind!

It comes from the false return at https://github.com/dotnet/llvm-project/blob/f1120a92d05f1c57e75af7d16504012570ef3409/llvm/lib/Target/AArch64/MCTargetDesc/AArch64MachObjectWriter.cpp#L59. I noticed that in objwriter we do not explicitly create an MCFixup for Mach-O; it looks like those are classified by AArch64AsmBackend. 💭

@MichalStrehovsky
Member

Really nice progress!

What is the reloc it's complaining about? Is it FK_PCRel_4 by any chance? We add special handling to it in our LLVM patch: dotnet/llvm-project@67f5503. Maybe having it take the same path as FK_Data_4 would work?

You'll know if the reloc got messed up if it ends up pointing at garbage after linking.

@am11
Member Author

am11 commented Mar 30, 2022

Those were indeed all FK_PCRel_4. 😄

Pushed a commit to treat it as FK_Data_4. With that, the ilc MSBuild task completes without producing any diagnostics.

The clang task, however, continues to fail (still complaining about __LD,__compact_unwind section, bad length file). Interestingly, the __LD,__compact_unwind section no longer shows up when I run objdump -d (as it did previously), but it does show up with objdump --full-contents -d.

Running ld command with -v shows nothing useful:

 % "/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ld" \
       -demangle -lto_library /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/libLTO.dylib \
       -dynamic -arch arm64 -platform_version macos 12.0.0 12.1 \
       -syslibroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk \
       -o bin/release/net7.0/osx-arm64/native/naot1 \
       -L/usr/local/lib obj/release/net7.0/osx-arm64/native/naot1.o \
       /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/sdk/libbootstrapper.a \
       /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/sdk/libRuntime.WorkstationGC.a \
       /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.Native.a \
       /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.Globalization.Native.a \
       /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.IO.Compression.Native.a \
       /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.Net.Security.Native.a \
       /Users/am11/.nuget/packages/runtime.osx-arm64.microsoft.dotnet.ilcompiler/7.0.0-dev/framework/libSystem.Security.Cryptography.Native.Apple.a \
       -rpath @executable_path -lc++ -ldl -lm -lz -licucore \
       -framework CoreFoundation -framework Foundation -framework Security -framework GSS \
       -lSystem /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/13.0.0/lib/darwin/libclang_rt.osx.a \
       -v

@(#)PROGRAM:ld  PROJECT:ld64-711
BUILD 21:57:24 Nov 17 2021
configured to support archs: armv6 armv7 armv7s arm64 arm64e arm64_32 i386 x86_64 x86_64h armv6m armv7k armv7m armv7em
Library search paths:
	/usr/local/lib
	/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/lib
Framework search paths:
	/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/System/Library/Frameworks/
ld: malformed __LD,__compact_unwind section, bad length file 'obj/release/net7.0/osx-arm64/native/naot1.o'

@MichalStrehovsky
Member

Digging in Apple's source code, the error seems to be:

https://github.com/apple-oss-distributions/ld64/blob/dbf8f7feb5579761f1623b004bd468bdea7c6225/src/ld/parsers/macho_relocatable_file.cpp#L5631-L5632

The size of the __compact_unwind section is not divisible by the size of an unwind entry. That's odd because we don't generate the Apple weird thing; we generate DWARF CFI.

However, looking at LLVM source code, I think this is kicking in:

https://github.com/dotnet/llvm-project/blob/f1120a92d05f1c57e75af7d16504012570ef3409/llvm/lib/MC/MCObjectFileInfo.cpp#L30-L32

And LLVM does generate something on our behalf. Probably broken from the sound of it.

I would dig around that - can we still make do without a __compact_unwind section? If it's not present in the executable, maybe ld would still do the right thing and convert it from CFI to the compact unwind scheme for us.

Maybe the right thing would be to start generating compact unwinding because Apple tends to unceremoniously cut off things they don't like anymore after a couple years of supporting both the thing they stopped liking and the new shiny thing. Unwinding codes are currently generated in RyuJIT.
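
As a rough illustration of the length check that ld64 performs, the sketch below splits a __LD,__compact_unwind payload into fixed-size records. It assumes the 32-byte entry layout used for 64-bit object files (8-byte function start, 4-byte length, 4-byte encoding, 8-byte personality, 8-byte LSDA); the function name and the sample bytes are made up for illustration. In the objdump output earlier in the thread the records start 0x1c (28) bytes apart, which would be consistent with a section length that is no longer a multiple of 32.

```python
import struct

# Assumed layout of one 64-bit compact unwind entry in an object file:
# function start (8), function length (4), encoding (4), personality (8), LSDA (8).
ENTRY_FORMAT = "<QIIQQ"
ENTRY_SIZE = struct.calcsize(ENTRY_FORMAT)  # 32 bytes


def parse_compact_unwind(section: bytes):
    """Split a __compact_unwind payload into entries, or raise on a bad length."""
    if len(section) % ENTRY_SIZE != 0:
        raise ValueError(
            f"bad length: {len(section)} is not a multiple of {ENTRY_SIZE}"
        )
    return [
        struct.unpack_from(ENTRY_FORMAT, section, offset)
        for offset in range(0, len(section), ENTRY_SIZE)
    ]


# A well-formed record using values seen in the dump (start 0x4b40, length 0x74,
# encoding 0x03000000), padded out to the full 32 bytes.
good = struct.pack(ENTRY_FORMAT, 0x4B40, 0x74, 0x03000000, 0, 0)
entries = parse_compact_unwind(good)

# Two records emitted 28 bytes apart, as in the dump, fail the length check.
short_record = good[:28]
try:
    parse_compact_unwind(short_record * 2)
    rejected = False
except ValueError:
    rejected = True
```

This only models the divisibility test; it says nothing about which field was emitted too short.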

@am11
Member Author

am11 commented Mar 30, 2022

Thanks, and nice detective work finding those links! (I was only searching across org:dotnet 😅)

Disabling compact unwind (dotnet/llvm-project@090a465) revealed some missing-symbol errors. Those error messages were generous enough to point me to the missing C_FUNC(): am11@3bcd7b6c (I previously fixed a couple of those in am11@b8cc922 which were failing the runtime build, but the new ones only show up when consuming the Microsoft.DotNet.ILCompiler package 🤷)

Looks like we are getting there. Next error is:

  ld: in section __DATA,.corert_eh_table reloc 0: unknown relocation type 15 file 'obj/release/net7.0/osx-arm64/native/naot1.o'

Apparently, reloc 15 is assigned to PPC_RELOC_LOCAL_SECTDIFF (suggesting something is off-kilter.. 👀).
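
For context on why type 15 looks wrong here: in a Mach-O relocation_info record the type occupies the top four bits of the second 32-bit word, and the ARM64 types listed in <mach-o/arm64/reloc.h> run from 0 (ARM64_RELOC_UNSIGNED) to 10 (ARM64_RELOC_ADDEND). The sketch below decodes such a record. The helper names are hypothetical, and the table is an assumption that covers only types 0 through 10 (newer SDKs may define more).

```python
import struct

# ARM64 Mach-O relocation types as listed in <mach-o/arm64/reloc.h>.
# Assumption: this table covers types 0-10 only; newer SDKs may define more.
ARM64_RELOC_NAMES = [
    "ARM64_RELOC_UNSIGNED",
    "ARM64_RELOC_SUBTRACTOR",
    "ARM64_RELOC_BRANCH26",
    "ARM64_RELOC_PAGE21",
    "ARM64_RELOC_PAGEOFF12",
    "ARM64_RELOC_GOT_LOAD_PAGE21",
    "ARM64_RELOC_GOT_LOAD_PAGEOFF12",
    "ARM64_RELOC_POINTER_TO_GOT",
    "ARM64_RELOC_TLVP_LOAD_PAGE21",
    "ARM64_RELOC_TLVP_LOAD_PAGEOFF12",
    "ARM64_RELOC_ADDEND",
]


def decode_relocation(record: bytes) -> dict:
    """Decode one 8-byte little-endian relocation_info record."""
    address, packed = struct.unpack("<iI", record)
    r_type = (packed >> 28) & 0xF
    known = r_type < len(ARM64_RELOC_NAMES)
    return {
        "address": address,
        "symbolnum": packed & 0xFFFFFF,
        "pcrel": (packed >> 24) & 0x1,
        "length": (packed >> 25) & 0x3,  # log2 of the fixup size in bytes
        "extern": (packed >> 27) & 0x1,
        "type": r_type,
        "name": ARM64_RELOC_NAMES[r_type] if known else None,
    }


def encode_relocation(address, symbolnum, pcrel, length, extern, r_type) -> bytes:
    """Hypothetical helper that builds a record for the examples below."""
    packed = (
        (symbolnum & 0xFFFFFF)
        | (pcrel << 24)
        | (length << 25)
        | (extern << 27)
        | (r_type << 28)
    )
    return struct.pack("<iI", address, packed)


# A 64-bit absolute pointer relocation decodes to a known ARM64 type ...
ok = decode_relocation(encode_relocation(0, 5, 0, 3, 1, 0))
# ... while type 15 has no name in the ARM64 table.
bad = decode_relocation(encode_relocation(0, 5, 0, 2, 1, 15))
```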

@MichalStrehovsky
Member

.corert_eh_table uses the 32-bit relative relocs that would be translated to FK_PCRel_4 I think - so something might be going wrong around that.

Can you try updating this:

public bool SupportsRelativePointers
{
    get
    {
        return (Abi != TargetAbi.CppCodegen) && (Architecture != TargetArchitecture.Wasm32);
    }
}

and make it so that it returns false for ARM64 macOS? (use Architecture and OperatingSystem that is conveniently available on the class).

This will make the compiler avoid generating 32-bit relative relocations in favor of full pointers. It will make the executables a bit bigger. We would want to try to find a way to do 32-bit relative relocs eventually, but let's first find out if that's really the problem.
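
To make the trade-off concrete, here is a small sketch of the two encodings: a 32-bit relative pointer stores the signed distance from the field's own address to the target in 4 bytes, while a full pointer stores the absolute 64-bit address. The helper names and the addresses are invented for the example and are not taken from the compiler.

```python
import struct


def encode_relptr32(field_address: int, target: int) -> bytes:
    """Store target as a signed 32-bit delta from the field's own address."""
    delta = target - field_address
    if not -(1 << 31) <= delta < (1 << 31):
        raise OverflowError("target is out of range for a 32-bit relative pointer")
    return struct.pack("<i", delta)


def decode_relptr32(field_address: int, data: bytes) -> int:
    """Recover the absolute target from a relative pointer field."""
    (delta,) = struct.unpack("<i", data)
    return field_address + delta


def encode_dir64(target: int) -> bytes:
    """Store target as a full 64-bit absolute pointer."""
    return struct.pack("<Q", target)


field = 0x1003B2858   # hypothetical address of the field holding the pointer
target = 0x100004B40  # hypothetical address being pointed at

rel = encode_relptr32(field, target)
full = encode_dir64(target)
```

The relative form takes half the space per entry, which is where the size increase from switching to full pointers would come from.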

@am11
Member Author

am11 commented Mar 31, 2022

Making SupportsRelativePointers return false had no effect; I am getting the same ld error. :(

@MichalStrehovsky
Member

Ah, there's an extra code path that is not active for Wasm or CppCodegen (for which SupportsRelativePointers was introduced) - replace IMAGE_REL_BASED_RELPTR32 with IMAGE_REL_BASED_DIR64 in src\coreclr\tools\aot\ILCompiler.Compiler\Compiler\DependencyAnalysis\ObjectWriter.cs. This is nothing but a hack (it's going to crash if we take a GC or exception at runtime), but it might help narrow down the problem or make further progress.

@am11
Member Author

am11 commented Mar 31, 2022

Thanks, this workaround worked for .corert_eh_table section. Now I am getting it from another section:

  ld: in section __TEXT,__const reloc 0: unknown relocation type 15 file 'obj/release/net7.0/osx-arm64/native/naot1.o'

I'll try to find out what is causing type 15 failure (to avoid workarounds).

@MichalStrehovsky
Member

I'll try to find out what is causing type 15 failure (to avoid workarounds).

Yeah, that sounds like a better plan. This workaround is starting to get out of hand (I think now it's relocs generated by RyuJIT).

I was able to make quick progress on issues like this in the past by reducing the problem into a ZeroSharp no-runtime size (https://github.com/MichalStrehovsky/zerosharp) repro case. The object files generated out of that are just a couple kilobytes in size and it's easier to trace through the problematic code within the compiler with that. But it's not mandatory to go in that direction, just a possible avenue if too many things are happening in a full Hello World.

MichalStrehovsky removed the untriaged label Mar 31, 2022
@am11
Member Author

am11 commented Mar 31, 2022

Pushed a commit to objwriter (dotnet/llvm-project@7280b55) which fixes the type 15 error. After that, dotnet publish produced the binary successfully, but it does not print Hello World! yet. :)

 % lldb bin/release/net7.0/osx-arm64/publish/naot1                
Added Microsoft public symbol server
Added symbol directory path: /usr/local/share/dotnet/shared/Microsoft.NETCore.App/6.0.2
Added symbol directory path: /usr/local/share/dotnet/packs/Microsoft.NETCore.App.Host.osx-arm64/6.0.2/runtimes/osx-arm64/native
(lldb) target create "../naot1/bin/release/net7.0/osx-arm64/publish/naot1"
Current executable set to '/Users/am11/projects/naot1/bin/release/net7.0/osx-arm64/publish/naot1' (arm64).
(lldb) r
Process 38291 launched: '/Users/am11/projects/naot1/bin/release/net7.0/osx-arm64/publish/naot1' (arm64)
Process 38291 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=2, address=0x100460988)
    frame #0: 0x0000000100460988 naot1`tls_CurrentThread
naot1`tls_CurrentThread:
->  0x100460988 <+0>:  ldp    x16, x17, [x9, #-0xd0]
    0x10046098c <+4>:  udf    #0x1
    0x100460990 <+8>:  udf    #0x102
    0x100460994 <+12>: udf    #0x0
Target 0: (naot1) stopped.
(lldb) bt all
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=2, address=0x100460988)
  * frame #0: 0x0000000100460988 naot1`tls_CurrentThread
    frame #1: 0x00000001001a6db0 naot1`InitializeModules + 80
    frame #2: 0x0000000100006438 naot1`main [inlined] InitializeRuntime() at main.cpp:169:5 [opt]
    frame #3: 0x00000001000063ac naot1`main(argc=1, argv=0x000000016fdff798) at main.cpp:201:19 [opt]
    frame #4: 0x0000000100c350f4 dyld`start + 520
  thread #2
    frame #0: 0x00000001a96f1eac libsystem_kernel.dylib`mach_absolute_time + 108
    frame #1: 0x00000001a96f3838 libsystem_kernel.dylib`__commpage_gettimeofday_internal + 44
    frame #2: 0x00000001a95f9534 libsystem_c.dylib`gettimeofday + 52
    frame #3: 0x000000010004fcb4 naot1`::QueryPerformanceCounter(lpPerformanceCount=0x000000016fe86f68) at PalRedhawkUnix.cpp:1090:9 [opt]
    frame #4: 0x0000000100012090 naot1`EnsureYieldProcessorNormalizedInitialized() [inlined] PalQueryPerformanceCounter(arg1=0x000000016fe86f68) at PalRedhawkFunctions.h:131:12 [opt]
    frame #5: 0x0000000100012088 naot1`EnsureYieldProcessorNormalizedInitialized() at yieldprocessornormalized.cpp:76:9 [opt]
    frame #6: 0x0000000100012024 naot1`EnsureYieldProcessorNormalizedInitialized() at yieldprocessornormalized.cpp:118:9 [opt]
    frame #7: 0x000000010000801c naot1`FinalizerStart(pContext=0x0000600003000090) at FinalizerHelpers.cpp:54:5 [opt]
    frame #8: 0x00000001a972d240 libsystem_pthread.dylib`_pthread_start + 148

@MichalStrehovsky
Member

Great progress!

If I'm looking at the right thing, tls_CurrentThread is data, not code, so I guess we shouldn't be running it. But seeing a problem around it is not completely surprising since TLS access is likely going to be different on macOS than it is on Linux. The fix will probably be around the INLINE_GET_TLS_VAR macro in src\coreclr\nativeaot\Runtime. It should expand to however thread-local statics are accessed on ARM64. There is already an if APPLE on the x64 version of the macro.

@am11
Member Author

am11 commented Apr 1, 2022

Yup, that was it: am11@5106fc9 (based on ELF sequence vs. MachO)

It has moved the needle a bit; next up is RhpNewArray:

(lldb) r
Process 89591 launched: '/Users/am11/projects/naot1/bin/release/net7.0/osx-arm64/publish/naot1' (arm64)
This version of LLDB has no plugin for the mipsassem language. Inspection of frame variables will be limited.
Process 89591 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=2, address=0x1a9734530)
    frame #0: 0x000000010005c750 naot1`RhpNewArray at AllocFast.S:213
   210 	        ldr         x12, [x3, #OFFSETOF__Thread__m_alloc_context__alloc_ptr]
   211 	
   212 	        // Update the alloc pointer to account for the allocation.
-> 213 	        str         x2, [x3, #OFFSETOF__Thread__m_alloc_context__alloc_ptr]
   214 	
   215 	        // Set the new objects MethodTable pointer and element count.
   216 	        str         x0, [x12, #OFFSETOF__Object__m_pEEType]
Target 0: (naot1) stopped.

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=2, address=0x1a9734530)
  * frame #0: 0x000000010005c750 naot1`RhpNewArray at AllocFast.S:213
    frame #1: 0x00000001001a7114 naot1`S_P_CoreLib_Internal_Runtime_CompilerHelpers_StartupCodeHelpers__CreateTypeManagers + 100
    frame #2: 0x00000001001a6db0 naot1`InitializeModules + 80
    frame #3: 0x0000000100006458 naot1`main [inlined] InitializeRuntime() at main.cpp:169:5 [opt]
    frame #4: 0x00000001000063cc naot1`main(argc=1, argv=0x000000016fdff7a0) at main.cpp:201:19 [opt]
    frame #5: 0x0000000100c350f4 dyld`start + 520
    
(lldb) register read
General Purpose Registers:
        x0 = 0x0000000100f04290
        x1 = 0x0000000000000001
        x2 = 0xd53bd071f9400430
        x3 = 0x00000001a9734530  libdyld.dylib`tlv_get_addr
        x4 = 0x000000000000000a
        x5 = 0x0000000000000000
        x6 = 0x0000000000000002
        x7 = 0x0000000000000000
        x8 = 0x00000000ffffffff
        x9 = 0x0000000000000000
       x10 = 0x0000000000000070
       x11 = 0x0000000000000001
       x12 = 0xd53bd071f9400410
       x13 = 0x0000000001dfb800
       x14 = 0x0000000001c00000
       x15 = 0x0000000000000044
       x16 = 0x0000000000000000
       x17 = 0x0000000100f04290
       x18 = 0x0000000000000000
       x19 = 0x0000000000000001
       x20 = 0x0000000100460968  naot1`__Module
       x21 = 0x0000000100000000  naot1`_mh_execute_header
       x22 = 0x00000001003773a0  naot1`c_classlibFunctions
       x23 = 0x000000000000000a
       x24 = 0x0000000000000000
       x25 = 0x0000000000000000
       x26 = 0x0000000000000000
       x27 = 0x0000000000000000
       x28 = 0x0000000000000000
        fp = 0x000000016fdff570
        lr = 0x000000010005c738  naot1`RhpNewArray + 56
        sp = 0x000000016fdff560
        pc = 0x000000010005c750  naot1`RhpNewArray + 80
      cpsr = 0x80001000

# a weird "error: warning: warning:" from printer
(lldb) p OFFSETOF__Thread__m_alloc_context__alloc_ptr
error: warning: warning: got name from symbols: OFFSETOF__Thread__m_alloc_context__alloc_ptr
error: <user expression 8>:1:1: reference to 'OFFSETOF__Thread__m_alloc_context__alloc_ptr' is ambiguous
OFFSETOF__Thread__m_alloc_context__alloc_ptr
^
note: candidate found by name lookup is 'OFFSETOF__Thread__m_alloc_context__alloc_ptr'

note: candidate found by name lookup is 'OFFSETOF__Thread__m_alloc_context__alloc_ptr'

# but this suggests offset is zero
(lldb) image lookup -n OFFSETOF__Thread__m_alloc_context__alloc_ptr
1 match found in /Users/am11/projects/naot1/bin/release/net7.0/osx-arm64/publish/naot1:
        Address: 0x0000000000000000 (0x0000000000000000)
        Summary: 0x0000000000000000

@MichalStrehovsky
Member

That looks related to the TLS access. We're here:

#ifdef FEATURE_EMULATED_TLS
        GETTHREAD_ETLS_3
#else
        INLINE_GETTHREAD x3
#endif
        // Load potential new object address into x12.
        ldr         x12, [x3, #OFFSETOF__Thread__m_alloc_context__alloc_ptr]
        // Determine whether the end of the object would lie outside of the current allocation context. If so,
        // we abandon the attempt to allocate the object directly and fall back to the slow helper.
        add         x2, x2, x12
        ldr         x12, [x3, #OFFSETOF__Thread__m_alloc_context__alloc_limit]
        cmp         x2, x12
        bhi         RhpNewArrayRare
        // Reload new object address into x12.
        ldr         x12, [x3, #OFFSETOF__Thread__m_alloc_context__alloc_ptr]
        // Update the alloc pointer to account for the allocation.
        str         x2, [x3, #OFFSETOF__Thread__m_alloc_context__alloc_ptr]

My suspicion is that INLINE_GETTHREAD loaded a bogus address into x3. It's supposed to load the tls_CurrentThread thread-local static.

.macro INLINE_GETTHREAD target
    INLINE_GET_TLS_VAR \target, tls_CurrentThread
.endm

I would put a breakpoint here:

// static
inline Thread * ThreadStore::RawGetCurrentThread()
{
    return (Thread *) &tls_CurrentThread;
}

and see what value the variable has (and how the compiler got to it in assembly). Then compare with what INLINE_GETTHREAD came up with. (Make sure you're looking at the same thread, we already have the finalizer thread running at this point in startup).
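
One possible reading of the register dump (x3 pointing at tlv_get_addr) is that the macro used a word of the Mach-O thread-local descriptor instead of calling through it. On macOS a thread-local symbol resolves to a three-word descriptor (thunk, key, offset), and the variable's address is obtained by calling the thunk with the descriptor as its argument. The toy model below only illustrates that difference. All class and function names are illustrative, and passing the per-thread table explicitly is a modelling shortcut (the real thunk takes only the descriptor and finds the current thread itself).

```python
class TlvDescriptor:
    """Stands in for the three-word descriptor the symbol resolves to."""

    def __init__(self, thunk, key, offset):
        self.thunk = thunk    # function that returns the per-thread address
        self.key = key        # identifies the image's per-thread block
        self.offset = offset  # offset of the variable inside that block


def tlv_get_addr(descriptor, thread_blocks):
    """Simplified stand-in for the resolver the descriptor's thunk points at."""
    return thread_blocks[descriptor.key] + descriptor.offset


def get_thread_correct(descriptor, thread_blocks):
    # Call through the thunk, then use the returned per-thread address.
    return descriptor.thunk(descriptor, thread_blocks)


def get_thread_buggy(descriptor, thread_blocks):
    # Treat the descriptor's first word as if it were the variable itself.
    return descriptor.thunk


# Hypothetical per-thread block bases for two threads, keyed by the image's key.
main_thread = {7: 0x600000004000}
finalizer_thread = {7: 0x600000008000}

tls_CurrentThread = TlvDescriptor(tlv_get_addr, key=7, offset=0x90)

main_addr = get_thread_correct(tls_CurrentThread, main_thread)
finalizer_addr = get_thread_correct(tls_CurrentThread, finalizer_thread)
bogus = get_thread_buggy(tls_CurrentThread, main_thread)
```

In this model the correct path yields a different address per thread, while the buggy path yields the resolver function itself, which is not a usable Thread pointer.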

@am11
Member Author

am11 commented Apr 1, 2022

Ah, right, I had to move x0 to the target register and also pop the stack before endm for the APPLE branch: am11@750fc83. Thanks for the debugging pointers; I printed its address with p (Thread *) &tls_CurrentThread and then compared the effect on x3 before and after Alloc.S:196.

Now we are in the managed main and getting to the code emitted via objwriter:

(lldb) target create "../naot1/bin/release/net7.0/osx-arm64/native/naot1"
Current executable set to '/Users/am11/projects/naot1/bin/release/net7.0/osx-arm64/native/naot1' (arm64).

(lldb) r
Process 86586 launched: '/Users/am11/projects/naot1/bin/release/net7.0/osx-arm64/native/naot1' (arm64)
Process 86586 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x00000001001fa8a4 naot1`S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>__Create + 52
naot1`S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>__Create:
->  0x1001fa8a4 <+52>: ldr    x1, [x0]
    0x1001fa8a8 <+56>: ldr    x1, [x1, #0x1e8]
    0x1001fa8ac <+60>: blr    x1
    0x1001fa8b0 <+64>: bl     0x1001a3b10               ; S_P_CoreLib_Internal_IntrinsicSupport_EqualityComparerHelpers__GetComparer
Target 0: (naot1) stopped.

(lldb) bt all
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  * frame #0: 0x00000001001fa8a4 naot1`S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>__Create + 52
    frame #1: 0x0000000100190dd4 naot1`S_P_CoreLib_System_Collections_Generic_NonRandomizedStringEqualityComparer___cctor + 36
    frame #2: 0x000000010019722c naot1`S_P_CoreLib_System_Runtime_CompilerServices_ClassConstructorRunner__EnsureClassConstructorRun + 204
    frame #3: 0x00000001001970e4 naot1`S_P_CoreLib_System_Runtime_CompilerServices_ClassConstructorRunner__CheckStaticClassConstructionReturnGCStaticBase + 20
    frame #4: 0x0000000100190d48 naot1`S_P_CoreLib_System_Collections_Generic_NonRandomizedStringEqualityComparer__GetStringComparer + 24
    frame #5: 0x000000010020cc80 naot1`S_P_CoreLib_System_Collections_Generic_Dictionary_2<System___Canon__System___Canon>___ctor_2 + 128
    frame #6: 0x00000001000e2a04 naot1`S_P_CoreLib_System_AppContext__SetData + 84
    frame #7: 0x00000001002125a0 naot1`Internal_CompilerGenerated__Module___SetAppContextSwitches + 32
    frame #8: 0x0000000100212734 naot1`__managed__Main + 228
    frame #9: 0x0000000100006454 naot1`main(argc=1, argv=0x000000016fdff798) at main.cpp:205:18 [opt]
    frame #10: 0x0000000100c350f4 dyld`start + 520
  thread #2
    frame #0: 0x00000001a96f1eac libsystem_kernel.dylib`mach_absolute_time + 108
    frame #1: 0x00000001a96f3838 libsystem_kernel.dylib`__commpage_gettimeofday_internal + 44
    frame #2: 0x00000001a95f9534 libsystem_c.dylib`gettimeofday + 52
    frame #3: 0x000000010004fcc4 naot1`::QueryPerformanceCounter(lpPerformanceCount=0x000000016fe86f68) at PalRedhawkUnix.cpp:1090:9 [opt]
    frame #4: 0x00000001000120a0 naot1`EnsureYieldProcessorNormalizedInitialized() [inlined] PalQueryPerformanceCounter(arg1=0x000000016fe86f68) at PalRedhawkFunctions.h:131:12 [opt]
    frame #5: 0x0000000100012098 naot1`EnsureYieldProcessorNormalizedInitialized() at yieldprocessornormalized.cpp:76:9 [opt]
    frame #6: 0x0000000100012034 naot1`EnsureYieldProcessorNormalizedInitialized() at yieldprocessornormalized.cpp:118:9 [opt]
    frame #7: 0x000000010000802c naot1`FinalizerStart(pContext=0x0000600003000090) at FinalizerHelpers.cpp:54:5 [opt]
    frame #8: 0x00000001a972d240 libsystem_pthread.dylib`_pthread_start + 148

(lldb) register read
General Purpose Registers:
        x0 = 0x0000000000000000
        x1 = 0x0000000100373628  (void *)0x0000000100469238: __writableDataString
        x2 = 0x00000001003eeaf0  __TypeThreadStaticIndexS_P_CoreLib_System_Threading_ManagedThreadId
        x3 = 0x0000000000000018
        x4 = 0x0000000000000000
        x5 = 0x0000000101800000
        x6 = 0x0000000000000007
        x7 = 0x0000000000000000
        x8 = 0x0000000100ec11f0
        x9 = 0x0000600000004090
       x10 = 0x00000001a9734530  libdyld.dylib`tlv_get_addr
       x11 = 0x0000000000000001
       x12 = 0x000000010045f1e8  g_ephemeral_high
       x13 = 0x0000000101807fd0
       x14 = 0x0000000101806608
       x15 = 0x0000000101806620
       x16 = 0x0000000000000000
       x17 = 0x0000000100f04290
       x18 = 0x0000000000000000
       x19 = 0x000000010041bf70  vtable for S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<String>
       x20 = 0x00000001198061d8
       x21 = 0x00000001198061e0
       x22 = 0x0000000101806440
       x23 = 0x0000000119804820
       x24 = 0x0000000000000000
       x25 = 0x0000000000000000
       x26 = 0x0000000000000000
       x27 = 0x0000000000000000
       x28 = 0x0000000000000000
        fp = 0x000000016fdff420
        lr = 0x00000001001fa8a4  naot1`S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>__Create + 52
        sp = 0x000000016fdff420
        pc = 0x00000001001fa8a4  naot1`S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>__Create + 52
      cpsr = 0x60001000

@MichalStrehovsky
Member

This doesn't ring a bell - can you paste the full disassembly of the faulting method? It's not clear where we are. I'm guessing that this is an attempt to do a virtual call to access the TypeHandle property here:

? Unsafe.As<EqualityComparer<T>>(EqualityComparerHelpers.GetComparer(typeof(T).TypeHandle))

...but the System.Type we got out of the typeof is bogus. Right before that you should see a call to this method:

private static Type GetRuntimeType(IntPtr pEEType)
{
    return Type.GetTypeFromEETypePtr(new EETypePtr(pEEType));
}

Validate that the parameter to the method is correct (it should be a MethodTable* and should have an associated textual symbol) and then try to step through what it's doing.

Might be easier to debug through this with optimizations off (drop the -O parameter from the ilc command line or dotnet publish as Debug).

@am11
Member Author

am11 commented Apr 1, 2022

With debug, here is the full disassembly at IP/PC (0x1002f0bb8 in the middle of disassembly):

(lldb) r
Process 88778 launched: '/Users/am11/projects/naot1/bin/Debug/net7.0/osx-arm64/publish/naot1' (arm64)
Process 88778 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x00000001002f0bb8 naot1`S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>__Create + 168
naot1`S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>__Create:
->  0x1002f0bb8 <+168>: ldr    wzr, [x0]
    0x1002f0bbc <+172>: blr    x1
    0x1002f0bc0 <+176>: str    x0, [x29, #0x48]
    0x1002f0bc4 <+180>: ldr    x0, [x29, #0x48]
Target 0: (naot1) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  * frame #0: 0x00000001002f0bb8 naot1`S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>__Create + 168
    frame #1: 0x00000001002f0c74 naot1`S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>__get_Default + 68
    frame #2: 0x000000010023dec4 naot1`S_P_CoreLib_System_Collections_Generic_NonRandomizedStringEqualityComparer___cctor + 36
    frame #3: 0x0000000100247ba0 naot1`S_P_CoreLib_System_Runtime_CompilerServices_ClassConstructorRunner__EnsureClassConstructorRun + 384
    frame #4: 0x0000000100247988 naot1`S_P_CoreLib_System_Runtime_CompilerServices_ClassConstructorRunner__CheckStaticClassConstructionReturnGCStaticBase + 24
    frame #5: 0x000000010023de28 naot1`S_P_CoreLib_System_Collections_Generic_NonRandomizedStringEqualityComparer__GetStringComparer + 40
    frame #6: 0x000000010030e32c naot1`S_P_CoreLib_System_Collections_Generic_Dictionary_2<System___Canon__System___Canon>___ctor_2 + 268
    frame #7: 0x000000010030e1dc naot1`S_P_CoreLib_System_Collections_Generic_Dictionary_2<System___Canon__System___Canon>___ctor + 28
    frame #8: 0x0000000100123414 naot1`S_P_CoreLib_System_AppContext__SetData + 116
    frame #9: 0x0000000100317a9c naot1`Internal_CompilerGenerated__Module___SetAppContextSwitches + 28
    frame #10: 0x0000000100317bbc naot1`__managed__Main + 60
    frame #11: 0x000000010000df64 naot1`main(argc=1, argv=0x000000016fdff7a0) at main.cpp:205:18 [opt]
    frame #12: 0x0000000100f190f4 dyld`start + 520
(lldb) disassemble -a 0x1002f0bb8
naot1`S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>__Create:
    0x1002f0b10 <+0>:   stp    x29, x30, [sp, #-0x90]!
    0x1002f0b14 <+4>:   mov    x29, sp
    0x1002f0b18 <+8>:   add    x9, x29, #0x28            ; =0x28 
    0x1002f0b1c <+12>:  movi.16b v16, #0x0
    0x1002f0b20 <+16>:  stp    q16, q16, [x9]
    0x1002f0b24 <+20>:  stp    q16, q16, [x9, #0x20]
    0x1002f0b28 <+24>:  stp    xzr, xzr, [x9, #0x40]
    0x1002f0b2c <+28>:  str    xzr, [x9, #0x50]
    0x1002f0b30 <+32>:  str    x0, [x29, #0x88]
    0x1002f0b34 <+36>:  str    x0, [x29, #0x80]
    0x1002f0b38 <+40>:  ldr    x0, [x29, #0x80]
    0x1002f0b3c <+44>:  bl     0x10000b380               ; __GenericLookupFromType_S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>_GCStaticBase_S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<T_System___Canon>
    0x1002f0b40 <+48>:  add    x0, x0, #0x8              ; =0x8 
    0x1002f0b44 <+52>:  str    x0, [x29, #0x78]
    0x1002f0b48 <+56>:  ldr    x0, [x29, #0x80]
    0x1002f0b4c <+60>:  bl     0x10000bd54               ; __GenericLookupFromType_S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>_TypeHandle_S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<T_System___Canon>
    0x1002f0b50 <+64>:  bl     0x1002f0af0               ; S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>__get_SupportsGenericIEquatableInterfaces
    0x1002f0b54 <+68>:  str    w0, [x29, #0x74]
    0x1002f0b58 <+72>:  ldr    x0, [x29, #0x78]
    0x1002f0b5c <+76>:  str    x0, [x29, #0x68]
    0x1002f0b60 <+80>:  ldr    w0, [x29, #0x74]
    0x1002f0b64 <+84>:  cbnz   w0, 0x1002f0b88           ; <+120>
    0x1002f0b68 <+88>:  nop    
    0x1002f0b6c <+92>:  nop    
    0x1002f0b70 <+96>:  nop    
    0x1002f0b74 <+100>: nop    
    0x1002f0b78 <+104>: nop    
    0x1002f0b7c <+108>: ldr    x0, [x29, #0x68]
    0x1002f0b80 <+112>: str    x0, [x29, #0x28]
    0x1002f0b84 <+116>: b      0x1002f0b84               ; <+116>
    0x1002f0b88 <+120>: ldr    x0, [x29, #0x80]
    0x1002f0b8c <+124>: bl     0x10000bd60               ; __GenericLookupFromType_S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>_TypeHandle_T_System___Canon
    0x1002f0b90 <+128>: bl     0x100261960               ; S_P_CoreLib_Internal_Runtime_CompilerHelpers_LdTokenHelpers__GetRuntimeTypeHandle
    0x1002f0b94 <+132>: str    x0, [x29, #0x60]
    0x1002f0b98 <+136>: ldr    x0, [x29, #0x60]
    0x1002f0b9c <+140>: bl     0x100140890               ; S_P_CoreLib_System_Type__GetTypeFromHandle
    0x1002f0ba0 <+144>: str    x0, [x29, #0x58]
    0x1002f0ba4 <+148>: adrp   x0, -747
    0x1002f0ba8 <+152>: add    x0, x0, #0x4a8            ; =0x4a8 
    0x1002f0bac <+156>: str    x0, [x29, #0x50]
    0x1002f0bb0 <+160>: ldr    x0, [x29, #0x58]
    0x1002f0bb4 <+164>: ldr    x1, [x29, #0x50]
->  0x1002f0bb8 <+168>: ldr    wzr, [x0]
    0x1002f0bbc <+172>: blr    x1
    0x1002f0bc0 <+176>: str    x0, [x29, #0x48]
    0x1002f0bc4 <+180>: ldr    x0, [x29, #0x48]
    0x1002f0bc8 <+184>: bl     0x1002596d0               ; S_P_CoreLib_Internal_IntrinsicSupport_EqualityComparerHelpers__GetComparer
    0x1002f0bcc <+188>: str    x0, [x29, #0x40]
    0x1002f0bd0 <+192>: ldr    x0, [x29, #0x80]
    0x1002f0bd4 <+196>: bl     0x10000c954               ; __GenericLookupFromType_S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>_MethodDictionary_S_P_CoreLib_System_Runtime_CompilerServices_Unsafe__As<S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<T_System___Canon>>
    0x1002f0bd8 <+200>: str    x0, [x29, #0x20]
    0x1002f0bdc <+204>: ldr    x0, [x29, #0x20]
    0x1002f0be0 <+208>: ldr    x1, [x29, #0x40]
    0x1002f0be4 <+212>: bl     0x100370490               ; S_P_CoreLib_System_Runtime_CompilerServices_Unsafe__As<System___Canon>
    0x1002f0be8 <+216>: str    x0, [x29, #0x38]
    0x1002f0bec <+220>: ldr    x0, [x29, #0x80]
    0x1002f0bf0 <+224>: bl     0x10000c948               ; __GenericLookupFromType_S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>_MethodDictionary_S_P_CoreLib_System_Threading_Interlocked__CompareExchange_3<S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<T_System___Canon>>
    0x1002f0bf4 <+228>: str    x0, [x29, #0x18]
    0x1002f0bf8 <+232>: ldr    x0, [x29, #0x18]
    0x1002f0bfc <+236>: ldr    x1, [x29, #0x68]
    0x1002f0c00 <+240>: ldr    x2, [x29, #0x38]
    0x1002f0c04 <+244>: mov    x3, xzr
    0x1002f0c08 <+248>: bl     0x10036f770               ; S_P_CoreLib_System_Threading_Interlocked__CompareExchange_3<System___Canon>
    0x1002f0c0c <+252>: str    x0, [x29, #0x30]
    0x1002f0c10 <+256>: nop    
    0x1002f0c14 <+260>: ldr    x0, [x29, #0x80]
    0x1002f0c18 <+264>: bl     0x10000b380               ; __GenericLookupFromType_S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>_GCStaticBase_S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<T_System___Canon>
    0x1002f0c1c <+268>: ldr    x0, [x0, #0x8]
    0x1002f0c20 <+272>: ldp    x29, x30, [sp], #0x90
    0x1002f0c24 <+276>: ret    
    0x1002f0c28 <+280>: udf    #0x0
    0x1002f0c2c <+284>: udf    #0x0

(lldb) register read
General Purpose Registers:
        x0 = 0x0000000000000000
        x1 = 0x00000001000054a8  naot1`__VirtualCall_S_P_CoreLib_System_Type__get_TypeHandle
        x2 = 0x000000000000000a
        x3 = 0x0000000102006668
        x4 = 0x0000000000000020
        x5 = 0x0000000000000003
        x6 = 0x0000000000000007
        x7 = 0x0000000000000000
        x8 = 0x00000001011a11f0
        x9 = 0x000000016fdff218
       x10 = 0x00000001a9734530  libdyld.dylib`tlv_get_addr
       x11 = 0x0000000000000001
       x12 = 0x00000001005fe7a8  g_ephemeral_high
       x13 = 0x0000000102007fd0
       x14 = 0x0000000102006638
       x15 = 0x0000000102006650
       x16 = 0x0000000000000000
       x17 = 0x0000000101204290
       x18 = 0x0000000000000000
       x19 = 0x000000016fdff7a0
       x20 = 0x0000000000000001
       x21 = 0x0000000100000000  naot1`_mh_execute_header
       x22 = 0x0000000000000000
       x23 = 0x0000000000000000
       x24 = 0x0000000000000000
       x25 = 0x0000000000000000
       x26 = 0x0000000000000000
       x27 = 0x0000000000000000
       x28 = 0x0000000000000000
        fp = 0x000000016fdff290
        lr = 0x00000001002f0ba0  naot1`S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>__Create + 144
        sp = 0x000000016fdff290
        pc = 0x00000001002f0bb8  naot1`S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>__Create + 168
      cpsr = 0x20001000

x0 is null.
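
As a cross-check on the dump above, the `adrp`/`add` pair at +148/+152 can be resolved by hand. This is a minimal sketch, assuming lldb prints the `adrp` operand as a signed count of 4 KB pages relative to the page containing the instruction; the helper name `resolve_adrp_add` is made up for illustration.

```python
def resolve_adrp_add(pc, page_delta, lo12):
    """Address formed by `adrp reg, page_delta` at pc, then `add reg, reg, #lo12`."""
    page_base = pc & ~0xFFF              # adrp starts from the 4 KB page holding pc
    return page_base + (page_delta << 12) + lo12


# Values taken from the disassembly: adrp at 0x1002f0ba4, operand -747, add #0x4a8
target = resolve_adrp_add(0x1002F0BA4, -747, 0x4A8)
print(hex(target))
```

If the assumption holds, the result equals the value shown in x1 (the `__VirtualCall_S_P_CoreLib_System_Type__get_TypeHandle` stub), which would suggest the call target itself resolved fine and only the `this` pointer in x0 is bad.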


BTW, runtime's build.sh also produced ~/projects/runtime/artifacts/packages/Release/Shipping/Microsoft.DotNet.ILCompiler.7.0.0-dev.symbols.nupkg, but I am not sure if dotnet-sos, dotnet-symbol and friends recognize it OOTB (there were a few exports which were required for singlefilehost to work with SOS: b0621e7 and 8c6e3e9). That would make things super easy to debug.

@MichalStrehovsky
Member

I'm still looking at the disassembly, but while I'm doing that - the compiler produces DWARF debug information so you should be getting line numbers and local variables. It's supposed to debug like C++. Make sure ILC is invoked with the -g option (it should be unless opted out).

@am11
Member Author

am11 commented Apr 1, 2022

Yup, ./obj/Debug/net7.0/osx-arm64/native/naot1.ilc.rsp has -g. So the output in lldb is complete/expected. 👍

@MichalStrehovsky
Member

Can you dump what's in x0 at the spot where we call S_P_CoreLib_System_Type__GetTypeFromHandle? It should be an address that has a symbol associated with it (image lookup -va the_address_in_x0 should show a symbol for it). If there's no symbol, it's suspicious - it should be pointing to a MethodTable so you can cast it to MethodTable* and dump the contents to see if it looks legit. If it has a symbol, it's legit for sure.

If x0 is already bogus at that point, check x0 before the call to __GenericLookupFromType_S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>_TypeHandle_S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<T_System___Canon>. That one should also be a MethodTable with a symbol associated.

> Yup, ./obj/Debug/net7.0/osx-arm64/native/naot1.ilc.rsp has -g. So the output in lldb is complete/expected. 👍

Just to double check - do you get line debugging and local variables as well?

@am11
Member Author

am11 commented Apr 1, 2022

vtable for String is the name:

(lldb) r
Process 89722 launched: '/Users/am11/projects/naot1/bin/Debug/net7.0/osx-arm64/publish/naot1' (arm64)
Process 89722 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x00000001002f0bb8 naot1`S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>__Create + 168
naot1`S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>__Create:
->  0x1002f0bb8 <+168>: ldr    wzr, [x0]
    0x1002f0bbc <+172>: blr    x1
    0x1002f0bc0 <+176>: str    x0, [x29, #0x48]
    0x1002f0bc4 <+180>: ldr    x0, [x29, #0x48]
Target 0: (naot1) stopped.

(lldb) disassemble -a 0x1002f0bb8
naot1`S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>__Create:
    0x1002f0b10 <+0>:   stp    x29, x30, [sp, #-0x90]!
    0x1002f0b14 <+4>:   mov    x29, sp
    0x1002f0b18 <+8>:   add    x9, x29, #0x28            ; =0x28 
    0x1002f0b1c <+12>:  movi.16b v16, #0x0
    0x1002f0b20 <+16>:  stp    q16, q16, [x9]
    0x1002f0b24 <+20>:  stp    q16, q16, [x9, #0x20]
    0x1002f0b28 <+24>:  stp    xzr, xzr, [x9, #0x40]
    0x1002f0b2c <+28>:  str    xzr, [x9, #0x50]
    0x1002f0b30 <+32>:  str    x0, [x29, #0x88]
    0x1002f0b34 <+36>:  str    x0, [x29, #0x80]
    0x1002f0b38 <+40>:  ldr    x0, [x29, #0x80]
    0x1002f0b3c <+44>:  bl     0x10000b380               ; __GenericLookupFromType_S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>_GCStaticBase_S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<T_System___Canon>
    0x1002f0b40 <+48>:  add    x0, x0, #0x8              ; =0x8 
    0x1002f0b44 <+52>:  str    x0, [x29, #0x78]
    0x1002f0b48 <+56>:  ldr    x0, [x29, #0x80]
    0x1002f0b4c <+60>:  bl     0x10000bd54               ; __GenericLookupFromType_S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>_TypeHandle_S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<T_System___Canon>
    0x1002f0b50 <+64>:  bl     0x1002f0af0               ; S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>__get_SupportsGenericIEquatableInterfaces
    0x1002f0b54 <+68>:  str    w0, [x29, #0x74]
    0x1002f0b58 <+72>:  ldr    x0, [x29, #0x78]
    0x1002f0b5c <+76>:  str    x0, [x29, #0x68]
    0x1002f0b60 <+80>:  ldr    w0, [x29, #0x74]
    0x1002f0b64 <+84>:  cbnz   w0, 0x1002f0b88           ; <+120>
    0x1002f0b68 <+88>:  nop    
    0x1002f0b6c <+92>:  nop    
    0x1002f0b70 <+96>:  nop    
    0x1002f0b74 <+100>: nop    
    0x1002f0b78 <+104>: nop    
    0x1002f0b7c <+108>: ldr    x0, [x29, #0x68]
    0x1002f0b80 <+112>: str    x0, [x29, #0x28]
    0x1002f0b84 <+116>: b      0x1002f0b84               ; <+116>
    0x1002f0b88 <+120>: ldr    x0, [x29, #0x80]
    0x1002f0b8c <+124>: bl     0x10000bd60               ; __GenericLookupFromType_S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>_TypeHandle_T_System___Canon
    0x1002f0b90 <+128>: bl     0x100261960               ; S_P_CoreLib_Internal_Runtime_CompilerHelpers_LdTokenHelpers__GetRuntimeTypeHandle
    0x1002f0b94 <+132>: str    x0, [x29, #0x60]
    0x1002f0b98 <+136>: ldr    x0, [x29, #0x60]
    0x1002f0b9c <+140>: bl     0x100140890               ; S_P_CoreLib_System_Type__GetTypeFromHandle
    0x1002f0ba0 <+144>: str    x0, [x29, #0x58]
    0x1002f0ba4 <+148>: adrp   x0, -747
    0x1002f0ba8 <+152>: add    x0, x0, #0x4a8            ; =0x4a8 
    0x1002f0bac <+156>: str    x0, [x29, #0x50]
    0x1002f0bb0 <+160>: ldr    x0, [x29, #0x58]
    0x1002f0bb4 <+164>: ldr    x1, [x29, #0x50]
->  0x1002f0bb8 <+168>: ldr    wzr, [x0]
    0x1002f0bbc <+172>: blr    x1
    0x1002f0bc0 <+176>: str    x0, [x29, #0x48]
    0x1002f0bc4 <+180>: ldr    x0, [x29, #0x48]
    0x1002f0bc8 <+184>: bl     0x1002596d0               ; S_P_CoreLib_Internal_IntrinsicSupport_EqualityComparerHelpers__GetComparer
    0x1002f0bcc <+188>: str    x0, [x29, #0x40]
    0x1002f0bd0 <+192>: ldr    x0, [x29, #0x80]
    0x1002f0bd4 <+196>: bl     0x10000c954               ; __GenericLookupFromType_S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>_MethodDictionary_S_P_CoreLib_System_Runtime_CompilerServices_Unsafe__As<S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<T_System___Canon>>
    0x1002f0bd8 <+200>: str    x0, [x29, #0x20]
    0x1002f0bdc <+204>: ldr    x0, [x29, #0x20]
    0x1002f0be0 <+208>: ldr    x1, [x29, #0x40]
    0x1002f0be4 <+212>: bl     0x100370490               ; S_P_CoreLib_System_Runtime_CompilerServices_Unsafe__As<System___Canon>
    0x1002f0be8 <+216>: str    x0, [x29, #0x38]
    0x1002f0bec <+220>: ldr    x0, [x29, #0x80]
    0x1002f0bf0 <+224>: bl     0x10000c948               ; __GenericLookupFromType_S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>_MethodDictionary_S_P_CoreLib_System_Threading_Interlocked__CompareExchange_3<S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<T_System___Canon>>
    0x1002f0bf4 <+228>: str    x0, [x29, #0x18]
    0x1002f0bf8 <+232>: ldr    x0, [x29, #0x18]
    0x1002f0bfc <+236>: ldr    x1, [x29, #0x68]
    0x1002f0c00 <+240>: ldr    x2, [x29, #0x38]
    0x1002f0c04 <+244>: mov    x3, xzr
    0x1002f0c08 <+248>: bl     0x10036f770               ; S_P_CoreLib_System_Threading_Interlocked__CompareExchange_3<System___Canon>
    0x1002f0c0c <+252>: str    x0, [x29, #0x30]
    0x1002f0c10 <+256>: nop    
    0x1002f0c14 <+260>: ldr    x0, [x29, #0x80]
    0x1002f0c18 <+264>: bl     0x10000b380               ; __GenericLookupFromType_S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>_GCStaticBase_S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<T_System___Canon>
    0x1002f0c1c <+268>: ldr    x0, [x0, #0x8]
    0x1002f0c20 <+272>: ldp    x29, x30, [sp], #0x90
    0x1002f0c24 <+276>: ret    
    0x1002f0c28 <+280>: udf    #0x0
    0x1002f0c2c <+284>: udf    #0x0

(lldb) b -a 0x1002f0b9c
Breakpoint 1: where = naot1`S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>__Create + 140, address = 0x00000001002f0b9c

(lldb) r
There is a running process, kill it and restart?: [Y/n] Y
Process 89722 exited with status = 9 (0x00000009) 
Process 89732 launched: '/Users/am11/projects/naot1/bin/Debug/net7.0/osx-arm64/publish/naot1' (arm64)
Process 89732 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x00000001002f0b9c naot1`S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>__Create + 140
naot1`S_P_CoreLib_System_Collections_Generic_EqualityComparer_1<System___Canon>__Create:
->  0x1002f0b9c <+140>: bl     0x100140890               ; S_P_CoreLib_System_Type__GetTypeFromHandle
    0x1002f0ba0 <+144>: str    x0, [x29, #0x58]
    0x1002f0ba4 <+148>: adrp   x0, -747
    0x1002f0ba8 <+152>: add    x0, x0, #0x4a8            ; =0x4a8 
Target 0: (naot1) stopped.

(lldb) register read x0
      x0 = 0x00000001005a0620  vtable for String

(lldb) image lookup -va 0x00000001005a0620
      Address: naot1[0x00000001005a0620] (naot1.__DATA.__data + 297760)
      Summary: vtable for String
       Module: file = "/Users/am11/projects/naot1/bin/Debug/net7.0/osx-arm64/publish/naot1", arch = "arm64"
       Symbol: id = {0x00008410}, range = [0x00000001005a0620-0x00000001005a0698), name="vtable for String", mangled="_ZTV6String"

@jkotas
Member

jkotas commented Sep 5, 2022

Or fix the JIT to produce frames that are compatible with compact unwinding.

@filipnavara
Member

> Or fix the JIT to produce frames that are compatible with compact unwinding.

That's also an option. I started reading the design document and there are some limits on what the instruction encoding allows in terms of offsets... Overall I think there are multiple viable ways to fix it but it's relatively easy to get it working in some way now that I know what is happening.

@filipnavara
Member

It is likely a failure to allocate executable memory.

Agreed, this looks like the case. I tried borrowing some helpers from coreclr/pal yesterday, but it has not fixed the issue (so far).

You would need to map the memory with the MAP_JIT flag (see MEM_RESERVE_EXECUTABLE in the CLR code). Writes to both the data section and the thunk section have to be bracketed by pthread_jit_write_protect_np calls. Since the data chunk of memory is written from managed code, you would also need to expose a helper that enables/disables write protection to managed code.

@filipnavara
Member

Reflection tests are failing because GetUnwindProcInfo doesn't understand compact unwinding yet. I have a local fix but it needs polishing before I push it.

Testing delegate targets are reflectable...
Testing virtual delegate targets are reflectable...
TestContainment
TestInterfaceMethod
TestByRefLikeTypeMethod
TestILScanner
Search current assembly
GetMethod on a non-generic type
Totally unreferenced method on a non-generic type (we should not find it)
GetMethod on a non-generic type for a generic method
Generics
Partial canonical types
Search in system assembly
Search through a forwarder
Search in mscorlib
Enum.GetValues
Enum.GetValuesAsUnderlyingType
Pattern in LINQ expressions
Other pattern in LINQ expressions
TestUnreferencedEnum
TestAttributeInheritance
TestStringConstructor
TestAssemblyAndModuleAttributes
TestAttributeExpressions
TestParameterAttributes
TestPropertyAndEventAttributes
TestNecessaryEETypeReflection
TestCreateDelegate
TestGetUninitializedObject
TestInstanceFields
TestReflectionInvoke
TestInvokeMemberParamsCornerCase
TestDefaultInterfaceInvoke
TestCovariantReturnInvoke
TestThreadStaticFields
TestByRefReturnInvoke
Process 59310 exited with status = 100 (0x00000064)

@filipnavara
Member

I pushed changes to my branch that get the PInvoke smoke test passing. The MAP_JIT protection was the easy part. Apparently TLS access was trashing some registers in the tls_get_var function, and that prevented the stubs from working correctly.

@filipnavara
Member

Looks like we don't need to change gcenv.unix.cpp now that OS_PAGE_SIZE is fixed?

Apparently we still have to, I checked. Not sure what's different from regular CoreCLR though.

@filipnavara
Member

filipnavara commented Sep 8, 2022

The remaining failures seem to have some memory trashing going on. There may be something suspicious going on in RhpCheckedLockCmpXchg since the trashed locations seem to be variables written by Interlocked.CompareExchange. Or possibly the GC stack scanning is missing something and the heap gets compacted with live references incorrectly discarded...

@filipnavara
Member

aaaaaargh:

.macro PREPARE_EXTERNAL_VAR Name, HelperReg
#if defined(__APPLE__)
        adrp \HelperReg, C_FUNC(\Name)@GOTPAGE
        ldr  \HelperReg, [\HelperReg, C_FUNC(\Name)@GOTPAGEOFF]
#else
        adrp \HelperReg, C_FUNC(\Name)
        add  \HelperReg, \HelperReg, :lo12:C_FUNC(\Name)
#endif
.endm

.macro PREPARE_EXTERNAL_VAR_INDIRECT Name, HelperReg
#if defined(__APPLE__)
        adrp \HelperReg, C_FUNC(\Name)@GOTPAGE
        ldr  \HelperReg, [\HelperReg, C_FUNC(\Name)@GOTPAGEOFF]
#else
        adrp \HelperReg, C_FUNC(\Name)
        ldr  \HelperReg, [\HelperReg, :lo12:C_FUNC(\Name)]
#endif
.endm

Spot the mistake.
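
One reading of the mistake, offered as an assumption rather than something stated in the thread: in the `__APPLE__` branch both macros emit the same `adrp`/`ldr` pair through the GOT, so the `_INDIRECT` variant ends up with the variable's address instead of its value. The toy model below uses made-up addresses and hypothetical function names; it only illustrates the missing extra load and is not the runtime's code.

```python
# Toy memory: address -> 64-bit value. All addresses and values are invented.
GOT_SLOT = 0x1000     # hypothetical GOT entry for the external variable
VAR_ADDR = 0x2000     # hypothetical address of the variable itself
VAR_VALUE = 0xCAFE    # hypothetical value stored in the variable

memory = {GOT_SLOT: VAR_ADDR, VAR_ADDR: VAR_VALUE}


def ldr(addr):
    """Model of `ldr reg, [addr]`: read one value from memory."""
    return memory[addr]


def prepare_external_var():
    # adrp @GOTPAGE + ldr @GOTPAGEOFF: one load, yields the variable's address
    return ldr(GOT_SLOT)


def prepare_external_var_indirect_as_posted():
    # Same instruction sequence as above, so it also yields the address
    return ldr(GOT_SLOT)


def prepare_external_var_indirect_expected():
    # An indirect access would need a second load to reach the value
    return ldr(ldr(GOT_SLOT))


print(hex(prepare_external_var_indirect_as_posted()))
print(hex(prepare_external_var_indirect_expected()))
```

Under this reading, callers of the `_INDIRECT` macro would treat an address as if it were the variable's contents, which would fit the memory trashing described earlier.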

@filipnavara
Member

Somewhere along the way I broke the Reflection test... the weird part is that it's because of a corrupted pointer in the data section and it's already corrupted at the process start before any code is run:

Process 25618 launched: '/Users/filipnavara/Projects/runtime/artifacts/tests/coreclr/OSX.arm64.Debug/nativeaot/SmokeTests/Reflection/Reflection/native/Reflection' (arm64)
(lldb) p *(void **)0x00000001008e7ffc
(void *) $1 = 0x00000000004c3140

...and the section has no relocations (screenshot not preserved).

I am running out of ideas on what could have possibly caused that.

@filipnavara
Member

filipnavara commented Sep 8, 2022

Eh, I traced back the Relocation failure to https://github.com/dotnet/llvm-project/blob/cb1c615abd1a871bd2d2a105325aaa84ee5913b5/llvm/tools/objwriter/objwriter.cpp#L257-L262 ... will try to fix it, or submit revert of the code block to llvm-project for objwriter.

I did a rebuild and I could not reproduce it anymore...

@filipnavara
Member

filipnavara commented Sep 8, 2022

With the current state of things, the smoke tests sometimes pass on my machine. Other times they fail here:

* thread #172, stop reason = signal SIGABRT
  * frame #0: 0x00000001a0366224 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x00000001a039ccec libsystem_pthread.dylib`pthread_kill + 288
    frame #2: 0x00000001a02d62d8 libsystem_c.dylib`abort + 180
    frame #3: 0x00000001000a6980 UnitTests`::PalHijack(hThread=0x00006000002040e0, pThreadToHijack=0x0000000123004740) at PalRedhawkUnix.cpp:1041:9
    frame #4: 0x000000010002bad0 UnitTests`Thread::Hijack(this=0x0000000123004740) at thread.cpp:616:5
    frame #5: 0x000000010002db84 UnitTests`ThreadStore::SuspendAllThreads(this=0x0000600000202d80, waitForGCEvent=true) at threadstore.cpp:278:36
    frame #6: 0x000000010001cb64 UnitTests`GCToEEInterface::SuspendEE(reason=SUSPEND_FOR_GC) at gcrhenv.cpp:659:23
    frame #7: 0x000000010004ccc4 UnitTests`WKS::GCHeap::GarbageCollectGeneration(this=0x0000600000004030, gen=0, reason=reason_alloc_soh) at gc.cpp:46900:9
    frame #8: 0x000000010004ec6c UnitTests`WKS::gc_heap::trigger_gc_for_alloc(gen_number=0, gr=reason_alloc_soh, msl=0x0000000100a2c1b8, loh_p=false, take_state=mt_try_budget) at gc.cpp:17691:14
    frame #9: 0x00000001000502e8 UnitTests`WKS::gc_heap::try_allocate_more_space(acontext=0x0000000126004dd0, size=32, flags=0, gen_number=0) at gc.cpp:17841:21
    frame #10: 0x000000010005048c UnitTests`WKS::gc_heap::allocate_more_space(acontext=0x0000000126004dd0, size=32, flags=0, alloc_generation_number=0) at gc.cpp:18320:18
    frame #11: 0x0000000100092394 UnitTests`WKS::GCHeap::Alloc(gc_alloc_context*, unsigned long, unsigned int) at gc.cpp:18351:19
    frame #12: 0x000000010009225c UnitTests`WKS::GCHeap::Alloc(this=0x0000600000004030, context=0x0000000126004dd0, size=32, flags=0) at gc.cpp:45894:34
    frame #13: 0x000000010001bd7c UnitTests`GcAllocInternal(pEEType=0x0000000100a12340, uFlags=0, numElements=0, pThread=0x0000000126004dd0) at gcrhenv.cpp:267:54
    frame #14: 0x000000010001c064 UnitTests`::RhpGcAlloc(pEEType=0x0000000100a12340, uFlags=0, numElements=0, pTransitionFrame=0x00000001705eecd0) at gcrhenv.cpp:303:12
    frame #15: 0x00000001000c3f5c UnitTests`RhpNewObject at AllocFast.S:88
    frame #16: 0x0000000100377d78 UnitTests`RhNewObject + 264
    frame #17: 0x0000000100373b64 UnitTests`S_P_CoreLib_System_Runtime_RuntimeImports__RhNewObject_0 + 36
    frame #18: 0x00000001003dff08 UnitTests`S_P_CoreLib_Internal_Runtime_ThreadStatics__AllocateThreadStaticStorageForType + 312
    frame #19: 0x00000001003dfc20 UnitTests`S_P_CoreLib_Internal_Runtime_ThreadStatics__GetThreadStaticBaseForTypeSlow + 112
    frame #20: 0x00000001003dfb84 UnitTests`S_P_CoreLib_Internal_Runtime_ThreadStatics__GetThreadStaticBaseForType + 228
    frame #21: 0x00000001002d2564 UnitTests`S_P_CoreLib_System_Threading_Thread__StartThread + 116
    frame #22: 0x00000001002d2e50 UnitTests`S_P_CoreLib_System_Threading_Thread__ThreadEntryPoint + 32
    frame #23: 0x00000001a039d06c libsystem_pthread.dylib`_pthread_start + 148

UPD: Updated PalHijack, now I get Assertion failed: (dont_restart_ee_p), function background_mark_phase, file gc.cpp, line 34647.

@filipnavara
Member

I'll probably try to clean up my branch and submit a PR soon.

@ghost added the in-pr label (There is an active PR which will close this issue when it is merged) on Sep 8, 2022

@filipnavara
Member

filipnavara commented Sep 8, 2022

That's GC interruption. You need to disable it in lldb with proc han -s false SIGUSR1. I do get intermittent GC-related asserts in that test, but sometimes it passes.

@am11
Member Author

am11 commented Sep 8, 2022

this assertion is failing:

* thread #87, stop reason = hit program assert
    frame #4: 0x00000001000818bc UnitTests`WKS::gc_heap::background_promote_callback(ppObject=0x000000017018eb90, sc=0x0000000170fc6a00, flags=1) at gc.cpp:35589:5
   35586	    UNREFERENCED_PARAMETER(sc);
   35587	    //in order to save space on the array, mark the object,
   35588	    //knowing that it will be visited later
-> 35589	    assert (settings.concurrent);
   35590	
   35591	    THREAD_NUMBER_FROM_CONTEXT;
   35592	#ifndef MULTIPLE_HEAPS
Target 0: (UnitTests) stopped.

@filipnavara
Member

Yep, that matches what I get. Not every time though.

@filipnavara
Member

filipnavara commented Sep 8, 2022

We should probably fix the GC_PAGE_SIZE definition (the linked code begins with the comment "// The value of card_size is determined empirically according to the average size of an object"). Unfortunately, I have a bit of trouble capturing the GC failures under lldb.

UPD: Updating GC_PAGE_SIZE blows up really quickly. Nobody expects the Spanish Inquisition, nor the 16 KB page size.
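
To illustrate why a hard-coded 4 KB page assumption can bite on a 16 KB-page system, here is a small sketch of the alignment arithmetic. The helper names and the sample address are hypothetical; nothing here is taken from the GC sources.

```python
import mmap

PAGE_4K = 0x1000
PAGE_16K = 0x4000


def is_aligned(value, alignment):
    """True if value is a multiple of alignment (alignment must be a power of two)."""
    return (value & (alignment - 1)) == 0


def align_up(value, alignment):
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


addr = 0x101805000  # made-up address that is 4 KB-aligned but not 16 KB-aligned

print(is_aligned(addr, PAGE_4K))         # aligned under a 4 KB assumption
print(is_aligned(addr, PAGE_16K))        # not aligned for a 16 KB page
print(hex(align_up(addr, PAGE_16K)))     # next 16 KB boundary
print(mmap.PAGESIZE)                     # page size reported on the current machine
```

Any code that rounds to a compile-time 4 KB constant and then hands the result to an OS call that expects real page boundaries could fail in this way, which is one plausible source of the breakage.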

@filipnavara
Member

I finally managed to get the assert under lldb. The interesting thing is that two threads try to do GC at the same time:

  thread #73
    frame #0: 0x00000001000b9cbc UnitTests`BitStreamReader::DecodeVarLengthUnsigned(this=0x0000000170935bb0, base=8) at gcinfodecoder.h:382:13
    frame #1: 0x00000001000b967c UnitTests`GcInfoDecoder::GcInfoDecoder(this=0x0000000170935bb0, gcInfoToken=(Info = 0x0000000100878a21, Version = 2), flags=DECODE_SECURITY_OBJECT | DECODE_VARARG | DECODE_GC_LIFETIMES, breakOffset=667) at gcinfodecoder.cpp:150:29
    frame #2: 0x00000001000ba7e8 UnitTests`GcInfoDecoder::GcInfoDecoder(this=0x0000000170935bb0, gcInfoToken=(Info = 0x0000000100878a21, Version = 2), flags=DECODE_SECURITY_OBJECT | DECODE_VARARG | DECODE_GC_LIFETIMES, breakOffset=667) at gcinfodecoder.cpp:100:1
    frame #3: 0x00000001000c0fd8 UnitTests`UnixNativeCodeManager::EnumGcRefs(this=0x0000600000c04030, pMethodInfo=0x0000000170935f98, safePointAddress=0x00000001002adaec, pRegisterSet=0x0000000170935e40, hCallback=0x0000000170935ca0, isActiveStackFrame=false) at UnixNativeCodeManager.cpp:190:19
    frame #4: 0x000000010001ba44 UnitTests`RedhawkGCInterface::EnumGcRefs(pCodeManager=0x0000600000c04030, pMethodInfo=0x0000000170935f98, safePointAddress=0x00000001002adaec, pRegisterSet=0x0000000170935e40, pfnEnumCallback=0x000000010006f5b8, pvCallbackData=0x0000000170936258, isActiveStackFrame=false) at gcrhenv.cpp:377:19
    frame #5: 0x000000010002ac80 UnitTests`Thread::GcScanRootsWorker(this=0x0000000123604080, pfnEnumCallback=0x000000010006f5b8, pvCallbackData=0x0000000170936258, frameIterator=0x0000000170935e20) at thread.cpp:514:17
    frame #6: 0x000000010002a918 UnitTests`Thread::GcScanRoots(this=0x0000000123604080, pfnEnumCallback=0x000000010006f5b8, pvCallbackData=0x0000000170936258) at thread.cpp:404:5
    frame #7: 0x000000010001d7c4 UnitTests`GCToEEInterface::GcScanRoots(fn=(UnitTests`WKS::GCHeap::Promote(Object**, ScanContext*, unsigned int) at gc.cpp:45418), condemned=0, max_gen=2, sc=0x0000000170936258)(Object**, ScanContext*, unsigned int), int, int, ScanContext*) at gcrhscan.cpp:62:22
    frame #8: 0x0000000100099770 UnitTests`GCScan::GcScanRoots(fn=(UnitTests`WKS::GCHeap::Promote(Object**, ScanContext*, unsigned int) at gc.cpp:45418), condemned=0, max_gen=2, sc=0x0000000170936258)(Object**, ScanContext*, unsigned int), int, int, ScanContext*) at gcscan.cpp:152:5
    frame #9: 0x000000010005b584 UnitTests`WKS::gc_heap::mark_phase(condemned_gen_number=0, mark_only_p=NO) at gc.cpp:26389:9
    frame #10: 0x000000010005751c UnitTests`WKS::gc_heap::gc1() at gc.cpp:20977:13
    frame #11: 0x000000010006604c UnitTests`WKS::gc_heap::garbage_collect(n=0) at gc.cpp:22716:17
    frame #12: 0x000000010004c9a0 UnitTests`WKS::GCHeap::GarbageCollectGeneration(this=0x0000600000008010, gen=0, reason=reason_alloc_soh) at gc.cpp:46935:9
    frame #13: 0x000000010004e8e4 UnitTests`WKS::gc_heap::trigger_gc_for_alloc(gen_number=0, gr=reason_alloc_soh, msl=0x0000000100a2c3d0, loh_p=false, take_state=mt_try_budget) at gc.cpp:17692:14
    frame #14: 0x000000010004ff60 UnitTests`WKS::gc_heap::try_allocate_more_space(acontext=0x00000001027041a0, size=32, flags=0, gen_number=0) at gc.cpp:17842:21
    frame #15: 0x0000000100050104 UnitTests`WKS::gc_heap::allocate_more_space(acontext=0x00000001027041a0, size=32, flags=0, alloc_generation_number=0) at gc.cpp:18321:18
    frame #16: 0x0000000100092018 UnitTests`WKS::GCHeap::Alloc(gc_alloc_context*, unsigned long, unsigned int) at gc.cpp:18352:19
    frame #17: 0x0000000100091ee0 UnitTests`WKS::GCHeap::Alloc(this=0x0000600000008010, context=0x00000001027041a0, size=32, flags=0) at gc.cpp:45893:34
    frame #18: 0x000000010001b554 UnitTests`GcAllocInternal(pEEType=0x0000000100a12528, uFlags=0, numElements=0, pThread=0x00000001027041a0) at gcrhenv.cpp:267:54
    frame #19: 0x000000010001b83c UnitTests`::RhpGcAlloc(pEEType=0x0000000100a12528, uFlags=0, numElements=0, pTransitionFrame=0x0000000170936cd0) at gcrhenv.cpp:303:12
    frame #20: 0x00000001000c3c30 UnitTests`RhpNewObject at AllocFast.S:88
    frame #21: 0x00000001003493d8 UnitTests`_S_P_CoreLib_System_Runtime_RuntimeExports__RhNewObject(pEEType=0x0000000100a12528) at RuntimeExports.cs:52
    frame #22: 0x00000001003451c4 UnitTests`_S_P_CoreLib_System_Runtime_RuntimeImports__RhNewObject_0(pEEType=S_P_CoreLib_System_EETypePtr @ 0x0000000170936dd8) at RuntimeImports.cs:355
    frame #23: 0x00000001003b1568 UnitTests`_S_P_CoreLib_Internal_Runtime_ThreadStatics__AllocateThreadStaticStorageForType(typeManager=S_P_CoreLib_Internal_Runtime_TypeManagerHandle @ 0x0000000170936e68, typeTlsIndex=9) at ThreadStatics.cs:110
    frame #24: 0x00000001003b1280 UnitTests`_S_P_CoreLib_Internal_Runtime_ThreadStatics__GetThreadStaticBaseForTypeSlow(pModuleData=0x000000010093b7f0, typeTlsIndex=9) at ThreadStatics.cs:50
    frame #25: 0x00000001003b11e4 UnitTests`_S_P_CoreLib_Internal_Runtime_ThreadStatics__GetThreadStaticBaseForType(pModuleData=0x000000010093b7f0, typeTlsIndex=9) at ThreadStatics.cs:35
    frame #26: 0x00000001002a3ab4 UnitTests`_S_P_CoreLib_System_Threading_Thread__StartThread(parameter=4330574968) at Thread.NativeAot.cs:411
    frame #27: 0x00000001002a43b0 UnitTests`_S_P_CoreLib_System_Threading_Thread__ThreadEntryPoint(parameter=4330574968) at Thread.NativeAot.Unix.cs:113
    frame #28: 0x00000001a039d06c libsystem_pthread.dylib`_pthread_start + 148
  thread #74
    frame #0: 0x00000001a03615e4 libsystem_kernel.dylib`__psynch_cvwait + 8
    frame #1: 0x00000001a039d638 libsystem_pthread.dylib`_pthread_cond_wait + 1232
    frame #2: 0x00000001000a985c UnitTests`GCEvent::Impl::Wait(this=0x0000600002c04300, milliseconds=4294967295, alertable=false) at events.cpp:153:22
    frame #3: 0x00000001000a97b0 UnitTests`GCEvent::Wait(this=0x0000600000008020, timeout=4294967295, alertable=false) at events.cpp:262:20
    frame #4: 0x0000000100031df8 UnitTests`WKS::GCHeap::WaitUntilGCComplete(this=0x0000600000008010, bConsiderGCStart=false) at gcee.cpp:285:40
    frame #5: 0x000000010001b9b0 UnitTests`RedhawkGCInterface::WaitForGCCompletion() at gcrhenv.cpp:327:35
    frame #6: 0x000000010002ccc0 UnitTests`ThreadStore::AttachCurrentThread(fAcquireThreadStoreLock=true) at threadstore.cpp:131:9
    frame #7: 0x000000010002cda8 UnitTests`ThreadStore::AttachCurrentThread() at threadstore.cpp:148:5
    frame #8: 0x000000010002c0cc UnitTests`Thread::ReversePInvokeAttachOrTrapThread(this=0x00000001235041a0, pFrame=0x0000000170bf2fa8) at thread.cpp:1172:9
    frame #9: 0x000000010002c608 UnitTests`::RhpReversePInvokeAttachOrTrapThread2(pFrame=0x0000000170bf2fa8) at thread.cpp:1349:28
    frame #10: 0x000000010002c73c UnitTests`::RhpReversePInvoke(pFrame=0x0000000170bf2fa8) at thread.cpp:1363:5
    frame #11: 0x00000001002a43a4 UnitTests`_S_P_CoreLib_System_Threading_Thread__ThreadEntryPoint(parameter=4330572752) at Thread.NativeAot.Unix.cs:112
    frame #12: 0x00000001a039d06c libsystem_pthread.dylib`_pthread_start + 148
* thread #75, stop reason = hit program assert
    frame #0: 0x00000001a0366224 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x00000001a039ccec libsystem_pthread.dylib`pthread_kill + 288
    frame #2: 0x00000001a02d62d8 libsystem_c.dylib`abort + 180
    frame #3: 0x00000001a02d5630 libsystem_c.dylib`__assert_rtn + 272
  * frame #4: 0x0000000100081cb4 UnitTests`WKS::gc_heap::background_promote_callback(ppObject=0x0000000170076f30, sc=0x0000000170d96a00, flags=0) at gc.cpp:35589:5
    frame #5: 0x000000010001da9c UnitTests`GcEnumObject(ppObj=0x0000000170076f30, flags=0, fnGcEnumRef=(UnitTests`WKS::gc_heap::background_promote_callback(Object**, ScanContext*, unsigned int) at gc.cpp:35585), pSc=0x0000000170d96a00)(Object**, ScanContext*, unsigned int), ScanContext*) at gcrhscan.cpp:119:9
    frame #6: 0x000000010001ba8c UnitTests`EnumGcRefsCallback(hCallback=0x0000000170d96460, pObject=0x0000000170076f30, flags=0) at gcrhenv.cpp:359:5
    frame #7: 0x00000001000c08ac UnitTests`GcInfoDecoder::ReportStackSlotToGC(this=0x0000000170d96370, spOffset=32, spBase=GC_FRAMEREG_REL, gcFlags=0, pRD=0x0000000170d96600, flags=0, pCallBack=(UnitTests`EnumGcRefsCallback(void*, void**, unsigned int) at gcrhenv.cpp:356), hCallBack=0x0000000170d96460)(void*, void**, unsigned int), void*) at gcinfodecoder.cpp:2018:5
    frame #8: 0x00000001000bf4ec UnitTests`GcInfoDecoder::ReportSlotToGC(this=0x0000000170d96370, slotDecoder=0x0000000170d95fa0, slotIndex=6, pRD=0x0000000170d96600, reportScratchSlots=true, inputFlags=0, pCallBack=(UnitTests`EnumGcRefsCallback(void*, void**, unsigned int) at gcrhenv.cpp:356), hCallBack=0x0000000170d96460)(void*, void**, unsigned int), void*) at gcinfodecoder.h:698:17
    frame #9: 0x00000001000bf5d4 UnitTests`GcInfoDecoder::ReportUntrackedSlots(this=0x0000000170d96370, slotDecoder=0x0000000170d95fa0, pRD=0x0000000170d96600, inputFlags=0, pCallBack=(UnitTests`EnumGcRefsCallback(void*, void**, unsigned int) at gcrhenv.cpp:356), hCallBack=0x0000000170d96460)(void*, void**, unsigned int), void*) at gcinfodecoder.cpp:1032:9
    frame #10: 0x00000001000bce08 UnitTests`GcInfoDecoder::EnumerateLiveSlots(this=0x0000000170d96370, pRD=0x0000000170d96600, reportScratchSlots=false, inputFlags=0, pCallBack=(UnitTests`EnumGcRefsCallback(void*, void**, unsigned int) at gcrhenv.cpp:356), hCallBack=0x0000000170d96460)(void*, void**, unsigned int), void*) at gcinfodecoder.cpp:981:9
    frame #11: 0x00000001000c1068 UnitTests`UnixNativeCodeManager::EnumGcRefs(this=0x0000600000c04030, pMethodInfo=0x0000000170d96758, safePointAddress=0x00000001002a3b64, pRegisterSet=0x0000000170d96600, hCallback=0x0000000170d96460, isActiveStackFrame=false) at UnixNativeCodeManager.cpp:206:18
    frame #12: 0x000000010001ba44 UnitTests`RedhawkGCInterface::EnumGcRefs(pCodeManager=0x0000600000c04030, pMethodInfo=0x0000000170d96758, safePointAddress=0x00000001002a3b64, pRegisterSet=0x0000000170d96600, pfnEnumCallback=0x0000000100081c5c, pvCallbackData=0x0000000170d96a00, isActiveStackFrame=false) at gcrhenv.cpp:377:19
    frame #13: 0x000000010002ac80 UnitTests`Thread::GcScanRootsWorker(this=0x0000000102404d90, pfnEnumCallback=0x0000000100081c5c, pvCallbackData=0x0000000170d96a00, frameIterator=0x0000000170d965e0) at thread.cpp:514:17
    frame #14: 0x000000010002a918 UnitTests`Thread::GcScanRoots(this=0x0000000102404d90, pfnEnumCallback=0x0000000100081c5c, pvCallbackData=0x0000000170d96a00) at thread.cpp:404:5
    frame #15: 0x000000010001d7c4 UnitTests`GCToEEInterface::GcScanRoots(fn=(UnitTests`WKS::gc_heap::background_promote_callback(Object**, ScanContext*, unsigned int) at gc.cpp:35585), condemned=2, max_gen=2, sc=0x0000000170d96a00)(Object**, ScanContext*, unsigned int), int, int, ScanContext*) at gcrhscan.cpp:62:22
    frame #16: 0x0000000100099770 UnitTests`GCScan::GcScanRoots(fn=(UnitTests`WKS::gc_heap::background_promote_callback(Object**, ScanContext*, unsigned int) at gc.cpp:35585), condemned=2, max_gen=2, sc=0x0000000170d96a00)(Object**, ScanContext*, unsigned int), int, int, ScanContext*) at gcscan.cpp:152:5
    frame #17: 0x0000000100058bc8 UnitTests`WKS::gc_heap::background_mark_phase() at gc.cpp:34578:5
    frame #18: 0x00000001000574e0 UnitTests`WKS::gc_heap::gc1() at gc.cpp:20968:13
    frame #19: 0x0000000100081048 UnitTests`WKS::gc_heap::bgc_thread_function() at gc.cpp:35926:9
    frame #20: 0x0000000100080ec0 UnitTests`WKS::gc_heap::bgc_thread_stub(arg=0x0000000000000000) at gc.cpp:33887:5
    frame #21: 0x000000010001d63c UnitTests`GCToEEInterface::CreateThread(this=0x0000000170936680, argument=0x0000000170936680)(void*), void*, bool, char const*)::$_0::operator()(void*) const at gcrhenv.cpp:1234:9
    frame #22: 0x000000010001d540 UnitTests`GCToEEInterface::CreateThread(argument=0x0000000170936680)(void*), void*, bool, char const*)::$_0::__invoke(void*) at gcrhenv.cpp:1211:23
    frame #23: 0x00000001a039d06c libsystem_pthread.dylib`_pthread_start + 148

@jkotas
Member

jkotas commented Sep 8, 2022

You may be seeing #75298

@VSadov
Member

VSadov commented Sep 8, 2022

assert (settings.concurrent); is indeed #75298

@filipnavara
Member

Confirmed, with #75298 I no longer see the crash.

@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Sep 11, 2022
@am11
Member Author

am11 commented Sep 15, 2022

The .NET 8 installer for SDK 8.0.100-alpha.1.22464.43 is available, and console, classlib and mvc (C# and F#) apps seem to be working fine on M1 when published with:

dotnet8 publish -c release --use-current-runtime -p:'PublishAot=true;StripSymbols=true'

@am11
Member Author

am11 commented Sep 15, 2022

BTW, I haven't found any unusual warnings if we delete this suppression:

<_IgnoreLinkerWarnings Condition="'$(TargetOS)' == 'OSX'">true</_IgnoreLinkerWarnings>

Is it still relevant for osx-x64?

@filipnavara
Member

Is it still relevant for osx-x64?

Yes, it is. There are warnings about conversion of DWARF to compact unwinding.

@am11
Member Author

am11 commented Sep 15, 2022

We can exclude TargetArchitecture=arm64
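
As a rough sketch of that exclusion, the condition on the suppression quoted above could be extended so it no longer applies to arm64. This assumes the `TargetArchitecture` property holds the value `arm64` at the point where this line is evaluated; it is an illustration, not necessarily the change that was eventually made:

```xml
<!-- Hypothetical: keep ignoring linker warnings on OSX, except when targeting arm64 -->
<_IgnoreLinkerWarnings Condition="'$(TargetOS)' == 'OSX' and '$(TargetArchitecture)' != 'arm64'">true</_IgnoreLinkerWarnings>
```

With a condition like this, osx-x64 would keep the suppression for the DWARF to compact unwinding warnings, while osx-arm64 builds would surface linker warnings again.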

@ghost ghost locked as resolved and limited conversation to collaborators Oct 15, 2022