Update llvm-libunwind from v9.0.0 to v14.0.6 #72442
I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label. |
@@ -117,23 +83,12 @@ namespace libunwind {
// __eh_frame_hdr_start = SIZEOF(.eh_frame_hdr) > 0 ? ADDR(.eh_frame_hdr) : 0;
// __eh_frame_hdr_end = SIZEOF(.eh_frame_hdr) > 0 ? . : 0;

#ifndef _LIBUNWIND_USE_ONLY_DWARF_INDEX
_LIBUNWIND_USE_ONLY_DWARF_INDEX, _LIBUNWIND_BAREMETAL_DWARF_INDEX_SEC_START and _LIBUNWIND_BAREMETAL_DWARF_INDEX_SEC_END were added in dotnet/corert#8271, but they always use the default value and are not defined anywhere else. I removed them as part of resolving merge conflicts (and to avoid future conflicts).
One test is failing on Linux x64:
Running Test: ThreadLocalStatics.TLSTesting.ThreadLocalStatics_Test
Thread 2 "DynamicGenerics" received signal SIG34, Real-time event 34.
[Switching to Thread 0x7f0b60b3a700 (LWP 186231)]
futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x5628c407bd4c) at ../sysdeps/nptl/futex-internal.h:183
183 ../sysdeps/nptl/futex-internal.h: No such file or directory.
(gdb) thread apply all bt
Thread 3 (Thread 0x7f0b5a23f700 (LWP 186232)):
#0 __libc_read (nbytes=1, buf=0x7f0b5a23ed7f, fd=5) at ../sysdeps/unix/sysv/linux/read.c:26
#1 __libc_read (fd=5, buf=0x7f0b5a23ed7f, nbytes=1) at ../sysdeps/unix/sysv/linux/read.c:24
#2 0x00005628c2c0dddf in ?? ()
#3 0x00007f0b80f03609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#4 0x00007f0b80e28163 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Thread 2 (Thread 0x7f0b60b3a700 (LWP 186231)):
#0 futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x5628c407bd4c) at ../sysdeps/nptl/futex-internal.h:183
#1 __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x5628c407bd50, cond=0x5628c407bd20) at pthread_cond_wait.c:508
#2 __pthread_cond_wait (cond=0x5628c407bd20, mutex=0x5628c407bd50) at pthread_cond_wait.c:638
#3 0x00005628c2be9813 in GCEvent::Impl::Wait(unsigned int, bool) ()
#4 0x00005628c2be92a3 in GCEvent::Wait(unsigned int, bool) ()
#5 0x00005628c2b6ef87 in WKS::GCHeap::WaitUntilGCComplete(bool) ()
#6 0x00005628c2b5bcf7 in RedhawkGCInterface::WaitForGCCompletion() ()
#7 0x00005628c2b679c6 in Thread::WaitForGC(PInvokeTransitionFrame*) ()
#8 0x00005628c2b69710 in RhpWaitForGC2 ()
#9 0x00005628c2dcccfa in S_P_CoreLib_System_Runtime_InternalCalls__RhpSignalFinalizationComplete ()
#10 0x00005628c2dcb094 in S_P_CoreLib_System_Runtime___Finalizer__ProcessFinalizers ()
#11 0x00005628c2b58fce in FinalizerStart(void*) ()
#12 0x00007f0b80f03609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#13 0x00007f0b80e28163 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Thread 1 (Thread 0x7f0b80ce9880 (LWP 186227)):
#0 0x00005628c2ba64b7 in WKS::gc_heap::mark_object_simple1(unsigned char*, unsigned char*) ()
#1 0x00005628c2ba80f2 in WKS::gc_heap::mark_object_simple(unsigned char**) ()
#2 0x00005628c2bacc2c in WKS::GCHeap::Promote(Object**, ScanContext*, unsigned int) ()
#3 0x00005628c2be249f in PromoteObject(Object**, unsigned long*, unsigned long, unsigned long) ()
#4 0x00005628c2be0338 in ScanConsecutiveHandlesWithoutUserData(Object**, Object**, ScanCallbackInfo*, unsigned long*) ()
#5 0x00005628c2be054f in BlockScanBlocksWithoutUserData(TableSegment*, unsigned int, unsigned int, ScanCallbackInfo*) ()
#6 0x00005628c2be1764 in SegmentScanByTypeChain(TableSegment*, unsigned int, void (*)(TableSegment*, unsigned int, unsigned int, ScanCallbackInfo*), ScanCallbackInfo*) ()
#7 0x00005628c2be195f in TableScanHandles(HandleTable*, unsigned int const*, unsigned int, TableSegment* (*)(HandleTable*, TableSegment*, CrstHolderWithState*), void (*)(TableSegment*, unsigned int, unsigned int, ScanCallbackInfo*), ScanCallbackInfo*, CrstHolderWithState*) ()
#8 0x00005628c2bdbe47 in HndScanHandlesForGC(HandleTable*, void (*)(Object**, unsigned long*, unsigned long, unsigned long), unsigned long, unsigned long, unsigned int const*, unsigned int, unsigned int, unsigned int, unsigned int) ()
#9 0x00005628c2be3523 in Ref_TraceNormalRoots(unsigned int, unsigned int, ScanContext*, void (*)(Object**, ScanContext*, unsigned int)) ()
#10 0x00005628c2bdaea6 in GCScan::GcScanHandles(void (*)(Object**, ScanContext*, unsigned int), int, int, ScanContext*) ()
#11 0x00005628c2b98452 in WKS::gc_heap::mark_phase(int, int) ()
#12 0x00005628c2b93e06 in WKS::gc_heap::gc1() ()
#13 0x00005628c2ba3ebd in WKS::gc_heap::garbage_collect(int) ()
#14 0x00005628c2b88a31 in WKS::GCHeap::GarbageCollectGeneration(unsigned int, gc_reason) ()
#15 0x00005628c2bd1e55 in WKS::GCHeap::GarbageCollectTry(int, int, int) ()
#16 0x00005628c2bd1cb2 in WKS::GCHeap::GarbageCollect(int, bool, int) ()
#17 0x00005628c2b594d4 in RhpCollect ()
#18 0x00005628c2dcca39 in S_P_CoreLib_System_Runtime_InternalCalls__RhpCollect ()
#19 0x00005628c2dcc9ec in S_P_CoreLib_System_Runtime_InternalCalls__RhCollect ()
#20 0x00005628c2c5bf94 in S_P_CoreLib_System_GC__Collect_0 ()
#21 0x00005628c2c375e0 in DynamicGenerics_ThreadLocalStatics_TLSTesting__ThreadLocalStatics_Test ()
#22 0x00005628c2c4b843 in DynamicGenerics_EntryPointMain___c___Main_b__0_62 ()
#23 0x00005628c2c4713d in DynamicGenerics_CoreFXTestLibrary_Internal_Runner__RunTestMethod ()
#24 0x00005628c2c46c3f in DynamicGenerics_CoreFXTestLibrary_Internal_Runner__RunTest ()
#25 0x00005628c2c469b4 in DynamicGenerics_CoreFXTestLibrary_Internal_Runner__RunTests ()
#26 0x00005628c2c260fd in DynamicGenerics_EntryPointMain__Main ()
#27 0x00005628c305c107 in DynamicGenerics__Module___MainMethodWrapper ()
#28 0x00005628c305c1a3 in __managed__Main ()
#29 0x00005628c2b56fdf in main () |
SIG34 is how GC suspension interrupts threads asynchronously on Linux. When debugging in GDB, you may want to just pass SIG34 through.
Bypassing SIG34, we get:
Running Test: ThreadLocalStatics.TLSTesting.ThreadLocalStatics_Test
[New Thread 0x7f91b3fff700 (LWP 186654)]
[New Thread 0x7f91b869f700 (LWP 186655)]
[New Thread 0x7f91b37fe700 (LWP 186656)]
[New Thread 0x7f91b2ffd700 (LWP 186657)]
[New Thread 0x7f91b27fc700 (LWP 186658)]
[New Thread 0x7f91b1ffb700 (LWP 186659)]
[New Thread 0x7f91b17fa700 (LWP 186660)]
[New Thread 0x7f91b0ff9700 (LWP 186661)]
[New Thread 0x7f918ffff700 (LWP 186662)]
[New Thread 0x7f918f7fe700 (LWP 186663)]
[New Thread 0x7f918effd700 (LWP 186664)]
[New Thread 0x7f918e7fc700 (LWP 186665)]
[New Thread 0x7f918dffb700 (LWP 186666)]
[New Thread 0x7f918d7fa700 (LWP 186667)]
[New Thread 0x7f918cff9700 (LWP 186668)]
[New Thread 0x7f916bfff700 (LWP 186669)]
[New Thread 0x7f916b7fe700 (LWP 186670)]
Thread 8 "DynamicGenerics" received signal SIGABRT, Aborted.
[Switching to Thread 0x7f91b27fc700 (LWP 186658)]
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007f91db893859 in __GI_abort () at abort.c:79
#2 0x000055f8e39dd9bc in Assert(char const*, char const*, unsigned int, char const*) ()
#3 0x000055f8e3943231 in Thread::HijackReturnAddressWorker(StackFrameIterator*, void**) ()
#4 0x000055f8e3942dbd in Thread::HijackReturnAddress(UNIX_CONTEXT*, void**) ()
#5 0x000055f8e3942c4c in Thread::HijackCallback(UNIX_CONTEXT*, void*) ()
#6 0x000055f8e39bfe48 in ?? ()
#7 <signal handler called>
#8 0x000055f8e3b67631 in S_P_CoreLib_System_Diagnostics_Debug__Assert ()
#9 0x000055f8e3d924e4 in S_P_CoreLib_System_Collections_Concurrent_ConcurrentUnifierWKeyed_2_Container<S_P_CoreLib_System_Reflection_Runtime_TypeInfos_RuntimeConstructedGenericTypeInfo_UnificationKey__System___Canon>__VerifyUnifierConsistency ()
#10 0x000055f8e3d913fe in S_P_CoreLib_System_Collections_Concurrent_ConcurrentUnifierWKeyed_2<S_P_CoreLib_System_Reflection_Runtime_TypeInfos_RuntimeConstructedGenericTypeInfo_UnificationKey__System___Canon>__GetOrAdd ()
#11 0x000055f8e3bcb0d6 in S_P_CoreLib_System_Reflection_Runtime_TypeInfos_RuntimeConstructedGenericTypeInfo__GetRuntimeConstructedGenericTypeInfo_0 ()
#12 0x000055f8e3bcb039 in S_P_CoreLib_System_Reflection_Runtime_TypeInfos_RuntimeConstructedGenericTypeInfo__GetRuntimeConstructedGenericTypeInfo ()
#13 0x000055f8e3be60ff in S_P_CoreLib_System_Reflection_Runtime_General_TypeUnifier__GetConstructedGenericTypeWithTypeHandle ()
#14 0x000055f8e3bc60a8 in S_P_CoreLib_System_Reflection_Runtime_TypeInfos_RuntimeTypeInfo__MakeGenericType ()
#15 0x000055f8e3a0eb32 in DynamicGenerics_ThreadLocalStatics_TLSTesting__MakeType1 ()
#16 0x000055f8e3a26e7f in DynamicGenerics_ThreadLocalStatics_TLSTesting___c__DisplayClass3_0___MultiThreaded_Test_b__0 ()
#17 0x000055f8e3b1fe63 in S_P_CoreLib_System_Threading_Tasks_Task__InnerInvoke ()
#18 0x000055f8e3c35c37 in S_P_CoreLib_System_Threading_Tasks_Task___c____cctor_b__273_0 ()
#19 0x000055f8e3b13641 in S_P_CoreLib_System_Threading_ExecutionContext__RunFromThreadPoolDispatchLoop ()
#20 0x000055f8e3b1fc19 in S_P_CoreLib_System_Threading_Tasks_Task__ExecuteWithThreadLocal ()
#21 0x000055f8e3b1f88d in S_P_CoreLib_System_Threading_Tasks_Task__ExecuteEntryUnsafe ()
#22 0x000055f8e3b1f7ff in S_P_CoreLib_System_Threading_Tasks_Task__ExecuteFromThreadPool ()
#23 0x000055f8e3b19211 in S_P_CoreLib_System_Threading_ThreadPoolWorkQueue__DispatchWorkItem ()
#24 0x000055f8e3b18f5a in S_P_CoreLib_System_Threading_ThreadPoolWorkQueue__Dispatch ()
#25 0x000055f8e3c30fb2 in S_P_CoreLib_System_Threading_PortableThreadPool_WorkerThread__WorkerThreadStart ()
#26 0x000055f8e3e352ea in S_P_CoreLib_System_Threading_ThreadStart__InvokeOpenStaticThunk ()
#27 0x000055f8e3c28e1b in S_P_CoreLib_System_Threading_Thread_StartHelper__RunWorker ()
#28 0x000055f8e3c28d92 in S_P_CoreLib_System_Threading_Thread_StartHelper__Run ()
#29 0x000055f8e3b0f2b2 in S_P_CoreLib_System_Threading_Thread__StartThread ()
#30 0x000055f8e3b0f9c1 in S_P_CoreLib_System_Threading_Thread__ThreadEntryPoint ()
#31 0x00007f91dba6b609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#32 0x00007f91db990163 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 |
It is hard to tell which assert is actually failing. Is it the one in S_P_CoreLib_System_Collections_Concurrent_ConcurrentUnifierWKeyed_2_Container?
Another possibility is that the new libunwind has problems unwinding this particular stack, and that causes an assert. An issue like this would typically not be specific to just one test; there would be other failures.
If only this one scenario is affected, I think we may want to disable the scenario, and I can look at it later. We just need to make sure it is only this scenario and not a random crash on certain stacks.
18e634b to ef0164f
Disabling Do we have more NativeAOT tests in one of the outerloop pipelines? Can we
/azp run runtime-extra-platforms |
Azure Pipelines successfully started running 1 pipeline(s). |
DynamicGenerics is the most targeted test of the runtime type loader. We cannot have it disabled. We could comment out this line instead but that is pretty worrying too:
The only thing special about that test case is that it stresses the GC a lot. I've seen it SIGABRT in this run yesterday: #72236 This might be a recent regression. |
@VSadov another option is that we block this PR and try to find the solution, then push the required changes here rather than disabling the test. Although it is a non-stripped test binary, it doesn't have the line number / file info. I built everything as debug:
$ ./build.sh -s clr+clr.aot+libs
$ src/tests/build.sh nativeaot 'tree nativeaot' /p:LibrariesConfiguration=Debug
$ cd artifacts/tests/coreclr/Linux.x64.Debug/nativeaot/SmokeTests/DynamicGenerics/DynamicGenerics/native
$ file DynamicGenerics
DynamicGenerics: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=2087cb20056da1ec0c47368caae0ab6a1c72e100, for GNU/Linux 3.2.0, with debug_info, not stripped
$ gdb ./DynamicGenerics
...
Reading symbols from ./DynamicGenerics...
Dwarf Error: DW_FORM_strx1 found in non-DWO CU [in module /runtime/artifacts/tests/coreclr/Linux.x64.Debug/nativeaot/SmokeTests/DynamicGenerics/DynamicGenerics/native/DynamicGenerics]
(No debugging symbols found in ./DynamicGenerics)
... so it is a bit tricky to debug without debug symbols. :) |
This reverts commit ef0164f.
I am able to reproduce the failure. |
We are unable to unwind in prologue. We should be able to. Could be something in the new libunwind. An example of the failure is when we try to hijack at a point like the following
I would expect that library tests would fail a lot with this. |
Actually, no, this fix is present. I was looking at the wrong source. It must be something else.
static_cast<uint64_t>(instructionsEnd));

// see DWARF Spec, section 6.4.2 for details on unwind opcodes
while ((p < instructionsEnd) && (codeOffset < pcoffset)) {
I think this should have codeOffset <= pcoffset.
The opcode that updates the position comes before the opcodes that describe the effects of the instruction, so the while loop must be end-inclusive, or it will miss the effects of the last instruction in the range.
If I make the change, DynamicGenerics passes.
I think upstream should have this fix, but maybe not in the version that we are getting.
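To make the off-by-one concrete, here is a minimal sketch of the loop condition in isolation. It is not the libunwind parser: the Op, Insn, cfaOffsetAt and examplePrologue names are hypothetical, only two simplified opcodes are modelled, and the prologue (push rbx, then sub rsp, 16) and the initial CFA offset of 8 are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, heavily simplified stand-ins for two DWARF CFI opcodes.
// Real CFI programs are byte-encoded and have many more opcodes.
enum class Op { AdvanceLoc, DefCfaOffset };

struct Insn {
  Op op;
  uint64_t operand;
};

// Replays `program` and returns the CFA offset in effect at `pcOffset`.
// `inclusive` selects `codeOffset <= pcOffset` instead of `codeOffset < pcOffset`.
uint64_t cfaOffsetAt(const std::vector<Insn>& program, uint64_t pcOffset,
                     bool inclusive) {
  uint64_t codeOffset = 0;
  uint64_t cfaOffset = 8;  // assumed initial rule: only the return address is on the stack
  std::size_t p = 0;
  while (p < program.size() &&
         (inclusive ? codeOffset <= pcOffset : codeOffset < pcOffset)) {
    const Insn& insn = program[p++];
    switch (insn.op) {
      case Op::AdvanceLoc:    // starts a new row at a later code offset
        codeOffset += insn.operand;
        break;
      case Op::DefCfaOffset:  // describes the row that the last advance started
        cfaOffset = insn.operand;
        break;
    }
  }
  return cfaOffset;
}

// Assumed prologue: `push rbx` at offset 0 (1 byte), `sub rsp, 16` at offset 1 (4 bytes).
std::vector<Insn> examplePrologue() {
  return {
      {Op::AdvanceLoc, 1}, {Op::DefCfaOffset, 16},
      {Op::AdvanceLoc, 4}, {Op::DefCfaOffset, 32},
  };
}
```

With this program, a lookup at offset 1 (the push has executed, the sub has not) returns 8 with the exclusive test and 16 with the inclusive one. At offset 2 both variants agree on 16, so the difference only shows up when the pc sits exactly on a row boundary, which is the kind of position an asynchronous hijack can land on.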
@am11 - I have pushed a fix, but feel free to re-apply it if this is not the right way from a change-tracking point of view.
@VSadov, thank you! :)
This one is still < in 14.0.6: https://github.com/llvm/llvm-project/blob/llvmorg-14.0.6/libunwind/src/DwarfParser.hpp#L449 as well as in their main branch. I will add it to our tracking list once the runs of the NativeAOT_Libs_Passing legs have completed.
/azp run runtime-extra-platforms |
Azure Pipelines successfully started running 1 pipeline(s). |
Windows leg timed out |
Thanks @am11 ! |
Closes #72344