pProfInterface is used after being freed by TerminateProfiling #13413
Comments
Thanks @echesakovMSFT, this is probably a bug in the runtime, but I need to investigate a little to make sure. The profiler is not expected to ever unregister the ELT hooks (in fact I think we explicitly block it from doing so). We would run finalizers on shutdown on desktop, so I am curious to know whether anything has changed or whether this is a longstanding bug.
A similar race is observed in the CoreMangLib/system/span/RefStructWithSpan test, where an ELT hook is called from a background thread after pProfInterface is destroyed.
@davmason with the patch you provided I still hit the same issue on exit, but with a slightly different trace:
It still hits SIGSEGV on
Thanks @viewizard, there is a race in my proposed fix and I will have to update it.
@viewizard can you rerun your test with the updated fix I have? I ran all of our tests with ELT enabled and didn't see any crashes, so I believe the race conditions are solved.
@davmason I am about to start the final testing for Arm64 ELT hooks. If you want, I can pull down your changes and test them as well.
@echesakovMSFT That would be great, thanks!
@davmason I have a question. What is a profiler expected to do during its Shutdown() call? I presume it should deallocate the memory and destroy all the objects that it created earlier? What if the ELT callbacks rely on that data?
It's up to the profiler when to release memory. If the runtime is shutting down, then the profiler will be unloaded too when the process dies. You could just let the OS reclaim the memory. If the profiler wants to free things on shutdown, it has to set them to null and return early from the ELT stubs once they are null.
@davmason Basically, this means that every ELT hook should have some synchronization mechanism (ideally lock-free) so that it can bypass its logic if the system is shutting down? With your changes I no longer see the issue with a null pProfInterface, but I do see ELT hooks being called after the system was shut down (and a SIGSEGV, since the profiler data had been disposed). If my profiler implementation accounted for this, then I think I wouldn't see any issue at all.
@davmason I have tested CoreCLR with the new patch on arm32, and it looks like this solves my issue with SIGSEGV.
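For illustration, here is a minimal sketch of the kind of guard discussed above: the hook returns early once the profiler's data has been set to null, and shutdown frees the data only after no hook is still using it. Every name in it (ProfilerState, g_state, g_hooksInFlight, InitProfilerState, OnEnterHook, ShutdownProfilerState) is hypothetical and comes neither from the runtime nor from the ELTProfiler sample. It assumes the hook body is ordinary C++ reached from the ELT stub, and it is only one possible approach, not the fix that went into the runtime.

```cpp
#include <atomic>
#include <thread>

// Hypothetical profiler-owned state that the ELT hooks read and update.
struct ProfilerState {
    std::atomic<long> enterCalls{0};
};

// Illustrative globals; not part of any real profiler or runtime API.
static std::atomic<ProfilerState*> g_state{nullptr};
static std::atomic<int> g_hooksInFlight{0};

void InitProfilerState() {
    g_state.store(new ProfilerState());
}

// Body of an Enter hook. Returns true if the hook did its work and
// false if it bailed out because the state was already torn down.
bool OnEnterHook() {
    // Register as in flight BEFORE loading the pointer, so shutdown
    // cannot free the state between our load and our use of it.
    g_hooksInFlight.fetch_add(1);
    ProfilerState* state = g_state.load();
    bool ran = false;
    if (state != nullptr) {
        state->enterCalls.fetch_add(1);
        ran = true;
    }
    g_hooksInFlight.fetch_sub(1);
    return ran;
}

// Intended to be called from the profiler's Shutdown(). Returns the
// number of Enter calls recorded, or 0 if the state was already gone.
long ShutdownProfilerState() {
    // Unpublish first: hooks that start after this point see nullptr.
    ProfilerState* state = g_state.exchange(nullptr);
    if (state == nullptr) {
        return 0;
    }
    // Wait for hooks that may have loaded the old pointer to finish.
    while (g_hooksInFlight.load() != 0) {
        std::this_thread::yield();
    }
    long calls = state->enterCalls.load();
    delete state;
    return calls;
}
```

The hooks never block in this sketch; only shutdown waits. That wait would not finish if a thread were suspended inside a hook, so simply leaving the memory for the OS to reclaim, as suggested above, remains the simpler option.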
It seems that pProfInterface can be accessed by ProfileEnter/ProfileLeave/ProfileTailcall callbacks after TerminateProfiling() was called.
I was running our Pri1 tests with a sample profiler (https://github.com/microsoft/clr-samples/tree/master/ProfilingAPI/ELTProfiler) on linux-arm64 against my PR dotnet/coreclr#26460 and I hit segmentation faults in the GC/Scenarios/FinalizeTimeout test.
The issue can also be reproduced on linux-x64.
What happens:
ProfilingAPIUtility::TerminateProfiling() is called during system shutdown. After that point pProfInterface is freed and set to nullptr.
A ProfileEnter or ProfileLeave callback is called by https://github.com/dotnet/coreclr/blob/9c8ba7773e506db05016f1c278a7a1ea27816dbf/tests/src/GC/Scenarios/FinalizeTimeout/FinalizeTimeout.cs#L75 from a finalizer thread.
The callback then accesses pProfInterface (e.g. https://github.com/dotnet/coreclr/blob/master/src/vm/proftoeeinterfaceimpl.cpp#L10259).
Steps to repro:
You will see the debugger stop at two watchpoints: the first time during ProfilingAPIUtility::LoadProfiler, the second time during ProfilingAPIUtility::TerminateProfiling().
Then it will stop on the SIGSEGV.
@davmason @noahfalk Is this a supported scenario for profiling?
cc @dotnet/dotnet-diag
Related issues: #11885 dotnet/coreclr#22712