Fix for stress failure when adjusting effective IP while stackwalking may put it on a wrong instruction. #100376
/azp run runtime-coreclr gcstress-extra, runtime-coreclr gcstress0x3-gcstress0xc
Azure Pipelines successfully started running 2 pipeline(s).
The few failures that were observed in the stress runs are not new, so I consider this a pass.
CC. @jkotas @jakobbotsch
I think this is ready for a review.
I've just realized that we can now stop doing this adjustment on x86 and on NativeAOT as well. So, there is another commit. It is not strictly necessary for the fix, just some cleanup that we can do.
((call->AsCall()->gtCallType == CT_HELPER) &&
Compiler::s_helperCallProperties.AlwaysThrow(call->AsCall()->GetHelperNum())))
{
// NOTE: We should probably never see a BBJ_ALWAYS block ending with a throw in a first place.
I take it you did see such cases? Can you open an issue for us to follow up and switch them over to BBJ_THROWs?
Note this might largely clean itself up by adding the logic to IsNoReturn, though there is one place in morph where we still check the flag rather than the property.
> I take it you did see such cases?
Yes, I wanted this to be an assert, so it would not happen by accident in the future. To my surprise, the assert was triggered by existing code.
@@ -712,7 +712,9 @@ void CodeGen::genCodeForBBlist()

 if ((call != nullptr) && (call->gtOper == GT_CALL))
 {
-    if ((call->AsCall()->gtCallMoreFlags & GTF_CALL_M_DOES_NOT_RETURN) != 0)
+    if ((call->AsCall()->gtCallMoreFlags & GTF_CALL_M_DOES_NOT_RETURN) != 0 ||
Can we extend GenTreeCall::IsNoReturn() to cover the helper case too? Either by setting the no return flag when the helper is set, or by putting this check there?
I have tried, naively, to put
if (s_helperCallProperties.AlwaysThrow((CorInfoHelpFunc)helper))
{
    result->gtCallMoreFlags |= GTF_CALL_M_DOES_NOT_RETURN;
}
in gtNewHelperCallNode, but that resulted in
Assertion failed 'call->gtCallType == CT_USER_FUNC' in 'System.ModuleHandle:ResolveFieldHandle(int,System.RuntimeTypeHandle[],System.RuntimeTypeHandle[]):System.RuntimeFieldHandle:this' during 'Merge throw blocks' (IL size 283; hash 0x040825d3; FullOpts)
if (!call->IsNoReturn())
{
    continue;
}
// Sanity check -- only user funcs should be marked do not return
assert(call->gtCallType == CT_USER_FUNC);
I could not easily tell whether the assert is just an observation that could be relaxed, or is indeed guarding some assumptions made in the code, so I did not continue on that path.
Commenting out that assert leads to others:
Assert failure(PID 35472 [0x00008a90], Thread: 41468 [0xa1fc]): Assertion failed 'updateCount < optNoReturnCallCount' in 'System.Tests.StringComparerTests:Create_CreatesValidComparer():this' during 'Merge throw blocks' (IL size 543; hash 0xc45bc068; FullOpts)
I think I'd prefer entering an issue on this.
Makes sense, looks like there are some assumptions in the throw helper merging to sort out.
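For readers following the thread, the second option raised above (putting the helper check inside the no-return predicate instead of setting the flag on the node) can be sketched roughly as follows. This is a self-contained toy model, not the JIT's code: `ToyCall`, `CallType`, `kCallDoesNotReturn` and `HelperAlwaysThrows` are hypothetical stand-ins for `GenTreeCall`, `gtCallType`, `GTF_CALL_M_DOES_NOT_RETURN` and `s_helperCallProperties.AlwaysThrow`, and the helper numbers are made up.

```cpp
#include <cstdint>

// Toy stand-ins for the JIT types discussed in this thread (all hypothetical).
enum class CallType { UserFunc, Helper, Indirect };

// Stands in for GTF_CALL_M_DOES_NOT_RETURN.
constexpr uint32_t kCallDoesNotReturn = 0x1;

struct ToyCall
{
    CallType type;
    uint32_t moreFlags;
    int      helperNum; // only meaningful when type == CallType::Helper
};

// Hypothetical property table: pretend helpers 1 and 2 always throw.
inline bool HelperAlwaysThrows(int helperNum)
{
    return helperNum == 1 || helperNum == 2;
}

// Single predicate answering "can control ever return from this call?".
// The flag is left untouched; the helper property is consulted on demand.
inline bool IsNoReturn(const ToyCall& call)
{
    if ((call.moreFlags & kCallDoesNotReturn) != 0)
    {
        return true;
    }
    return (call.type == CallType::Helper) && HelperAlwaysThrows(call.helperNum);
}
```

Note that a wider predicate like this would also be seen by every existing caller of `IsNoReturn()`, which is where the throw-helper merging asserts quoted above come from, so the sketch does not remove the need for the follow-up issue.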
JIT changes LGTM.
Do you plan to get rid of the adjustment in a follow up PR?
In this PR we can only avoid adjustment in a throwing case. That is possible because on this path we will care only about fully-interruptible methods that faulted or called methods that throw (most commonly it is
We stopped adjusting in this case in this PR - including x86 and AOT cases. To get rid of the adjustment in all cases is a bigger change because:
Co-authored-by: Jan Kotas <[email protected]>
Thanks!!!
Fixes: #86273
The change makes sure that an instruction after a throwing call cannot be branched to (and have different GC liveness).
In many cases that was already guaranteed; this change just picks up the remaining cases.
This allows us not to do the -1 adjustment of the effective IP when stackwalking through exceptional/aborting frames. The root cause of #86273 was that this adjustment is not safe under some conditions. One solution would be to:
This change does the #2
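As a rough illustration of why a -1 adjustment can pick up the wrong GC liveness, here is a toy model. It is only a sketch under stated assumptions: `LiveRange`, `LiveSlotsAt` and `EffectiveOffset` are hypothetical names, the offsets are invented, and the runtime's real GC info encoding and stack walker are considerably more involved.

```cpp
#include <cstdint>
#include <vector>

// Toy liveness table: each range of code offsets has a bitmask of live GC slots.
struct LiveRange
{
    uint32_t start;     // first code offset covered (inclusive)
    uint32_t end;       // one past the last code offset covered
    uint32_t liveSlots; // bitmask of stack slots holding live GC references
};

// Returns the live-slot mask recorded for a code offset, or 0 if no range covers it.
inline uint32_t LiveSlotsAt(const std::vector<LiveRange>& ranges, uint32_t offset)
{
    for (const LiveRange& r : ranges)
    {
        if (offset >= r.start && offset < r.end)
        {
            return r.liveSlots;
        }
    }
    return 0;
}

// Offset used for the liveness lookup of a frame.
// 'adjust' models the "-1" step; offset 0 is never moved below the method start.
inline uint32_t EffectiveOffset(uint32_t ip, bool adjust)
{
    return (adjust && ip > 0) ? ip - 1 : ip;
}
```

In this model, if offsets 0..9 report mask 0x1 and offsets 10..19 report mask 0x3, an IP of 10 that already points at the start of an instruction is looked up as offset 9 once adjusted, and the walker sees 0x1 instead of 0x3. The adjustment is meant for return addresses, where moving back by one lands inside the call instruction itself.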