Test failure JIT/Methodical/xxobj/sizeof/sizeof64_Target_64Bit_and_arm_il_r/sizeof64_Target_64Bit_and_arm.cmd #81109

Closed
JulieLeeMSFT opened this issue Jan 24, 2023 · 36 comments
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI blocking-outerloop Blocking the 'runtime-coreclr outerloop' and 'runtime-libraries-coreclr outerloop' runs
Milestone

Comments

@JulieLeeMSFT
Member

Failed in run: runtime-coreclr outerloop 20230123.1

Failed tests:

R2R-CG2 windows arm Checked @ Windows.11.Arm64.Open
  - JIT/Methodical/xxobj/sizeof/sizeof64_Target_64Bit_and_arm_il_r/sizeof64_Target_64Bit_and_arm.cmd

Error message:

Fatal error. Internal CLR error. (0x80131506)
at System.GC.AllocateNewArray(IntPtr, Int32, GC_ALLOC_FLAGS)
at System.Text.StringBuilder.ExpandByABlock(Int32)
at System.Text.StringBuilder.Append(Char, Int32)
at System.Text.StringBuilder.Append(Char)
at System.Diagnostics.StackTrace.ToString(TraceFormat, System.Text.StringBuilder)
at System.Diagnostics.StackTrace.ToString(TraceFormat)
at System.Exception.get_StackTrace()
at System.Exception.ToString()
at Internal.JitInterface.CorInfoImpl.AllocException(System.Exception)
at Internal.JitInterface.CorInfoImpl._resolveToken(IntPtr, IntPtr*, Internal.JitInterface.CORINFO_RESOLVED_TOKEN*)
at Internal.JitInterface.CorInfoImpl.JitCompileMethod(IntPtr ByRef, IntPtr, IntPtr, IntPtr, Internal.JitInterface.CORINFO_METHOD_INFO ByRef, UInt32, IntPtr ByRef, UInt32 ByRef)
at Internal.JitInterface.CorInfoImpl.CompileMethodInternal(ILCompiler.DependencyAnalysis.IMethodNode, Internal.IL.MethodIL)
at Internal.JitInterface.CorInfoImpl.CompileMethod(ILCompiler.DependencyAnalysis.ReadyToRun.MethodWithGCInfo, ILCompiler.Logger)
at ILCompiler.ReadyToRunCodegenCompilation+<>c__DisplayClass46_0.<ComputeDependencyNodeDependencies>g__CompileOneMethod|5(ILCompiler.DependencyAnalysisFramework.DependencyNodeCore`1<ILCompiler.DependencyAnalysis.NodeFactory>, Int32)
at ILCompiler.ReadyToRunCodegenCompilation+<>c__DisplayClass46_0.<ComputeDependencyNodeDependencies>g__CompileOnThread|4(Int32)
at ILCompiler.ReadyToRunCodegenCompilation+<>c__DisplayClass46_0.<ComputeDependencyNodeDependencies>g__CompilationThread|3(System.Object)
at System.Threading.Thread.StartCallback()

Return code:      1
Raw output file:      C:\h\w\A74E094B\w\B27F0950\uploads\xxobj\sizeof\sizeof64_Target_64Bit_and_arm_il_r\output.txt
Raw output:
BEGIN EXECUTION
sizeof64_Target_64Bit_and_arm.dll
TestLibrary.dll
2 file(s) copied.
11:39:15.16
Response file: C:\h\w\A74E094B\w\B27F0950\e\JIT\Methodical\xxobj\sizeof\sizeof64_Target_64Bit_and_arm_il_r\sizeof64_Target_64Bit_and_arm.dll.rsp
C:\h\w\A74E094B\w\B27F0950\e\JIT\Methodical\xxobj\sizeof\sizeof64_Target_64Bit_and_arm_il_r\IL-CG2\sizeof64_Target_64Bit_and_arm.dll
-o:C:\h\w\A74E094B\w\B27F0950\e\JIT\Methodical\xxobj\sizeof\sizeof64_Target_64Bit_and_arm_il_r\sizeof64_Target_64Bit_and_arm.dll
--targetarch:arm
--targetos:windows
--verify-type-and-field-layout
--method-layout:random
-r:C:\h\w\A74E094B\p\System.*.dll
-r:C:\h\w\A74E094B\p\Microsoft.*.dll
-r:C:\h\w\A74E094B\p\mscorlib.dll
-r:C:\h\w\A74E094B\p\netstandard.dll
-O
" "dotnet" "C:\h\w\A74E094B\p\crossgen2\crossgen2.dll" @"C:\h\w\A74E094B\w\B27F0950\e\JIT\Methodical\xxobj\sizeof\sizeof64_Target_64Bit_and_arm_il_r\sizeof64_Target_64Bit_and_arm.dll.rsp"   -r:C:\h\w\A74E094B\w\B27F0950\e\JIT\Methodical\Methodical_r2\IL-CG2\*.dll  -r:C:\h\w\A74E094B\w\B27F0950\e\JIT\Methodical\xxobj\sizeof\sizeof64_Target_64Bit_and_arm_il_r\IL-CG2\*.dll"
Emitting R2R PE file: C:\h\w\A74E094B\w\B27F0950\e\JIT\Methodical\xxobj\sizeof\sizeof64_Target_64Bit_and_arm_il_r\sizeof64_Target_64Bit_and_arm.dll
" "dotnet" "C:\h\w\A74E094B\p\r2rdump\r2rdump.dll" --header --sc --in C:\h\w\A74E094B\w\B27F0950\e\JIT\Methodical\xxobj\sizeof\sizeof64_Target_64Bit_and_arm_il_r\sizeof64_Target_64Bit_and_arm.dll --out C:\h\w\A74E094B\w\B27F0950\e\JIT\Methodical\xxobj\sizeof\sizeof64_Target_64Bit_and_arm_il_r\sizeof64_Target_64Bit_and_arm.dll.r2rdump --val"
11:39:16.88
11:39:16.90
Response file: C:\h\w\A74E094B\w\B27F0950\e\JIT\Methodical\xxobj\sizeof\sizeof64_Target_64Bit_and_arm_il_r\TestLibrary.dll.rsp
C:\h\w\A74E094B\w\B27F0950\e\JIT\Methodical\xxobj\sizeof\sizeof64_Target_64Bit_and_arm_il_r\IL-CG2\TestLibrary.dll
-o:C:\h\w\A74E094B\w\B27F0950\e\JIT\Methodical\xxobj\sizeof\sizeof64_Target_64Bit_and_arm_il_r\TestLibrary.dll
--targetarch:arm
--targetos:windows
--verify-type-and-field-layout
--method-layout:random
-r:C:\h\w\A74E094B\p\System.*.dll
-r:C:\h\w\A74E094B\p\Microsoft.*.dll
-r:C:\h\w\A74E094B\p\mscorlib.dll
-r:C:\h\w\A74E094B\p\netstandard.dll
-O
" "dotnet" "C:\h\w\A74E094B\p\crossgen2\crossgen2.dll" @"C:\h\w\A74E094B\w\B27F0950\e\JIT\Methodical\xxobj\sizeof\sizeof64_Target_64Bit_and_arm_il_r\TestLibrary.dll.rsp"   -r:C:\h\w

Stack trace:

   at Xunit.Assert.True(Nullable`1 condition, String userMessage) in /_/src/xunit.assert/Asserts/BooleanAsserts.cs:line 132
   at TestLibrary.OutOfProcessTest.RunOutOfProcessTest(String basePath, String assemblyPath)
   at Program.<<Main>$>g__TestExecutor7|0_6(<>c__DisplayClass0_0&)
@JulieLeeMSFT JulieLeeMSFT added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI blocking-outerloop Blocking the 'runtime-coreclr outerloop' and 'runtime-libraries-coreclr outerloop' runs labels Jan 24, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jan 24, 2023
@ghost

ghost commented Jan 24, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: JulieLeeMSFT
Assignees: markples
Labels:

area-CodeGen-coreclr, blocking-outerloop

Milestone: -

@JulieLeeMSFT
Member Author

@markples, it is blocking outerloop. PTAL with a high priority.

@JulieLeeMSFT
Member Author

JulieLeeMSFT commented Jan 24, 2023

@trylek, PTAL. We are seeing the same error message from different test cases. #11360 and #81118.
Edit: corrected the issue number to 11360.

Fatal error. Internal CLR error. (0x80131506)
at System.GC.AllocateNewArray(IntPtr, Int32, GC_ALLOC_FLAGS)
at System.Text.StringBuilder.ExpandByABlock(Int32)
at System.Text.StringBuilder.Append(Char, Int32)
at System.Text.StringBuilder.Append(Char)
at System.Diagnostics.StackTrace.ToString(TraceFormat, System.Text.StringBuilder)
at System.Diagnostics.StackTrace.ToString(TraceFormat)
at System.Exception.get_StackTrace()
at System.Exception.ToString()
at Internal.JitInterface.CorInfoImpl.AllocException(System.Exception)
at Internal.JitInterface.CorInfoImpl._resolveToken(IntPtr, IntPtr*, Internal.JitInterface.CORINFO_RESOLVED_TOKEN*)
at Internal.JitInterface.CorInfoImpl.JitCompileMethod(IntPtr ByRef, IntPtr, IntPtr, IntPtr, Internal.JitInterface.CORINFO_METHOD_INFO ByRef, UInt32, IntPtr ByRef, UInt32 ByRef)
at Internal.JitInterface.CorInfoImpl.CompileMethodInternal(ILCompiler.DependencyAnalysis.IMethodNode, Internal.IL.MethodIL)
at Internal.JitInterface.CorInfoImpl.CompileMethod(ILCompiler.DependencyAnalysis.ReadyToRun.MethodWithGCInfo, ILCompiler.Logger)
at ILCompiler.ReadyToRunCodegenCompilation+<>c__DisplayClass46_0.<ComputeDependencyNodeDependencies>g__CompileOneMethod|5(ILCompiler.DependencyAnalysisFramework.DependencyNodeCore`1<ILCompiler.DependencyAnalysis.NodeFactory>, Int32)
at ILCompiler.ReadyToRunCodegenCompilation+<>c__DisplayClass46_0.<ComputeDependencyNodeDependencies>g__CompileOnThread|4(Int32)
at ILCompiler.ReadyToRunCodegenCompilation+<>c__DisplayClass46_0.<ComputeDependencyNodeDependencies>g__CompilationThread|3(System.Object)
at System.Threading.Thread.StartCallback()

@markples
Member

A meta-question about the testing here - when I look at the devops job - https://dev.azure.com/dnceng-public/public/_build/results?buildId=144935&view=logs&j=454a72be-7065-5174-fb75-00cb418aebaf&t=46d7521e-696b-5674-5766-9cffa0247852 - the log file for this test isn't listed. The test is there in the devops test results integration, and if I go to https://dataexplorer.azure.com/clusters/engsrvprod/databases/engineeringdata and manually find the logs I can see the failure there. Is there a problem with how test merging reports the overall job status back to devops?

Separately, and more directly relevant: do we have definitive instructions on how to repro a crossgen failure like this, especially one with "R2R-CG2 windows arm Checked @ Windows.11.Arm64.Open" that mentions both arm and arm64? I have a bunch of things to try, but I don't know how long it will take until one works and I have a repro to work with.

@JulieLeeMSFT
Member Author

Moving @AaronRobinsonMSFT's comment to the right issue: #81120 (comment)

@JulieLeeMSFT JulieLeeMSFT added the Priority:2 Work that is important, but not critical for the release label Jan 24, 2023
@trylek
Member

trylek commented Jan 24, 2023

@markples - reproducing Crossgen2 failures locally should typically amount to building stuff for arm and setting the environment variable RunCrossgen2 to 1 or true (I think it's just checked for being non-empty). For the discrepancy between arm / arm64, I think it's just the same situation as x64 machines being able to execute x86 code by means of the 32-bit subsystem. Please let me know if I missed the point of your comment.
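
As a rough illustration of the repro setup described above, the following Python sketch enables `RunCrossgen2` before launching a test wrapper script. The helper names, the `cmd /c` launch, and the `CORE_ROOT` handling are assumptions made for illustration; they are not taken from the repo's scripts.

```python
import os
import subprocess


def crossgen2_env(base_env=None):
    """Return a copy of the environment with RunCrossgen2 enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # The comment above suggests the variable is only checked for being
    # non-empty, so "1" is used here.
    env["RunCrossgen2"] = "1"
    return env


def run_test_wrapper(script_path, core_root):
    """Run one test .cmd wrapper with Crossgen2 enabled; return its exit code.

    script_path and core_root are hypothetical local paths. CORE_ROOT is
    assumed to be the variable the wrapper uses to find the runtime under test.
    """
    env = crossgen2_env()
    env["CORE_ROOT"] = core_root
    completed = subprocess.run(["cmd", "/c", script_path], env=env)
    return completed.returncode
```

This only covers the environment setup; it assumes the tests and Crossgen2 have already been built for the target architecture.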

@JulieLeeMSFT
Member Author

@trylek, as @markples mentioned, "Send tests to Helix" does not show the log file for this failure:
at System.GC.AllocateNewArray(IntPtr, Int32, GC_ALLOC_FLAGS).
The page shows 2 other failure logs. Why would that be?
@markwilkie, I grabbed the above error message from here. Where would the system get this test failure from? The process seems to be different from other test failure cases.

> A meta-question about the testing here - when I look at the devops job - https://dev.azure.com/dnceng-public/public/_build/results?buildId=144935&view=logs&j=454a72be-7065-5174-fb75-00cb418aebaf&t=46d7521e-696b-5674-5766-9cffa0247852 - the log file for this test isn't listed. The test is there in the devops test results integration, and if I go to https://dataexplorer.azure.com/clusters/engsrvprod/databases/engineeringdata and manually find the logs I can see the failure there. Is there a problem with how test merging reports the overall job status back to devops?

@markples
Member

@JulieLeeMSFT What is the connection to #11360? One of the same tests failed, but I don't see any mention of a failure mode like this one in it.

@JulieLeeMSFT
Member Author

Error message is the same. #11360 (comment)

@markples
Member

> @markples - reproducing Crossgen2 failures locally should typically amount to building stuff for arm and setting the environment variable RunCrossgen2 to 1 or true (I think it's just checked for being non-empty). For the discrepancy between arm / arm64, I think it's just the same situation as x64 machines being able to execute x86 code by means of the 32-bit subsystem. Please let me know if I missed the point of your comment.

Thanks @trylek. This is basically the point of my comment except that I haven't become confident that I'm reproing the correct way outside of the default testing modes. For this I tried:

  • (wasn't expecting this to work) - RunCrossgen2 fully under x64
  • RunCrossgen2 on build -a arm on an x64 machine. I think I had to mess with the scripts a bit. crossgen2 (managed) actually just ran fine, which is where the failure occurred in the lab. r2rdump (runs after crossgen2) was unhappy but I just hit cancel on its popup.
  • Copied the build -a arm artifacts to an arm machine. I had to mess with the scripts again. Also my local build (I did build tree ...\Methodical) ended up with TestLibrary.cmd in a different place (I think with the merged test rather than the individual location like seen in the failure log - but I'd have to recheck the details)

None of these reproed it. However, I'm now a bit more confident that my steps would have been sufficient and that this simply isn't going to repro; I will add a separate response here about that. (If something above just seems wrong, I'd be happy to hear corrections for next time!)

@markples
Member

> Error message is the same. #11360 (comment)

I don't understand. The discussion in 11360 is about a registry key. I don't see the Internal CLR Error / AllocateNewArray / StringBuilder or the AllocException / _resolveToken pattern anywhere in there.

@markples
Member

> @trylek, as @markples mentioned, "Send tests to Helix" does not show the log file for this failure: at System.GC.AllocateNewArray(IntPtr, Int32, GC_ALLOC_FLAGS). The page shows 2 other failure logs. Why would that be? @markwilkie, I grabbed the above error message from here. Where would the system get this test failure from? The process seems to be different from other test failure cases.

I do not fully understand how these are hooked up, but the Tests tab that you linked operates at the individual test level, whereas things like job failure are a single overall pass/fail. If we look at the Helix log, we see the text that was eventually copied to where you saw it. It might be as simple as text scanning of these logs, as the "running test" "passed test" "failed test" patterns are fairly simple, or it might be integration with the test harness. Regardless, we can see that the script goes wrong when it prints END EXECUTION - PASSED PASSED. Fixing that, however it works, should then trigger the Helix/ADO job failure logic.

@markples
Member

I don't think we're going to get a repro for this. Looking at the outerloop failures, it passed at 5bd322b, failed at the very next commit a272954, and then passed several commits later at 9d44b9b. It also failed in the specific runtime-coreclr r2r pipeline, again just once and at the same commit. However, commit a272954 looks pretty safe.

Timestamps for job starts for the above:
outerloop:
pass Sun 5pm
fail Mon 1am
pass Mon 10am
r2r:
pass Sat 9pm
fail Sun 9pm
pass Mon 9pm

Of course, it's not surprising that two failures in rolling jobs for the same commit happened at roughly the same time, but this feels like something happened at that time (problem in a repo dependency seems unlikely, maybe a hardware problem).

I created a fake PR back at that point and launched the r2r job. If this passes, I think we should just close this and see if it happens again. If it fails, then we try harder to repro at precisely that commit.
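
The pass/fail/pass pattern above can be checked mechanically. The sketch below is only an illustration with a hypothetical helper name; it flags failures that have a passing run directly before and after them, using the job results listed above in start-time order.

```python
def isolated_failures(results):
    """Return indices of failures with a pass directly before and after."""
    return [
        i
        for i in range(1, len(results) - 1)
        if results[i] == "fail"
        and results[i - 1] == "pass"
        and results[i + 1] == "pass"
    ]


# Job results in start-time order, as listed above.
runs = {
    "outerloop": ["pass", "fail", "pass"],  # Sun 5pm, Mon 1am, Mon 10am
    "r2r": ["pass", "fail", "pass"],        # Sat 9pm, Sun 9pm, Mon 9pm
}

for pipeline, results in runs.items():
    print(pipeline, isolated_failures(results))
```

An isolated failure in both pipelines is consistent with a transient problem, but it does not by itself rule out a product bug.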

@trylek
Member

trylek commented Jan 26, 2023

@JulieLeeMSFT - For the problem regarding "Send to Helix" not reporting the merged JIT/Methodical work item as failed, I think this is likely a bug in the generated merged test wrapper: it probably returns 0 when it should return a non-zero exit code if any of the component tests have failed. Adding @jkoritzinsky to confirm.

For the actual bug in sizeof64_Target_64Bit_and_arm.cmd, I'm trying to repro that locally. At first glance it looks like an OOM or an invalid allocation size. As you can easily imagine, running the component tests in-proc means that in exceptional cases the failure may only be reproducible when running the entire merged test set, not just a single test, as it may be caused by some interaction between the individual tests (e.g. excessive GC allocation or not shutting down a worker thread in an earlier test).
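
To make the expected wrapper behavior concrete, here is a minimal sketch of exit-code aggregation. It is not the generated merged test wrapper; the function name and the result format (test name mapped to a pass/fail boolean) are assumptions for illustration.

```python
def merged_exit_code(component_results):
    """Return 0 only when every component test passed, otherwise 1.

    component_results maps a test name to True (passed) or False (failed).
    """
    failed = [name for name, passed in component_results.items() if not passed]
    for name in failed:
        print(f"FAILED: {name}")
    return 1 if failed else 0
```

With this shape, a single failing component test is enough to make the whole work item report failure.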

@markples
Member

@trylek - There are a few more things worth noting:

  • sizeof_Target_64Bit_and_arm_il_r is marked RequiresProcessIsolation, so the machine state might be different when it is run, but I don't think the direct "many tests in-proc" concerns apply.
  • As Aaron noted, the stack trace appears to have two failures - something with resolveToken that tries to allocate an exception (which likely isn't supposed to happen but perhaps is intended to be caught?) and the array allocation that is internal to that. As above, I'm inclined to think that there is something strange going on beyond a product failure, but perhaps these are just more clues.

@trylek
Member

trylek commented Jan 26, 2023

Oh, you're right, architecture-conditional tests now need to be marked as external because the merged test wrapper generator doesn't have sufficient logic to deal with the variations. On top of that, it turns out that some of the conditional tests crash the JIT when built on a different architecture (even though the test would have been skipped at runtime). So if the test runs out-of-proc, I'm inclined to concur with your assessment that the problem is somewhere else. In particular, on arm64 machines it sometimes happens that a particular machine gets into a bad state, so we should check the names of the offending machines, which typically appear at the top of the Helix logs. That has helped us several times in the past to identify faulty machines and ask the dnceng team to take them off rotation, reimage them, or something else.

@markples
Member

markples commented Jan 26, 2023

Here's a list of a few recent unexplained arm/arm64 failures and machines: 4 machine IDs, same OS:

#81111

coreclr windows arm Checked @ Windows.11.Arm64.Open

  • JIT\Regression\JitBlue\GitHub_18362\GitHub_18362\GitHub_18362.cmd
  • JIT\Regression\JitBlue\GitHub_12037\GitHub_12037\GitHub_12037.cmd
  • JIT\Regression\JitBlue\GitHub_36905\GitHub_36905\GitHub_36905.cmd

Console log: 'JIT.Regression.JitBlue' from job 0c80c0ec-a8da-4207-a792-72d84339586a workitem 9683ca79-3792-47ce-9a71-d8d72c6c7877 (windows.11.arm64.open) executed on machine a000XJH running Windows-10-10.0.22621-SP0

#81120

R2R-CG2 windows arm64 Checked @ Windows.11.Arm64.Open

  • Regressions\coreclr\GitHub_22348\Test22348\Test22348.cmd

Console log: 'PayloadGroup0' from job 9fd227ea-dc82-44d9-a46c-dfc2c402d758 workitem 8c216dcd-e1a1-480b-bcc8-6d394a92f9b3 (windows.11.arm64.open) executed on machine a000UAW running Windows-10-10.0.22621-SP0

#81109

R2R-CG2 windows arm Checked @ Windows.11.Arm64.Open

  • JIT/Methodical/xxobj/sizeof/sizeof64_Target_64Bit_and_arm_il_r/sizeof64_Target_64Bit_and_arm.cmd

runtime-coreclr r2r

Console log: 'Methodical_r2' from job 14fe2e89-fac7-4f18-ac98-fe2998020e90 workitem e6185482-4265-4f91-a175-601d4d641aa7 (windows.11.arm64.open) executed on machine a000XZF running Windows-10-10.0.22621-SP0

runtime-coreclr outerloop

Console log: 'Methodical_r2' from job 120f293f-6f83-4e15-b8ce-53a1e9c364e4 workitem 84ad56c-22de-4bfd-80c2-24a78e759090 (windows.11.arm64.open) executed on machine a000YEX running Windows-10-10.0.22621-SP0
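
The machine comparison above can be sketched as a small script. The regular expression and helper name are assumptions based on the "executed on machine ... running ..." wording of the console log lines quoted in this comment; only the tail of each line is reproduced.

```python
import re
from collections import defaultdict

# Matches the tail of the Helix console log header lines quoted above.
MACHINE_RE = re.compile(r"executed on machine (\S+) running (\S+)")

log_lines = [
    "(windows.11.arm64.open) executed on machine a000XJH running Windows-10-10.0.22621-SP0",
    "(windows.11.arm64.open) executed on machine a000UAW running Windows-10-10.0.22621-SP0",
    "(windows.11.arm64.open) executed on machine a000XZF running Windows-10-10.0.22621-SP0",
    "(windows.11.arm64.open) executed on machine a000YEX running Windows-10-10.0.22621-SP0",
]


def machines_by_os(lines):
    """Group machine names by the OS string reported on the same line."""
    grouped = defaultdict(list)
    for line in lines:
        match = MACHINE_RE.search(line)
        if match:
            machine, os_name = match.groups()
            grouped[os_name].append(machine)
    return dict(grouped)


print(machines_by_os(log_lines))
```

Four distinct machine names under a single OS string is what the list above shows.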

@trylek
Member

trylek commented Jan 26, 2023

Hmm, that doesn't look machine-specific.

@markples markples removed blocking-outerloop Blocking the 'runtime-coreclr outerloop' and 'runtime-libraries-coreclr outerloop' runs untriaged New issue has not been triaged by the area owner labels Jan 26, 2023
@markples
Member

oops - sorry - same OS (I misread the workitem OS and "executed on" OS distinction when I saw two) - correcting above

@BruceForstall
Member

Looks like another case:

Interop\PInvoke\Delegate\DelegateTest\DelegateTest.cmd

https://dev.azure.com/dnceng-public/public/_build/results?buildId=171006&view=ms.vss-test-web.build-test-results-tab&runId=3424924&resultId=115182&paneView=debug

R2R-CG2 windows arm64 Checked no_tiered_compilation @ Windows.11.Arm64.Open

    Interop\PInvoke\Delegate\DelegateTest\DelegateTest.cmd [FAIL]
      Fatal error. Internal CLR error. (0x80131506)
         at System.GC.AllocateNewArray(IntPtr, Int32, GC_ALLOC_FLAGS)
         at System.Text.StringBuilder.ExpandByABlock(Int32)
         at System.Text.StringBuilder.Append(Char, Int32)
         at System.Text.StringBuilder.Append(Char)
         at System.Diagnostics.StackTrace.ToString(TraceFormat, System.Text.StringBuilder)
         at System.Diagnostics.StackTrace.ToString(TraceFormat)
         at System.Exception.get_StackTrace()
         at System.Exception.ToString()
         at Internal.JitInterface.CorInfoImpl.AllocException(System.Exception)
         at Internal.JitInterface.CorInfoImpl._resolveToken(IntPtr, IntPtr*, Internal.JitInterface.CORINFO_RESOLVED_TOKEN*)
         at Internal.JitInterface.CorInfoImpl.JitCompileMethod(IntPtr ByRef, IntPtr, IntPtr, IntPtr, Internal.JitInterface.CORINFO_METHOD_INFO ByRef, UInt32, IntPtr ByRef, UInt32 ByRef)
         at Internal.JitInterface.CorInfoImpl.JitCompileMethod(IntPtr ByRef, IntPtr, IntPtr, IntPtr, Internal.JitInterface.CORINFO_METHOD_INFO ByRef, UInt32, IntPtr ByRef, UInt32 ByRef)
         at Internal.JitInterface.CorInfoImpl.CompileMethodInternal(ILCompiler.DependencyAnalysis.IMethodNode, Internal.IL.MethodIL)
         at Internal.JitInterface.CorInfoImpl.CompileMethod(ILCompiler.DependencyAnalysis.ReadyToRun.MethodWithGCInfo, ILCompiler.Logger)
         at ILCompiler.ReadyToRunCodegenCompilation+<>c__DisplayClass46_0.<ComputeDependencyNodeDependencies>g__CompileOneMethod|5(ILCompiler.DependencyAnalysisFramework.DependencyNodeCore`1<ILCompiler.DependencyAnalysis.NodeFactory>, Int32)
         at ILCompiler.ReadyToRunCodegenCompilation+<>c__DisplayClass46_0.<ComputeDependencyNodeDependencies>g__CompileOnThread|4(Int32)
         at ILCompiler.ReadyToRunCodegenCompilation+<>c__DisplayClass46_0.<ComputeDependencyNodeDependencies>g__CompileMethodList|2(System.Collections.Generic.IEnumerable`1<ILCompiler.DependencyAnalysisFramework.DependencyNodeCore`1<ILCompiler.DependencyAnalysis.NodeFactory>>)
         at ILCompiler.ReadyToRunCodegenCompilation.ComputeDependencyNodeDependencies(System.Collections.Generic.List`1<ILCompiler.DependencyAnalysisFramework.DependencyNodeCore`1<ILCompiler.DependencyAnalysis.NodeFactory>>)
         at ILCompiler.DependencyAnalysisFramework.DependencyAnalyzer`2[[ILCompiler.DependencyAnalysisFramework.NoLogStrategy`1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], ILCompiler.DependencyAnalysisFramework, Version=8.0.0.0, Culture=neutral, PublicKeyToken=null],[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].ComputeMarkedNodes()
         at ILCompiler.ReadyToRunCodegenCompilation.Compile(System.String)
         at ILCompiler.Program.RunSingleCompilation(System.Collections.Generic.Dictionary`2<System.String,System.String>, ILCompiler.InstructionSetSupport, System.String, System.Collections.Generic.Dictionary`2<System.String,System.String>, System.Collections.Generic.HashSet`1<Internal.TypeSystem.ModuleDesc>, ILCompiler.ReadyToRunCompilerContext)
         at ILCompiler.Program.Run()
         at ILCompiler.Crossgen2RootCommand+<>c__DisplayClass187_0.<.ctor>b__0(System.CommandLine.Invocation.InvocationContext)
         at System.CommandLine.Invocation.AnonymousCommandHandler.Invoke(System.CommandLine.Invocation.InvocationContext)
         at System.CommandLine.Invocation.InvocationPipeline+<>c__DisplayClass4_0+<<BuildInvocationChain>b__0>d.MoveNext()
         at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[System.CommandLine.Invocation.InvocationPipeline+<>c__DisplayClass4_0+<<BuildInvocationChain>b__0>d, System.CommandLine, Version=2.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]](<<BuildInvocationChain>b__0>d ByRef)
         at System.CommandLine.Invocation.InvocationPipeline+<>c__DisplayClass4_0.<BuildInvocationChain>b__0(System.CommandLine.Invocation.InvocationContext, System.Func`2<System.CommandLine.Invocation.InvocationContext,System.Threading.Tasks.Task>)
         at System.CommandLine.CommandLineBuilderExtensions+<>c__DisplayClass16_0+<<UseParseErrorReporting>b__0>d.MoveNext()
         at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[System.CommandLine.CommandLineBuilderExtensions+<>c__DisplayClass16_0+<<UseParseErrorReporting>b__0>d, System.CommandLine, Version=2.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]](<<UseParseErrorReporting>b__0>d ByRef)
         at System.CommandLine.CommandLineBuilderExtensions+<>c__DisplayClass16_0.<UseParseErrorReporting>b__0(System.CommandLine.Invocation.InvocationContext, System.Func`2<System.CommandLine.Invocation.InvocationContext,System.Threading.Tasks.Task>)
         at System.CommandLine.CommandLineBuilderExtensions+<>c__DisplayClass11_0+<<UseHelp>b__0>d.MoveNext()
         at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[System.CommandLine.CommandLineBuilderExtensions+<>c__DisplayClass11_0+<<UseHelp>b__0>d, System.CommandLine, Version=2.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]](<<UseHelp>b__0>d ByRef)
         at System.CommandLine.CommandLineBuilderExtensions+<>c__DisplayClass11_0.<UseHelp>b__0(System.CommandLine.Invocation.InvocationContext, System.Func`2<System.CommandLine.Invocation.InvocationContext,System.Threading.Tasks.Task>)
         at System.CommandLine.CommandLineBuilderExtensions+<>c__DisplayClass22_0+<<UseVersionOption>b__0>d.MoveNext()
         at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[System.CommandLine.CommandLineBuilderExtensions+<>c__DisplayClass22_0+<<UseVersionOption>b__0>d, System.CommandLine, Version=2.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]](<<UseVersionOption>b__0>d ByRef)
         at System.CommandLine.CommandLineBuilderExtensions+<>c__DisplayClass22_0.<UseVersionOption>b__0(System.CommandLine.Invocation.InvocationContext, System.Func`2<System.CommandLine.Invocation.InvocationContext,System.Threading.Tasks.Task>)
         at System.CommandLine.Invocation.InvocationPipeline.<Invoke>g__FullInvocationChain|3_0(System.CommandLine.Invocation.InvocationContext)
         at System.CommandLine.Invocation.InvocationPipeline.Invoke(System.CommandLine.IConsole)
         at ILCompiler.Program.Main(System.String[])
      
      Return code:      1
      Raw output file:      C:\h\w\B74F09AC\w\AB8B0995\uploads\Reports\Interop.PInvoke\Delegate\DelegateTest\DelegateTest.output.txt
      Raw output:
      BEGIN EXECUTION
      DelegateTest.dll
      DelegateTestNative.dll
      TestLibrary.dll
              3 file(s) copied.
      10:30:55.78
      Response file: C:\h\w\B74F09AC\w\AB8B0995\e\Interop\PInvoke\Delegate\DelegateTest\DelegateTest.dll.rsp
      C:\h\w\B74F09AC\w\AB8B0995\e\Interop\PInvoke\Delegate\DelegateTest\IL-CG2\DelegateTest.dll
      -o:C:\h\w\B74F09AC\w\AB8B0995\e\Interop\PInvoke\Delegate\DelegateTest\DelegateTest.dll
      --targetarch:arm64
      --targetos:windows
      --verify-type-and-field-layout
      --method-layout:random
      -r:C:\h\w\B74F09AC\p\System.*.dll
      -r:C:\h\w\B74F09AC\p\Microsoft.*.dll
      -r:C:\h\w\B74F09AC\p\mscorlib.dll
      -r:C:\h\w\B74F09AC\p\netstandard.dll
      -O
      " "dotnet" "C:\h\w\B74F09AC\p\crossgen2\crossgen2.dll" @"C:\h\w\B74F09AC\w\AB8B0995\e\Interop\PInvoke\Delegate\DelegateTest\DelegateTest.dll.rsp"   -r:C:\h\w\B74F09AC\w\AB8B0995\e\Interop\PInvoke\Delegate\DelegateTest\IL-CG2\*.dll"
      Emitting R2R PE file: C:\h\w\B74F09AC\w\AB8B0995\e\Interop\PInvoke\Delegate\DelegateTest\DelegateTest.dll
      " "dotnet" "C:\h\w\B74F09AC\p\r2rdump\r2rdump.dll" --header --sc --in C:\h\w\B74F09AC\w\AB8B0995\e\Interop\PInvoke\Delegate\DelegateTest\DelegateTest.dll --out C:\h\w\B74F09AC\w\AB8B0995\e\Interop\PInvoke\Delegate\DelegateTest\DelegateTest.dll.r2rdump --val"
      10:30:57.99
      10:30:57.99
      Response file: C:\h\w\B74F09AC\w\AB8B0995\e\Interop\PInvoke\Delegate\DelegateTest\DelegateTestNative.dll.rsp
      C:\h\w\B74F09AC\w\AB8B0995\e\Interop\PInvoke\Delegate\DelegateTest\IL-CG2\DelegateTestNative.dll
      -o:C:\h\w\B74F09AC\w\AB8B0995\e\Interop\PInvoke\Delegate\DelegateTest\DelegateTestNative.dll
      --targetarch:arm64
      --targetos:windows
      --verify-type-and-field-layout
      --method-layout:random
      -r:C:\h\w\B74F09AC\p\System.*.dll
      -r:C:\h\w\B74F09AC\p\Microsoft.*.dll
      -r:C:\h\w\B74F09AC\p\mscorlib.dll
      -r:C:\h\w\B74F09AC\p\netstandard.dll
      -O
      " "dotnet" "C:\h\w\B74F09AC\p\crossgen2\crossgen2.dll" @"C:\h\w\B74F09AC\w\AB8B0995\e\Interop\PInvoke\Delegate\DelegateTest\DelegateTestNative.dll.rsp"   -r:C:\h\w\B74F09AC\w\AB8B0995\e\Interop\PInvoke\Delegate\DelegateTest\IL-CG2\*.dll"
      No input files are loadable
      10:30:58.61
      10:30:58.61
      Response file: C:\h\w\B74F09AC\w\AB8B0995\e\Interop\PInvoke\Delegate\DelegateTest\TestLibrary.dll.rsp
      C:\h\w\B74F09AC\w\AB8B0995\e\Interop\PInvoke\Delegate\DelegateTest\IL-CG2\TestLibrary.dll
      -o:C:\h\w\B74F09AC\w\AB8B0995\e\Interop\PInvoke\Delegate\DelegateTest\TestLibrary.dll
      --targetarch:arm64
      --targetos:windows
      --verify-type-and-field-layout
      --method-layout:random
      -r:C:\h\w\B74F09AC\p\System.*.dll
      -r:C:\h\w\B74F09AC\p\Microsoft.*.dll
      -r:C:\h\w\B74F09AC\p\mscorlib.dll
      -r:C:\h\w\B74F09AC\p\netstandard.dll
      -O
      " "dotnet" "C:\h\w\B74F09AC\p\crossgen2\crossgen2.dll" @"C:\h\w\B74F09AC\w\AB8B0995\e\Interop\PInvoke\Delegate\DelegateTest\TestLibrary.dll.rsp"   -r:C:\h\w\B74F09AC\w\AB8B0995\e\Interop\PInvoke\Delegate\DelegateTest\IL-CG2\*.dll"
      10:31:10.51
      Crossgen2 failed with exitcode - -1073741819
      Test Harness Exitcode is : 1

@BruceForstall BruceForstall added the blocking-outerloop Blocking the 'runtime-coreclr outerloop' and 'runtime-libraries-coreclr outerloop' runs label Feb 14, 2023
@BruceForstall
Member

@markples @trylek Since this happened in an outerloop run, I'm re-adding the "blocking-outerloop" label

@BruceForstall BruceForstall removed the Priority:2 Work that is important, but not critical for the release label Feb 14, 2023
@BruceForstall
Member

Another:

Interop\UnmanagedCallConv\UnmanagedCallConvTest\UnmanagedCallConvTest.cmd

R2R-CG2 windows arm64 Checked no_tiered_compilation @ Windows.11.Arm64.Open

https://dev.azure.com/dnceng-public/public/_build/results?buildId=170991&view=ms.vss-test-web.build-test-results-tab&runId=3424615&paneView=debug&resultId=115544

@JulieLeeMSFT
Member Author

Ping @jkoritzinsky.

@JulieLeeMSFT - For the problem regarding "Send to Helix" not reporting the merged JIT/Methodical work item as failed, I think this is likely a bug in the generated merged test wrapper: it probably returns 0 when it should return a nonzero exit code if any of the component tests have failed. Adding @jkoritzinsky to confirm.

@jkoritzinsky
Member

We specifically have the merged wrapper succeed even if a test fails because we don't want the "Work Item Failure" entry for a test failure, only when the job crashed or didn't run to completion. This is the expected behavior for Helix.
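
The policy described above can be sketched roughly as follows. This is an illustration only, not the actual generated wrapper: `TestResult`, `wrapper_exit_code`, and `failed_tests` are hypothetical names, and the real wrapper may report results differently.

```python
from dataclasses import dataclass


@dataclass
class TestResult:
    name: str
    passed: bool


def wrapper_exit_code(results: list[TestResult], ran_to_completion: bool) -> int:
    """Exit code for a merged wrapper under the policy described in the thread.

    Individual test failures do not change the exit code; they are assumed to be
    reported separately as per-test results. Only a run that crashed or did not
    finish yields a nonzero code (hypothetical value 1).
    """
    if not ran_to_completion:
        return 1
    return 0


def failed_tests(results: list[TestResult]) -> list[str]:
    """Names of failed tests, which would appear as individual entries in the test results."""
    return [r.name for r in results if not r.passed]


results = [TestResult("A", True), TestResult("B", False)]
print(wrapper_exit_code(results, ran_to_completion=True))   # 0: a test failed, but the work item completed
print(failed_tests(results))                                 # ['B']
print(wrapper_exit_code(results, ran_to_completion=False))  # 1: the work item itself failed
```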

@markples
Member

I think this is a change in behavior, isn't it? When I have the test monitor role, I look at the run list for a pipeline, and if recent instances have failed, I open them and look at the stages/jobs. Should I now be looking at the "tests" tab for specific tests and the job list for things like job crashes?

@jkoritzinsky
Member

The "tests" tab will have both crashes and test failures. Individual test failures will have the name of the test, and crashes that take down the whole work item will have a "WorkItemName Work Item Failure" item in addition to any recorded test failures. I've been using this workflow for years for both the runtime and libraries test trees.

@trylek
Member

trylek commented Feb 14, 2023

@jkoritzinsky - I think the concern from Julie that I responded to stemmed from the fact that for non-merged tests you see this bit around "Helix work item failure blah blah blah and here is the log" in the "Send to Helix" phase of the runtime test run jobs; apparently we're not seeing the equivalent thing for the merged tests. If it's not about the exit code, what is causing the difference?

@BruceForstall
Member

Another case:

baseservices\invalid_operations\InvalidOperations\InvalidOperations.cmd

R2R-CG2 windows arm Checked @ Windows.11.Arm64.Open

https://dev.azure.com/dnceng-public/public/_build/results?buildId=174073&view=ms.vss-test-web.build-test-results-tab&runId=3460010&paneView=debug&resultId=100024

@BruceForstall
Member

Several new occurrences, e.g.:

CoreMangLib\system\enum\EnumIConvertibleToUint16\EnumIConvertibleToUint16.cmd

R2R-CG2 windows arm64 Checked jitstressregs0x10 @ Windows.11.Arm64.Open

https://dev.azure.com/dnceng-public/public/_build/results?buildId=183267&view=ms.vss-test-web.build-test-results-tab&runId=3560173&paneView=debug&resultId=121383

    CoreMangLib\system\enum\EnumIConvertibleToUint16\EnumIConvertibleToUint16.cmd [FAIL]
      Fatal error. Internal CLR error. (0x80131506)
         at System.GC.AllocateNewArray(IntPtr, Int32, GC_ALLOC_FLAGS)
         at System.Text.StringBuilder.ExpandByABlock(Int32)
         at System.Text.StringBuilder.Append(Char, Int32)
         at System.Text.StringBuilder.Append(Char)
         at System.Diagnostics.StackTrace.ToString(TraceFormat, System.Text.StringBuilder)
         at System.Diagnostics.StackTrace.ToString(TraceFormat)
         at System.Exception.get_StackTrace()
         at System.Exception.ToString()
         at Internal.JitInterface.CorInfoImpl.AllocException(System.Exception)
         at Internal.JitInterface.CorInfoImpl._resolveToken(IntPtr, IntPtr*, Internal.JitInterface.CORINFO_RESOLVED_TOKEN*)
         at Internal.JitInterface.CorInfoImpl.JitCompileMethod(IntPtr ByRef, IntPtr, IntPtr, IntPtr, Internal.JitInterface.CORINFO_METHOD_INFO ByRef, UInt32, IntPtr ByRef, UInt32 ByRef)
         at Internal.JitInterface.CorInfoImpl.JitCompileMethod(IntPtr ByRef, IntPtr, IntPtr, IntPtr, Internal.JitInterface.CORINFO_METHOD_INFO ByRef, UInt32, IntPtr ByRef, UInt32 ByRef)
         at Internal.JitInterface.CorInfoImpl.CompileMethodInternal(ILCompiler.DependencyAnalysis.IMethodNode, Internal.IL.MethodIL)
         at Internal.JitInterface.CorInfoImpl.CompileMethod(ILCompiler.DependencyAnalysis.ReadyToRun.MethodWithGCInfo, ILCompiler.Logger)
         at ILCompiler.ReadyToRunCodegenCompilation+<>c__DisplayClass46_0.<ComputeDependencyNodeDependencies>g__CompileOneMethod|5(ILCompiler.DependencyAnalysisFramework.DependencyNodeCore`1<ILCompiler.DependencyAnalysis.NodeFactory>, Int32)
         at ILCompiler.ReadyToRunCodegenCompilation+<>c__DisplayClass46_0.<ComputeDependencyNodeDependencies>g__CompileOnThread|4(Int32)
         at ILCompiler.ReadyToRunCodegenCompilation+<>c__DisplayClass46_0.<ComputeDependencyNodeDependencies>g__CompileMethodList|2(System.Collections.Generic.IEnumerable`1<ILCompiler.DependencyAnalysisFramework.DependencyNodeCore`1<ILCompiler.DependencyAnalysis.NodeFactory>>)
         at ILCompiler.ReadyToRunCodegenCompilation.ComputeDependencyNodeDependencies(System.Collections.Generic.List`1<ILCompiler.DependencyAnalysisFramework.DependencyNodeCore`1<ILCompiler.DependencyAnalysis.NodeFactory>>)
         at ILCompiler.DependencyAnalysisFramework.DependencyAnalyzer`2[[ILCompiler.DependencyAnalysisFramework.NoLogStrategy`1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], ILCompiler.DependencyAnalysisFramework, Version=8.0.0.0, Culture=neutral, PublicKeyToken=null],[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].ComputeMarkedNodes()
         at ILCompiler.ReadyToRunCodegenCompilation.Compile(System.String)
         at ILCompiler.Program.RunSingleCompilation(System.Collections.Generic.Dictionary`2<System.String,System.String>, ILCompiler.InstructionSetSupport, System.String, System.Collections.Generic.Dictionary`2<System.String,System.String>, System.Collections.Generic.HashSet`1<Internal.TypeSystem.ModuleDesc>, ILCompiler.ReadyToRunCompilerContext)
         at ILCompiler.Program.Run()
         at ILCompiler.Crossgen2RootCommand+<>c__DisplayClass187_0.<.ctor>b__0(System.CommandLine.Invocation.InvocationContext)
         at System.CommandLine.Invocation.AnonymousCommandHandler.Invoke(System.CommandLine.Invocation.InvocationContext)
         at System.CommandLine.Invocation.InvocationPipeline+<>c__DisplayClass4_0+<<BuildInvocationChain>b__0>d.MoveNext()
         at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[System.CommandLine.Invocation.InvocationPipeline+<>c__DisplayClass4_0+<<BuildInvocationChain>b__0>d, System.CommandLine, Version=2.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]](<<BuildInvocationChain>b__0>d ByRef)
         at System.CommandLine.Invocation.InvocationPipeline+<>c__DisplayClass4_0.<BuildInvocationChain>b__0(System.CommandLine.Invocation.InvocationContext, System.Func`2<System.CommandLine.Invocation.InvocationContext,System.Threading.Tasks.Task>)
         at System.CommandLine.CommandLineBuilderExtensions+<>c__DisplayClass16_0+<<UseParseErrorReporting>b__0>d.MoveNext()
         at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[System.CommandLine.CommandLineBuilderExtensions+<>c__DisplayClass16_0+<<UseParseErrorReporting>b__0>d, System.CommandLine, Version=2.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]](<<UseParseErrorReporting>b__0>d ByRef)
         at System.CommandLine.CommandLineBuilderExtensions+<>c__DisplayClass16_0.<UseParseErrorReporting>b__0(System.CommandLine.Invocation.InvocationContext, System.Func`2<System.CommandLine.Invocation.InvocationContext,System.Threading.Tasks.Task>)
         at System.CommandLine.CommandLineBuilderExtensions+<>c__DisplayClass11_0+<<UseHelp>b__0>d.MoveNext()
         at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[System.CommandLine.CommandLineBuilderExtensions+<>c__DisplayClass11_0+<<UseHelp>b__0>d, System.CommandLine, Version=2.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]](<<UseHelp>b__0>d ByRef)
         at System.CommandLine.CommandLineBuilderExtensions+<>c__DisplayClass11_0.<UseHelp>b__0(System.CommandLine.Invocation.InvocationContext, System.Func`2<System.CommandLine.Invocation.InvocationContext,System.Threading.Tasks.Task>)
         at System.CommandLine.CommandLineBuilderExtensions+<>c__DisplayClass22_0+<<UseVersionOption>b__0>d.MoveNext()
         at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[System.CommandLine.CommandLineBuilderExtensions+<>c__DisplayClass22_0+<<UseVersionOption>b__0>d, System.CommandLine, Version=2.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35]](<<UseVersionOption>b__0>d ByRef)
         at System.CommandLine.CommandLineBuilderExtensions+<>c__DisplayClass22_0.<UseVersionOption>b__0(System.CommandLine.Invocation.InvocationContext, System.Func`2<System.CommandLine.Invocation.InvocationContext,System.Threading.Tasks.Task>)
         at System.CommandLine.Invocation.InvocationPipeline.<Invoke>g__FullInvocationChain|3_0(System.CommandLine.Invocation.InvocationContext)
         at System.CommandLine.Invocation.InvocationPipeline.Invoke(System.CommandLine.IConsole)
         at ILCompiler.Program.Main(System.String[])

@JulieLeeMSFT
Member Author

@trylek another 6 occurrences as listed in #83407.

One of them: runtime-coreclr outerloop 20230312.3

Failed test:

R2R-CG2 windows arm Checked no_tiered_compilation @ Windows.11.Arm64.Open
- Interop\PInvoke\CriticalHandles\StructTest\StructTest\StructTest.cmd

@trylek
Member

trylek commented Mar 14, 2023

@jkotas commented on this yesterday in the issue thread

#77820

Apparently the problem is understood: it is caused by a race condition that has since been fixed, but we need to roll forward to SDK Preview 2 as the LKG version used for executing Crossgen2 to get this fixed completely. As the cause of the crash is now understood, I'll put up a PR fixing the primary cause of the exception (a missing reference to xunit assemblies when running Crossgen2 to build the tests), and that should mitigate the problem. (I didn't want to fix this before the crash was understood, as otherwise the fixed references would basically hide this error.)

@JulieLeeMSFT
Member Author

@trylek, you merged #83413. Are we waiting for an additional fix to close this issue?

@trylek
Member

trylek commented Apr 4, 2023

@JulieLeeMSFT - my change made Crossgen2 throw much fewer exceptions, so we no longer see the occasional exception handling failures on arm64. According to my understanding of @jkotas' explanation, the underlying problem is a race condition in exception handling that should be fixed by rolling forward to SDK Preview 2 as the LKG version used by the runtime repo. In my runtime repo clone from earlier today,

dotnet --version

still yields

8.0.100-preview.1.23115.2

so I suspect more work may be needed to fully fix this.

@markples
Member

markples commented Apr 7, 2023

Here's a direct link to dotnet/runtime main global.json to easily check the sdk version:
https://github.com/dotnet/runtime/blob/main/global.json
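
For anyone who prefers to check this from a script, here is a minimal Python sketch that reads the SDK version out of global.json content. It assumes the standard `sdk.version` field; the helper names and the sample JSON string are illustrative, and the version string is the one quoted earlier in this thread, not the current contents of the file.

```python
import json
import re
from typing import Optional


def sdk_version_from_global_json(text: str) -> str:
    """Return the value of the "sdk"."version" field from global.json text."""
    return json.loads(text)["sdk"]["version"]


def preview_number(version: str) -> Optional[int]:
    """Extract N from a '-preview.N.' segment, or None if the version has no such segment."""
    match = re.search(r"-preview\.(\d+)\.", version)
    return int(match.group(1)) if match else None


# Sample content only; in practice, read the text from the repo's global.json.
sample = '{"sdk": {"version": "8.0.100-preview.1.23115.2"}}'
version = sdk_version_from_global_json(sample)
print(version)                  # 8.0.100-preview.1.23115.2
print(preview_number(version))  # 1
```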

@JulieLeeMSFT JulieLeeMSFT added this to the 8.0.0 milestone Apr 13, 2023
@markples
Member

@trylek - we appear to be on preview.3 now. In your comment "much fewer exceptions", did you mean that we're down to the expected number in the tests, or that there are still ones in the same flavor as the ones that you fixed? (In other words, did "more work" only mean waiting for preview.3 to appear so that we can close this now, or is there something else?) Thanks!

@trylek
Member

trylek commented Apr 17, 2023

@markples - I believe this can be closed now. My original mitigation just reduced the number of exceptions internally thrown and caught during Crossgen2 compilation and so reduced the repro rate of this non-deterministic race condition. As we're now at Preview 3 containing the proper fix for the race condition according to JanK's explanation, we should be good now.
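
To illustrate why throwing fewer exceptions lowers the repro rate without fixing the race, here is a rough Python sketch. It assumes, purely for illustration, that each thrown exception independently hits the race with the same small probability `p`; the values of `p` and `n` below are made up and not measured from Crossgen2.

```python
def repro_probability(p: float, n: int) -> float:
    """Chance that at least one of n independent exceptions hits the race.

    p is the (hypothetical) per-exception probability of hitting the race.
    """
    return 1.0 - (1.0 - p) ** n


p = 1e-4  # made-up per-exception probability
for n in (10_000, 100, 0):
    print(n, repro_probability(p, n))
```

Under this model the probability shrinks as `n` drops but only reaches zero when `n` is zero or `p` is zero, which matches the distinction in the thread between the mitigation (fewer exceptions) and the proper fix (removing the race).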

@trylek trylek closed this as completed Apr 17, 2023
@ghost ghost locked as resolved and limited conversation to collaborators May 17, 2023