ComInterfaceGenerator.Unit.Tests.WorkItemExecution failing with Internal CLR error on Win x86 #77087

Closed
jkotas opened this issue Oct 15, 2022 · 6 comments · Fixed by #77577
Labels
area-GC-coreclr blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' bug Known Build Error Use this to report build issues in the .NET Helix tab tenet-reliability Reliability/stability related issue (stress, load problems, etc.)

Comments

@jkotas
Member

jkotas commented Oct 15, 2022

  Discovering: ComInterfaceGenerator.Unit.Tests (method display = ClassAndMethod, method display options = None)
  Discovered:  ComInterfaceGenerator.Unit.Tests (found 10 test cases)
  Starting:    ComInterfaceGenerator.Unit.Tests (parallel test collections = on, max threads = 2)
Fatal error. Internal CLR error. (0x80131506)

Build Information

Build: https://dev.azure.com/dnceng-public/cbb18261-c48f-4abb-8651-8cdcb5474649/_build/results?buildId=53751
Build error leg or test failing: ComInterfaceGenerator.Unit.Tests.WorkItemExecution
Pull request: #77080
Log: https://dev.azure.com/dnceng-public/public/_build/results?buildId=53751&view=ms.vss-test-web.build-test-results-tab&runId=1099832&resultId=206563&paneView=dotnet-dnceng.dnceng-build-release-tasks.helix-test-information-tab

{
  "ErrorMessage": "Fatal error. Internal CLR error. (0x80131506)",
  "BuildRetry": false
}

Report

| Build | Definition | Test | Pull Request |
| --- | --- | --- | --- |
| 68007 | dotnet/runtime | LibraryImportGenerator.Unit.Tests.WorkItemExecution | #77594 |
| 67915 | dotnet/runtime | LibraryImportGenerator.Unit.Tests.WorkItemExecution | #77595 |
| 67410 | dotnet/runtime | System.Runtime.Tests.WorkItemExecution | #77649 |
| 67272 | dotnet/runtime | System.Collections.Tests.WorkItemExecution | |
| 67238 | dotnet/runtime | Microsoft.Extensions.Logging.Generators.Roslyn3.11.Tests.WorkItemExecution | #77563 |
| 67132 | dotnet/runtime | LibraryImportGenerator.Unit.Tests.WorkItemExecution | #77196 |
| 66591 | dotnet/runtime | System.Diagnostics.PerformanceCounter.Tests.WorkItemExecution | |
| 66272 | dotnet/runtime | LibraryImportGenerator.Unit.Tests.WorkItemExecution | #77582 |
| 66111 | dotnet/runtime | LibraryImportGenerator.Unit.Tests.WorkItemExecution | #77595 |
| 64947 | dotnet/runtime | LibraryImportGenerator.Unit.Tests.WorkItemExecution | #77522 |
| 65443 | dotnet/runtime | LibraryImportGenerator.Unit.Tests.WorkItemExecution | #77560 |
| 64934 | dotnet/runtime | System.Threading.Tasks.Tests.WorkItemExecution | #77449 |
| 64914 | dotnet/runtime | LibraryImportGenerator.Unit.Tests.WorkItemExecution | #77169 |
| 64782 | dotnet/runtime | System.Diagnostics.PerformanceCounter.Tests.WorkItemExecution | #77487 |
| 64783 | dotnet/runtime | ComInterfaceGenerator.Unit.Tests.WorkItemExecution | #77487 |
| 64185 | dotnet/runtime | System.Xml.Linq.Events.Tests.WorkItemExecution | #77499 |
| 64174 | dotnet/runtime | System.Text.RegularExpressions.Tests.WorkItemExecution | #77149 |
| 64009 | dotnet/runtime | System.Reflection.Metadata.Tests.WorkItemExecution | #77449 |
| 63786 | dotnet/runtime | System.Diagnostics.PerformanceCounter.Tests.WorkItemExecution | #77386 |
| 63754 | dotnet/runtime | LibraryImportGenerator.Unit.Tests.WorkItemExecution | #77261 |
| 62229 | dotnet/runtime | System.Private.Uri.Functional.Tests.WorkItemExecution | #76873 |
| 63668 | dotnet/runtime | System.Net.Primitives.Pal.Tests.WorkItemExecution | |
| 63007 | dotnet/runtime | System.Runtime.Tests.WorkItemExecution | #76803 |
| 62572 | dotnet/runtime | JSImportGenerator.Unit.Tests.WorkItemExecution | #77386 |
| 62538 | dotnet/runtime | System.Text.RegularExpressions.Tests.WorkItemExecution | #77169 |
| 62446 | dotnet/runtime | LibraryImportGenerator.Unit.Tests.WorkItemExecution | #76461 |
| 62315 | dotnet/runtime | LibraryImportGenerator.Unit.Tests.WorkItemExecution | #77261 |
| 62055 | dotnet/runtime | LibraryImportGenerator.Unit.Tests.WorkItemExecution | #77425 |
| 61588 | dotnet/runtime | System.Text.RegularExpressions.Tests.WorkItemExecution | #69911 |
| 61384 | dotnet/runtime | LibraryImportGenerator.Unit.Tests.WorkItemExecution | #74157 |
| 60884 | dotnet/runtime | ComInterfaceGenerator.Unit.Tests.WorkItemExecution | #77181 |
| 60534 | dotnet/runtime | LibraryImportGenerator.Unit.Tests.WorkItemExecution | #77360 |

Summary

| 24-Hour Hit Count | 7-Day Hit Count | 1-Month Count |
| --- | --- | --- |
| 3 | 30 | 32 |

@jkotas jkotas added blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' Known Build Error Use this to report build issues in the .NET Helix tab labels Oct 15, 2022
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Oct 15, 2022
@jkotas jkotas added area-System.Runtime.InteropServices and removed untriaged New issue has not been triaged by the area owner labels Oct 15, 2022
@ghost

ghost commented Oct 15, 2022

Tagging subscribers to this area: @dotnet/interop-contrib
See info in area-owners.md if you want to be subscribed.


@jkotas
Member Author

jkotas commented Oct 15, 2022

Stacktrace of the crash:

coreclr!WKS::gc_heap::mark_array_marked+0x1a [D:\a\_work\1\s\src\coreclr\gc\gc.cpp @ 8471] 
coreclr!WKS::gc_heap::background_mark1+0x1a [D:\a\_work\1\s\src\coreclr\gc\gc.cpp @ 23743] 
coreclr!WKS::gc_heap::background_mark_simple+0x1b [D:\a\_work\1\s\src\coreclr\gc\gc.cpp @ 25142] 
coreclr!WKS::gc_heap::background_mark_object+0x22 [D:\a\_work\1\s\src\coreclr\gc\gc.cpp @ 25162] 
coreclr!WKS::gc_heap::revisit_written_page+0x3c8 [D:\a\_work\1\s\src\coreclr\gc\gc.cpp @ 35631] 
coreclr!WKS::gc_heap::revisit_written_pages+0xf1 [D:\a\_work\1\s\src\coreclr\gc\gc.cpp @ 35832] 
coreclr!WKS::gc_heap::background_mark_phase+0x1b2 [D:\a\_work\1\s\src\coreclr\gc\gc.cpp @ 35093] 
coreclr!WKS::gc_heap::gc1+0x49e [D:\a\_work\1\s\src\coreclr\gc\gc.cpp @ 21184] 
coreclr!WKS::gc_heap::bgc_thread_function+0x5a [D:\a\_work\1\s\src\coreclr\gc\gc.cpp @ 36283] 
coreclr!WKS::gc_heap::bgc_thread_stub+0x1f [D:\a\_work\1\s\src\coreclr\gc\gc.cpp @ 34225] 
0:014> !verifyheap
No heap corruption detected.

@ghost

ghost commented Oct 15, 2022

Tagging subscribers to this area: @dotnet/gc
See info in area-owners.md if you want to be subscribed.


@jkotas
Member Author

jkotas commented Oct 15, 2022

The problem is that the C++ compiler optimizer decided to duplicate the memory load.

The code (after inlining) looks like this:

uint8_t* o = *po;
if ((o >= background_saved_lowest_address) && (o < background_saved_highest_address))
{
   ...

It got transformed to this:

if ((*po >= background_saved_lowest_address) && (*po < background_saved_highest_address))
{
    uint8_t* o = *po;
   ...

The crash is caused by *po being changed to a null pointer on the foreground thread between the point where the bounds check is executed and the rest of the method.
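
To make that window concrete, here is a small deterministic model of the two code shapes. It is only a sketch: `RacingSlot`, `CheckResult`, `single_load` and `duplicated_load` are hypothetical names that do not appear in gc.cpp, and addresses are modeled as plain integers so the example runs on a single thread.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical model of a slot whose value changes between reads,
// standing in for *po being cleared by the foreground thread.
struct RacingSlot
{
    std::uintptr_t values[3]; // value seen by the 1st, 2nd and 3rd load
    std::size_t reads;        // number of loads performed so far

    std::uintptr_t load()
    {
        std::uintptr_t v = values[reads < 3 ? reads : 2];
        ++reads;
        return v;
    }
};

struct CheckResult
{
    bool passed;      // did the bounds check succeed?
    std::uintptr_t o; // value the rest of the method would use
};

// Shape of the source code: one load, then the check on the local copy.
inline CheckResult single_load(RacingSlot& slot, std::uintptr_t low, std::uintptr_t high)
{
    std::uintptr_t o = slot.load();
    CheckResult r = { (o >= low) && (o < high), o };
    return r;
}

// Shape after the optimizer duplicated the load: every use reads the slot again.
inline CheckResult duplicated_load(RacingSlot& slot, std::uintptr_t low, std::uintptr_t high)
{
    if ((slot.load() >= low) && (slot.load() < high))
    {
        CheckResult r = { true, slot.load() };
        return r;
    }
    CheckResult r = { false, 0 };
    return r;
}
```

With a slot that reads as an in-range address first and as zero afterwards, `single_load` keeps using the in-range value, while `duplicated_load` passes the check (zero is below the upper bound) and then continues with a zero `o`.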

Duplicating memory loads like this is an allowed optimization in C++. The fix should be to sprinkle VolatileLoadWithoutBarrier in all places where this sort of pattern occurs.
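
A minimal sketch of what such a fix could look like at one call site. `load_once` is a hypothetical stand-in written for this example, on the assumption that the runtime's VolatileLoadWithoutBarrier amounts to a single volatile read with no fence; `read_if_in_range` is likewise an illustrative name, not a function in gc.cpp.

```cpp
#include <cstdint>

using std::uint8_t;

// Hypothetical stand-in for VolatileLoadWithoutBarrier (assumption):
// reads *p through a volatile-qualified pointer so the compiler emits
// exactly one load and cannot re-read the location later. No fence is issued.
template <typename T>
inline T load_once(T const* p)
{
    return *static_cast<T const volatile*>(p);
}

// Returns the loaded pointer if it lies in [low, high), otherwise nullptr.
// Both comparisons and the returned value use the same local copy `o`.
inline uint8_t* read_if_in_range(uint8_t** po, uint8_t* low, uint8_t* high)
{
    uint8_t* o = load_once(po);
    if ((o >= low) && (o < high))
    {
        return o;
    }
    return nullptr;
}
```

Because the read goes through a volatile access, the optimizer has to treat it as a single observable load, so the value that passed the bounds check is the value the rest of the method sees.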

@jkotas jkotas added bug tenet-reliability Reliability/stability related issue (stress, load problems, etc.) labels Oct 15, 2022
@jkotas
Member Author

jkotas commented Oct 15, 2022

Disassembly for reference:

735782d6 8b18            mov     ebx,dword ptr [eax]
735782d8 3b1d44a78873    cmp     ebx,dword ptr [coreclr!WKS::gc_heap::background_saved_lowest_address (7388a744)]
735782de 8b5de4          mov     ebx,dword ptr [ebp-1Ch]
735782e1 721e            jb      coreclr!WKS::gc_heap::revisit_written_page+0x3d1 (73578301)
735782e3 8b18            mov     ebx,dword ptr [eax] <- this is the duplicated memory load
735782e5 3b1d50a78873    cmp     ebx,dword ptr [coreclr!WKS::gc_heap::background_saved_highest_address (7388a750)]
735782eb 8b5de4          mov     ebx,dword ptr [ebp-1Ch]
735782ee 7311            jae     coreclr!WKS::gc_heap::revisit_written_page+0x3d1 (73578301)

@Maoni0
Member

Maoni0 commented Oct 28, 2022

thanks @jkotas. @cshung is making a fix for what Jan mentioned above. we'll be reviewing the code to see if there are other places we need to fix.

@cshung cshung linked a pull request Oct 31, 2022 that will close this issue
@ghost ghost locked as resolved and limited conversation to collaborators Nov 30, 2022