Random System.InvalidCastException thrown when using Windows.Graphics.Imaging.BitmapDecoder and OcrEngine #762

Closed
Nukepayload2 opened this issue Mar 8, 2021 · 6 comments · Fixed by #833
Labels
bug Something isn't working fixed Issue has been fixed in an upcoming or existing release

@Nukepayload2

Describe the bug
System.Private.CoreLib.dll throws System.InvalidCastException randomly when using the WinRT APIs Windows.Graphics.Imaging.BitmapDecoder and Windows.Media.Ocr.OcrEngine. On average, about 90% of the calls succeed. The other 10% of the calls to BitmapDecoder.CreateAsync(IRandomAccessStream), BitmapDecoder.GetSoftwareBitmapAsync, or OcrEngine.RecognizeAsync throw the exception.

Typical exception message:

Exception thrown: 'System.InvalidCastException' in System.Private.CoreLib.dll
Unable to cast object of type 'Windows.Foundation.AsyncOperationWithProgressCompletedHandler`2[Windows.Storage.Streams.IBuffer,System.UInt32]' to type 'Windows.Graphics.Imaging.BitmapDecoder'.

To Reproduce

Download the sample project and run it. Click the "Test 100 times" button and wait until the sum of the three numbers is 100. If the number under the "Information" text block is 100, click the button again.
WinRTCallUnstable.zip

When the problem is successfully reproduced, the error count is greater than 0.

Expected behavior
System.InvalidCastException should not be thrown.

Version Info

I don't use the CsWinRT NuGet package directly, but my application calls WinRT APIs through WinRT.Runtime.dll, which is generated by CsWinRT. The target framework is net5.0-windows10.0.18362.0.

.NET SDK (reflecting any global.json):
Version: 5.0.200
Commit: 70b3e65d53

Runtime Environment:
OS Name: Windows
OS Version: 10.0.19042
OS Platform: Windows
RID: win10-x64
Base Path: C:\Program Files\dotnet\sdk\5.0.200\

Host (useful for support):
Version: 5.0.3
Commit: c636bbdc8a

Visual Studio 2019 16.9.0

Additional context
This problem is probably related to threading or garbage collection. I added try ... catch blocks to retry WinRT API calls in my code as a workaround. I don't know whether my workaround is safe, because I'm not sure whether this bug can cause memory corruption.

@Nukepayload2 Nukepayload2 added the bug Something isn't working label Mar 8, 2021
@manodasanW manodasanW self-assigned this Mar 8, 2021
@angelazhangmsft angelazhangmsft added this to the Release 1.1.4 milestone Mar 11, 2021
@manodasanW
Member

Confirmed that this repros with the agile reference changes related to this in 1.1.4. Further investigation is needed.

@manodasanW
Member

Also to note this only repros when running on a machine / VM with a lower amount of RAM.

@manodasanW
Member

Investigation seems to show that pointer values are being reused (once for an async operation, another time for an IBuffer). That causes an old RCW to be brought back, which results in an InvalidCastException because the RCW is for the wrong type.

@manodasanW
Member

It turns out this issue is not related to running with a lower amount of RAM, as I previously indicated. It only reproduced on my VM and not on my machine because of differences in Windows build numbers. Recent Windows Insider builds don't seem to hit the issue, but the RTM builds, which my VM was on, do. At the same time, the issue is not caused by changes in Windows. Rather, it is an issue between CsWinRT and .NET 5 that depends on when finalizers run.

For context, CsWinRT registers the RCW object with the .NET ComWrappers API for the respective ptr, but the lifetime of the ptr is managed by an IObjectReference object stored in the RCW. What is happening is that the finalizer on the IObjectReference has run and let go of all the references for a ptr, which allows the ptr to be reused. However, the RCW hasn't been collected yet, and neither has the sync block for the RCW, which is what removes the RCW from the ComWrappers cache. So at this point, if the ptr is used for a new object, .NET thinks it is for the same object that is in its cache, brings the cached RCW back alive, and returns it. That causes an InvalidCastException because the cached RCW is not for the new object and is for another type.
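
The sequence described above can be sketched with a toy model. This is Python, not .NET, and it does not use the real ComWrappers API: `ToyHeap`, `Wrapper`, `WrapperCache`, `cast`, and the type name "CompletedHandler" are hypothetical names, and `TypeError` stands in for InvalidCastException. The only point is that a cache keyed by pointer value returns a stale wrapper once the address is reused.

```python
class ToyHeap:
    """Toy native allocator that hands a freed address to the next request."""

    def __init__(self):
        self._next = 0x1000
        self._free = []

    def alloc(self):
        if self._free:
            return self._free.pop()
        addr = self._next
        self._next += 0x10
        return addr

    def release(self, addr):
        self._free.append(addr)


class Wrapper:
    """Toy stand-in for an RCW: remembers which native type it wraps."""

    def __init__(self, ptr, type_name):
        self.ptr = ptr
        self.type_name = type_name


class WrapperCache:
    """Toy pointer-keyed cache (an assumption, not the real implementation)."""

    def __init__(self):
        self._by_ptr = {}

    def get_or_create(self, ptr, type_name):
        # A cache hit is decided by the pointer value alone.
        if ptr not in self._by_ptr:
            self._by_ptr[ptr] = Wrapper(ptr, type_name)
        return self._by_ptr[ptr]


def cast(wrapper, expected_type):
    if wrapper.type_name != expected_type:
        raise TypeError(
            f"Unable to cast object of type '{wrapper.type_name}' "
            f"to type '{expected_type}'."
        )
    return wrapper


heap, cache = ToyHeap(), WrapperCache()

# 1. A native object is wrapped and the wrapper is cached by address.
handler_ptr = heap.alloc()
cache.get_or_create(handler_ptr, "CompletedHandler")

# 2. The reference holder is finalized: the native object is released,
#    but the cache entry for its address has not been removed yet.
heap.release(handler_ptr)

# 3. A new native object of a different type lands on the same address,
#    so the lookup returns the old wrapper.
decoder_ptr = heap.alloc()
stale = cache.get_or_create(decoder_ptr, "BitmapDecoder")

try:
    cast(stale, "BitmapDecoder")
    outcome = "ok"
except TypeError as error:
    outcome = str(error)
```

In this model the failure depends entirely on whether step 2 happens before the cache entry is removed, which matches the intermittent nature of the report.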

The real fix, after discussion with the .NET folks, is for .NET to add a new API that allows CsWinRT to remove a registered RCW from the ComWrappers cache when it is no longer alive. This is being looked at for .NET 6 and is tracked by dotnet/runtime#51968.

But to address this for current .NET 5 consumers, CsWinRT would need a mitigation for the issue. The mitigation being considered is to make the final release on the ptr occur only after the RCW has been finalized and the sync block has been finalized. There is no reliable way of telling when this happens, but we can try to achieve it by making the release happen in Gen2 finalization. It seems this can be achieved by registering the object for finalization twice.
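
The ordering idea behind this mitigation can be illustrated with a small model. This is a Python sketch under stated assumptions, not CsWinRT code: `ToyRuntime`, `run_scenario`, and the `release_after_eviction` flag are hypothetical, and the model does not represent garbage collection generations or re-registration for finalization. It only shows that an address cannot be reused against a stale cache entry if the release waits until that entry is gone.

```python
class ToyRuntime:
    """Toy model of address reuse against a pointer-keyed wrapper cache."""

    def __init__(self, release_after_eviction):
        self.release_after_eviction = release_after_eviction
        self.next_addr = 0x1000
        self.free_addrs = []       # addresses the allocator may hand out again
        self.pending_release = []  # releases held back until eviction
        self.cache = {}            # address -> wrapped type name

    def create(self, type_name):
        if self.free_addrs:
            addr = self.free_addrs.pop()
        else:
            addr = self.next_addr
            self.next_addr += 0x10
        cached_type = self.cache.setdefault(addr, type_name)
        if cached_type != type_name:
            # TypeError stands in for InvalidCastException in this model.
            raise TypeError(f"cached wrapper is for '{cached_type}'")
        return addr

    def finalize_reference(self, addr):
        # The reference holder's finalizer runs before the cache entry is gone.
        if self.release_after_eviction:
            self.pending_release.append(addr)
        else:
            self.free_addrs.append(addr)

    def evict_wrapper(self, addr):
        # The wrapper leaves the cache; any held-back release can now happen.
        self.cache.pop(addr, None)
        if addr in self.pending_release:
            self.pending_release.remove(addr)
            self.free_addrs.append(addr)


def run_scenario(release_after_eviction):
    runtime = ToyRuntime(release_after_eviction)
    first = runtime.create("CompletedHandler")
    runtime.finalize_reference(first)
    try:
        runtime.create("BitmapDecoder")
    except TypeError:
        return "invalid cast"
    runtime.evict_wrapper(first)
    return "ok"


eager = run_scenario(release_after_eviction=False)
deferred = run_scenario(release_after_eviction=True)
```

With the eager release the second object reuses the first address and hits the stale entry; with the deferred release it gets a fresh address instead.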

@manodasanW
Member

The previous fix was reverted because it introduced other issues. The new plan to address this issue is a fix in the dotnet runtime (both .NET 5 and .NET 6). See the referenced PRs.

@manodasanW
Member

This is confirmed to be fixed in the upcoming .NET servicing update (5.0.8).

@angelazhangmsft angelazhangmsft added the fixed Issue has been fixed in an upcoming or existing release label Jul 15, 2021

4 participants