Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent Native EXC_BAD_ACCESS signal for NullRefrenceExceptions #3909

Merged
merged 17 commits into from
Feb 3, 2025

Conversation

jamescrosswell
Copy link
Collaborator

@jamescrosswell jamescrosswell commented Jan 21, 2025

Resolves #3776:

Analysis

The following all appear to impact error reporting behaviour for iOS applications:

  1. The runtime (determined by whether AOT compilation is enabled or not)
  2. The ObjCRuntime.MarshalManagedExceptionMode
  3. Whether or not we try to suppress EXC_BAD_ACCESS errors (as in this PR)
  4. Whether or not the exception is caught/ignored by managed code

MONO vs CLR Runtime

When AOT compilation is not used, .NET iOS apps use the Mono runtime. When AOT compilation is used, .NET iOS apps use the CLR runtime. Managed exceptions are marshalled differently by the two different runtimes and unwinding native exceptions is currently not supported by the CLR.

MarshalManagedExceptionMode

In earlier versions of .NET, the AppDomain.CurrentDomain.UnhandledException was not triggered correctly unless the marshalling mode for managed exceptions was set to MarshalManagedExceptionMode.UnwindNativeCode. (see xamarin/xamarin-macios#15252 and the workaround in Sentry).

This has been fixed in net9.0 on the MONO runtime.

Scenarios

I tested the various different permutations of the above, with the following results:

Solution

AOT / CLR

  • Fix the issue of not getting NREs (unhandled exception handler not firing) in AOT Apps
  • Suppress EXC_BAD_ACCESS errors, to avoid duplication (this PR)

MONO

Notes on EXC_BAD_ACCESS suppression

We don't know, from the stack trace, whether the EXC_BAD_ACCESS comes from native or managed code. This workaround assumes it comes from managed code and will be reported separately as a managed exception.

Prior to this PR the behaviour was the opposite. All EXC_BAD_ACCESS errors were assumed to be native exceptions and reported by the native SDK (so often users would get a EXC_BAD_ACCESS, that was a bit confusing, in addition to an NRE that makes more sense to .NET devs)

Potential alternative: add an option to let users decide whether to suppress these native errors or not.

Copy link
Member

@bruno-garcia bruno-garcia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to only remove the events we'd like to drop. @filipnavara might have tips on how to do this better but ai think it's good enough

@jamescrosswell jamescrosswell marked this pull request as draft January 29, 2025 09:34
@jamescrosswell jamescrosswell marked this pull request as ready for review January 30, 2025 08:32
Copy link
Member

@bruno-garcia bruno-garcia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have a device test that verifies this? I imagine it would only be possible if we don't actually crash the app though

namespace Sentry.Cocoa;

/// <summary>
/// When AOT Compiling iOS applications, the AppDomain UnhandledExceptionHandler doesn't fire. So instead we intercept
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh we've been missing out on a category of errors?

Copy link
Collaborator Author

@jamescrosswell jamescrosswell Jan 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Essentially, yes. Matt created this workaround to the problem (after this conversation):

// Workaround for https://github.com/xamarin/xamarin-macios/issues/15252
ObjCRuntime.Runtime.MarshalManagedException += (_, args) =>
{
args.ExceptionMode = ObjCRuntime.MarshalManagedExceptionMode.UnwindNativeCode;
};

However that marshalling mode isn't supported when AOT compiling iOS apps. It causes xamarin to throw an assertion error, which is unrelated to the error from the user's code.

ex.SetSentryMechanism(
"Runtime.MarshalManagedException",
"This exception was caught by the .NET Runtime Marshal Managed Exception global error handler. " +
"The application may have crashed as a result of this exception.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not certain that it did crash?

Copy link
Collaborator Author

@jamescrosswell jamescrosswell Jan 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about that no. We're using this event instead of AppDomain.Current.UnhandledException because Runtime.MarshalManagedException fires even in AOT compiled applications but MarshalManagedExceptionEventArgs has no equivalent of UnhandledExceptionEventArgs.IsTerminating.

I've asked the question here:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But regardless of that bool, if that fires (on your tests) the app crashes? If so, I'd indicate that's the case. But keeping it vague in here is fine, I was just wondering how this works

src/Sentry/Platforms/Cocoa/SentrySdk.cs Show resolved Hide resolved
@jamescrosswell
Copy link
Collaborator Author

jamescrosswell commented Jan 31, 2025

Can we have a device test that verifies this? I imagine it would only be possible if we don't actually crash the app though

Tricky... the fix is using the native.OnBeforeSend handler.

An integration test might be possible but those are quite painful to build and maintain. Ivan basically has some pwsh scripts that create new MAUI apps with csharp source code, build and run these and then watch for stuff in the logs. I think they're good for smoke testing (making sure we can build and run JIT and AOT apps that init the SDK successfully) but not a good idea to try to build more sophisticated tests that way.

@jamescrosswell jamescrosswell merged commit 44c48f6 into main Feb 3, 2025
22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Sentry reports handled NRE as EXC_BAD_ACCESS signal
3 participants