-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comment out a false-positive assert that is firing and preventing jit-diffs #52462
Conversation
cc @dotnet/jit-contrib I think we should take this to unblock other work. |
So a |
@BruceForstall That's my understanding. We have a pointer (possibly to the stack or to native memory) and we turn it into a byref. In the case of When we have inlining, struct promotion, and other things at play, we will end up with something like the following, where
Since they are both assigned
We only do liveness updates for emitted code, eliding the move in codegen means that liveness update never happens. Updating codegen to still do the liveness update for such moves is non-trivial because we do not centrally elide the moves today. Instead, we have dozens of
|
I guess I'm having a hard time understanding if this is really a "false-positive" assert. From the scenario you describe, a byref goes live but we don't tell the GC about it. If there's a GC at the object pointed to is not otherwise rooted, then the GC can collect it, and if we don't report the GC byref, the GC doesn't need to update it. So it seems like it's actually a GC hole. (How byrefs "appear out of thin air" I guess is not worth considering since this is all unsafe code and the user sure better know what they're doing?) |
You don't have to be unsafe to create a byref out of thin air. e.g. Edit: never mind, that could still be a real ref 🤦 , but yeah, we magic byrefs out unmanaged pointers with Span all the time. |
AFAICT, this scenario can only happen when going from a Now, I could conceive a possible GC hole in a scenario where you go from var a = new int[30]; // a is a live gcref
ref var x = ref a[0]; // x is a live byref
a = null; // a is a dead gcref, the backing array is kept alive due to the x byref
fixed (int* p = &x) // p is a live pinned byref
{
x = ref Unsafe.NullRef<int>(); // x is a dead byref, the backing array is kept alive due to the p pinned byref
int* t = p; // t is a pointer, not a pinned byref
x = ref Unsafe.AsRef<int>(t); // x is a live byref, it still points to the pinned backing array
} // p is a dead pinned byref, x should be keeping the backing array alive
GC.Collect();
Console.WriteLine(x); Now, if It's possible that this isn't happening in practice due to register allocation or a number of other scenarios. But I could conceive it being an actual hole in some edge case scenario like this. But maybe someone more knowledgeable here could weigh in. |
@BruceForstall I have many of the same concerns from you. It is not easy to understand how a reg-reg mov can be the source of a GC ref. But presumably we're already "handling" this case today in cases where the registers are different, it's just when they are the same that we don't do the GC liveness updates. Tanner has a preliminary set of changes where we make the GC updates in the same reg-reg mov case, but they're extensive and still don't resolve all the cases of this issue. I've also seen cases where the allocator uses commutativity to reorder a last-use add to a byref and introduces a small window where there is no live byref, and the assert fires (see #10821 (comment)). So I think the assert is intrinsically flawed and is now giving us a lot of false positive cases and so we should disable it for now as we work on a better solution. My thinking is that the only comfortable resolution to this problem is to walk back up the data flow, and we'll find either pointer being retyped as a byref or a pin, or else something we can't track further (say an arg); once we have that we can retype the entire lifetime as pointer or byref as appropriate. My understanding is this sort of "gc rooting" flow analysis is akin to how NUTC did (does) gc reporting, instead of the gc pointer kind distinctions we try and maintain throughout the jit today. But all that requires dataflow. Figuring out where this can/should happen is not straightforward. Since most of these issues seem to arise from inlining, the natural place for this analysis is perhaps the object stack allocation phase; it happens to already run a very similar analysis since it needs to change GC types. |
@BruceForstall, @AndyAyersMS; #52661 is an alternative to this. It centralizes the move management and results in zero PMI diffs for x64 S.P.Corelib (still validating for tests and other platforms). It's relatively small (+854, -820 lines) and ensures that the relevant liveness updates happen so also resolves the assert. |
Putting this up as suggested by @AndyAyersMS on Discord.
As per the comment in code and the analysis in #51728, we have many scenarios where a
pointer
is converted to abyref
. These conversions can range from something as simple asref *x
to things likeref T Unsafe.AsRef<T>(void*)
to evennew Span<T>(void*, int)
.This conversion is correctly tracked and represented in metadata. However, if the
pointer
isenregistered
and "last use" the register allocator may decide thebyref
will exist in the same register. There are many places in codegen that have logic similar to:Due to how many locations this check exists, it is non trivial to go and fix this to be handled correctly everywhere today. Additionly, for most scenarios, this is a "safe" optimization. However, when that move is for a
pointer<->byref
conversion, eliding this move means that the liveness update will never occur and so asserts such as the one listed will fire due to the mismatch.