AVX512 codegen issues #100404
Comments
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
The issue looks to be rooted in AVX-512 emitting:
    vmovsd xmm1, qword ptr [r8]
    vmulpd xmm0, xmm0, xmm1
While AVX2 is emitting:
    vmovsd xmm0, qword ptr [r8]
    vmovddup xmm0, xmm0
    vmulpd xmm0, xmm0, xmmword ptr [rcx]
This looks to be a disconnect in the containment logic around embedded broadcasts, as it is expected AVX-512 would have instead emitted |
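To make the effect of the missing broadcast concrete, here is a minimal sketch that models the two instruction sequences with portable APIs. It assumes that `vmovsd` from memory zeroes the upper lane, which is the behavior `Vector128.CreateScalar` is documented to have; the names `BroadcastSketch`, `ScalarLoadOnly` and `Broadcast` are hypothetical and exist only for illustration.

```csharp
using System;
using System.Runtime.Intrinsics;

public static class BroadcastSketch
{
    // Models the bad sequence (vmovsd + vmulpd): the scalar is loaded into
    // lane 0 only, so lane 1 of the multiplier is zero.
    public static Vector128<double> ScalarLoadOnly(Vector128<double> m0, double x)
        => m0 * Vector128.CreateScalar(x);

    // Models the AVX2 sequence (vmovsd + vmovddup + vmulpd): the scalar is
    // duplicated into both lanes before the multiply.
    public static Vector128<double> Broadcast(Vector128<double> m0, double x)
        => m0 * Vector128.Create(x);

    public static void Main()
    {
        Vector128<double> one = Vector128<double>.One;

        // Lane 1 is 1 * 0 = 0 here, which is the wrong result for Create(x).
        Console.WriteLine(ScalarLoadOnly(one, 2.0));

        // Both lanes are 1 * 2 = 2 here.
        Console.WriteLine(Broadcast(one, 2.0));
    }
}
```

If this model is right, any lane above the first silently gets multiplied by zero, which would explain why the results differ rather than crash.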
CC. @dotnet/avx512-contrib |
The following is a minimal repro:
using System;
using System.Runtime.CompilerServices;
using System.Runtime.Intrinsics;

internal class Program
{
    private static void Main(string[] args)
    {
        Console.WriteLine(Map(Vector128<double>.One, new FloatPoint(2.0, 3.0)));
    }

    // NoInlining keeps Map as a separate method so its codegen can be inspected.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static Vector128<double> Map(Vector128<double> m0, FloatPoint point)
    {
        return m0 * Vector128.Create(point.X);
    }
}

public struct FloatPoint(double x, double y)
{
    public double X = x;
    public double Y = y;
} |
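As a rough sketch, the repro can also check itself against a lane-by-lane scalar computation instead of relying on someone reading the printed vector. `ReproCheck` and `Matches` are hypothetical names; `Map` and `FloatPoint` are copied from the repro above.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.Intrinsics;

public struct FloatPoint(double x, double y)
{
    public double X = x;
    public double Y = y;
}

public static class ReproCheck
{
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static Vector128<double> Map(Vector128<double> m0, FloatPoint point)
    {
        return m0 * Vector128.Create(point.X);
    }

    // Compares every lane of the vectorized result with plain scalar math.
    public static bool Matches(Vector128<double> m0, FloatPoint point)
    {
        Vector128<double> actual = Map(m0, point);
        for (int i = 0; i < Vector128<double>.Count; i++)
        {
            double expected = m0.GetElement(i) * point.X;
            if (actual.GetElement(i) != expected)
            {
                return false;
            }
        }
        return true;
    }

    public static void Main()
    {
        bool ok = Matches(Vector128<double>.One, new FloatPoint(2.0, 3.0));
        Console.WriteLine(ok ? "OK" : "MISMATCH");
    }
}
```

On a runtime with the codegen problem described in this thread, this would be expected to print MISMATCH; on a fixed runtime it should print OK.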
It looks like this was fixed in .NET 9, so potentially just something that hasn't been backported or that isn't in the latest .NET 8 release yet. |
This is #97783, which hasn't been backported. |
The fix was backported. |
Description
.NET 8: [image not preserved]
.NET 9 (or .NET 8 with AVX512F_VL=0): [image not preserved]
Reproduction Steps
It is terribly difficult to narrow down the issue.
There is a "massive" method at https://github.com/TechPizzaDev/SharpBlaze/blob/f330516bd8e88b607e36b6362e06b90f30472029/SharpBlaze/Linearizer.cs#L459. This method contains a lot of smaller inlined methods, and since most of the small methods are marked with AggressiveInlining, the massive ProcessUncontained method turns out to be around 7 kB of JIT ASM, and the JIT gives up on inlining at the end, which causes helper methods of InlineArray to show up.
As soon as the JIT gives up on inlining (by tweaking MethodImplOptions in hot paths), the issues seemingly start to appear.
This project is a verbatim port of https://github.com/aurimasg/blaze to C#, and the 30k-Paris assets can be found at that original repo.
Expected behavior
Inlining and tiered compilation do not affect behavior. Release and debug builds create the same result.
Actual behavior
Tiered compilation changes the result (most likely because of inlining changing between JIT tiers).
Changing from [MethodImpl(MethodImplOptions.AggressiveInlining)] to [MethodImpl(MethodImplOptions.NoInlining)] reduces the problem in some hot paths (tiered compilation will sometimes stabilize to a correct result, but the result changing at all should not occur): https://github.com/TechPizzaDev/SharpBlaze/blob/f330516bd8e88b607e36b6362e06b90f30472029/SharpBlaze/Linearizer.cs#L938-L940
Debug builds produce a correct result unlike Release builds.
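One way to catch a result that changes between JIT tiers is to call the same computation repeatedly and compare each result bitwise against the first call. The following is only a sketch of that idea: `TierStabilityCheck`, `FirstDivergence` and `Sample` are hypothetical names, and `Sample` is a stand-in for the real rasterization step, not code from the project.

```csharp
using System;
using System.Threading;

public static class TierStabilityCheck
{
    // Returns the index of the first call whose result differs bitwise from
    // the very first call, or -1 if all results agree.
    public static int FirstDivergence(Func<double> compute, int iterations, int pauseEvery = 0)
    {
        long baseline = BitConverter.DoubleToInt64Bits(compute());
        for (int i = 1; i < iterations; i++)
        {
            // Occasional short sleeps give the background tiering thread a
            // chance to install re-jitted code while the loop is running.
            if (pauseEvery > 0 && i % pauseEvery == 0)
            {
                Thread.Sleep(1);
            }

            if (BitConverter.DoubleToInt64Bits(compute()) != baseline)
            {
                return i;
            }
        }
        return -1;
    }

    // Hypothetical stand-in for the code under test.
    private static double Sample() => Math.Sqrt(2.0) * 3.0;

    public static void Main()
    {
        int index = FirstDivergence(Sample, 100_000, 1_000);
        Console.WriteLine(index < 0 ? "stable" : $"diverged at call {index}");
    }
}
```

A deterministic computation should report "stable" here; a divergence would point at codegen differences between tiers rather than at the algorithm itself.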
Regression?
Since AVX512 plays a role, it seems unlikely that .NET 7 is affected (since AVX512 was only exposed in .NET 8).
Known Workarounds
Disable AVX512F_VL with DOTNET_EnableAVX512F_VL="0".
Updating to .NET 9 Preview does make the issues disappear, but looking at the disassembly, it could simply be because .NET 9 has better inlining, which could hide the underlying issue.
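To confirm that the environment variable actually reached the process, a small check like the following may help. It is a sketch that assumes `Avx512F.VL.IsSupported` reflects the `DOTNET_EnableAVX512F_VL` setting at startup; `WorkaroundCheck` and `Describe` are hypothetical names.

```csharp
using System;
using System.Runtime.Intrinsics.X86;

public static class WorkaroundCheck
{
    // Reports the raw value of the knob and what the runtime says about
    // AVX-512 VL support in this process.
    public static string Describe()
    {
        string knob = Environment.GetEnvironmentVariable("DOTNET_EnableAVX512F_VL") ?? "(unset)";
        bool vlSupported = Avx512F.VL.IsSupported;
        return "DOTNET_EnableAVX512F_VL=" + knob + ", Avx512F.VL.IsSupported=" + vlSupported;
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```

With the workaround applied on AVX-512 hardware, I would expect the second value to read False; the variable has to be set before the process starts, since the JIT reads it at startup.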
Configuration
.NET: 8.0.3
OS: Microsoft Windows 11 Home (10.0.22631 version 22631)
CPU: 11th Gen Intel(R) Core(TM) i3-1115G4 @ 3.00GHz
Arch: x64
Other information
I could not narrow the issue down to any specific instruction sequence, and the only obvious difference I could see at a glance between .NET 8 and .NET 9 was a few vbroadcasti32x4 instead of vmovups that look like the result of #92017, but this is unlikely to be the problem.
Here is a zip with some JIT disasm.