[arm64] JIT: Recognize sbfiz/ubfiz idioms #61045
Tagging subscribers to this area: @JulieLeeMSFT
Issue Details
This PR recognizes '(ulong)x << cns' idioms in order to emit sbfiz/ubfiz. This pattern shows up in array accesses, and while I am planning to fix array accesses differently in #61026, it still makes sense to have this peephole for explicit patterns.
// explicit pattern:
static ulong Test1(uint x) => ((ulong)x) << 2;
// implicit pattern:
static int Test2(int[] array, int i) => array[i];
Codegen diff:
; Method Prog:Test1(int):long
G_M16463_IG01:
stp fp, lr, [sp,#-16]!
mov fp, sp
G_M16463_IG02:
- mov w0, w0
- lsl x0, x0, #2
+ ubfiz x0, x0, #2, #32
G_M16463_IG03:
ldp fp, lr, [sp],#16
ret lr
-; Total bytes of code: 24
+; Total bytes of code: 20
; Method Prog:Test2(System.Int32[],int):int
G_M15622_IG01:
stp fp, lr, [sp,#-16]!
mov fp, sp
G_M15622_IG02:
ldr w2, [x0,#8]
cmp w1, w2
bhs G_M15622_IG04
- mov w1, w1
- lsl x1, x1, #2
+ ubfiz x1, x1, #2, #32
add x1, x1, #16
ldr w0, [x0, x1]
G_M15622_IG03:
ldp fp, lr, [sp],#16
ret lr
G_M15622_IG04:
bl CORINFO_HELP_RNGCHKFAIL
bkpt
-; Total bytes of code: 52
+; Total bytes of code: 48
Diffs are impressive and give us a hint that we should implement proper "addressing modes" for arm64 asap 🙂
coreclr_tests.pmi.Linux.arm64.checked.mch:
Detail diffs
libraries.crossgen2.Linux.arm64.checked.mch:
Detail diffs
libraries.pmi.Linux.arm64.checked.mch:
Detail diffs
libraries_tests.pmi.Linux.arm64.checked.mch:
Detail diffs
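To make the codegen diff above easier to follow, here is a minimal C# sketch of what the two instructions compute. It is not JIT code: the class and method names (BfizSketch, Ubfiz, Sbfiz) are hypothetical helpers introduced only for illustration, and they model the architectural behavior of `ubfiz`/`sbfiz Xd, Xn, #lsb, #width` under the assumption that `lsb` and `width` are in the encodable range (1 <= width <= 64 - lsb).

```csharp
using System;

// Hypothetical helpers that emulate the arm64 bitfield-insert-in-zero
// instructions on 64-bit registers. Illustration only, not JIT code.
static class BfizSketch
{
    // UBFIZ Xd, Xn, #lsb, #width:
    // take the low `width` bits of src, place them at bit `lsb`,
    // and zero every other bit of the result.
    public static ulong Ubfiz(ulong src, int lsb, int width)
    {
        ulong mask = width == 64 ? ulong.MaxValue : (1UL << width) - 1;
        return (src & mask) << lsb;
    }

    // SBFIZ Xd, Xn, #lsb, #width:
    // same as above, but the bits above the field are copies of the
    // field's sign bit instead of zeros.
    public static long Sbfiz(long src, int lsb, int width)
    {
        int unused = 64 - width;
        long field = (src << unused) >> unused; // arithmetic shift sign-extends
        return field << lsb;
    }
}
```

With these definitions, `BfizSketch.Ubfiz(x, 2, 32)` gives the same value as `((ulong)(uint)x) << 2` even when the upper 32 bits of `x` hold garbage, which is why a single `ubfiz x0, x0, #2, #32` can replace the `mov w0, w0` / `lsl x0, x0, #2` pair. The signed counterpart, `BfizSketch.Sbfiz(x, 2, 32)`, matches `((long)(int)x) << 2`.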
|
runtime/src/libraries/System.Text.RegularExpressions/gen/RegexGenerator.Emitter.cs Line 96 in 034d5e2
has the most regressions. Is there anything special about this pattern that disagrees with the bfiz optimization? |
@kasperk81 it's loop-alignment artifacts, e.g. https://www.diffchecker.com/La4x8Sgz |
@SingleAccretion could you please take another look, I added smallint support (diffs are updated) and removed redundant checks. |
using System.Collections.Generic;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class Program
{
    static void Main(string[] args) =>
        BenchmarkSwitcher.FromAssembly(typeof(Program).Assembly).Run(args);

    public IEnumerable<object[]> TestData()
    {
        yield return new object[] { new int[100], new int[100] };
        yield return new object[] { new int[1000], new int[1000] };
        yield return new object[] { new int[100000], new int[100000] };
    }

    [Benchmark]
    [ArgumentsSource(nameof(TestData))]
    public void CopyArray(int[] src, int[] dst)
    {
        for (int i = 0; i < src.Length; i++)
            dst[i] = src[i];
    }
}
Results on Apple M1 arm64:
(tested with and without loop alignment)
codegen diff: https://www.diffchecker.com/pLkuGMn6 (yes, the address calculation is still not perfect and is not hoisted, but I am working on it) |
Looks great to me (modulo one note)!
@dotnet/jit-contrib PTAL, should be ready to review/merge |
arm64 improvements: dotnet/perf-autofiling-issues#2247 and dotnet/perf-autofiling-issues#2248 |