JIT optimization: loop head alignment #8108
Comments
This came up in #4400 |
Eugene has some evidence that 32 byte alignment matters for hot loops on recent CPUs. More crucial when loop bodies are small. We'd also, as a prerequisite, need to align method entries. Right now for x64 we always use 8 byte alignment (while somewhat oddly, on x86, we do things differently). R2R images don't honor alignment requests (fragile NIs do though). I would like to collect some distribution data on native code sizes so we could guess at likely costs for different method alignment algorithms. Probably need SPMI working in Core or could try it over in desktop. Benefit would be harder to assess. So, lots to sort through here. |
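To make the cost side of that trade-off concrete, here is a rough sketch of the padding arithmetic. It is not the JIT's layout algorithm: `AlignmentSketch`, `PaddingFor`, and `TotalPadding` are hypothetical names, and it assumes methods are simply laid out back to back with each entry rounded up to the requested power-of-two alignment.

```csharp
using System;

public static class AlignmentSketch
{
    // Bytes of padding needed to round `offset` up to the next multiple of
    // `alignment`. Assumes `alignment` is a power of two (e.g. 8 or 32).
    public static int PaddingFor(long offset, int alignment)
    {
        if (alignment <= 0 || (alignment & (alignment - 1)) != 0)
            throw new ArgumentException("alignment must be a power of two", nameof(alignment));
        return (int)(-offset & (alignment - 1));
    }

    // Total padding inserted when methods of the given native code sizes are
    // placed one after another, each entry point aligned to `alignment`.
    // Hypothetical layout model: real code heaps may place methods differently.
    public static long TotalPadding(int[] methodSizes, int alignment)
    {
        long offset = 0;
        long padding = 0;
        foreach (int size in methodSizes)
        {
            int pad = PaddingFor(offset, alignment);
            padding += pad;
            offset += pad + size;
        }
        return padding;
    }
}
```

Feeding a measured distribution of native code sizes into `TotalPadding` with alignments of 8 and 32 would give a first guess at the size cost of stricter method alignment; the same `PaddingFor` calculation applies to the bytes inserted in front of a loop head.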
what do you mean by unsupported? Even if I set |
IIRC setting the env var will do something, but is not guaranteed to do what you'd expect. |
@AndyAyersMS thanks! By the way, if you would like to test the performance of CoreCLR with some flag enabled/disabled, you can use the following config to run the default settings against a job with the flag configured:
static void Main(string[] args)
=> BenchmarkRunner.Run<Benchmarks>(
DefaultConfig.Instance
.With(DisassemblyDiagnoser.Create(new DisassemblyDiagnoserConfig(printAsm: true, printPrologAndEpilog: true)))
.With(Job.Default.With(Runtime.Core).AsBaseline())
.With(Job.Default.With(Runtime.Core).With(new EnvironmentVariable[1] { new EnvironmentVariable("COMPlus_JitAlignLoops", "1") }))
); |
I wonder whether it will be possible at some point in the future to allow loop head alignment on a per-method basis, e.g. with |
This has been enabled for x86/x64 by #44370. |
(I'm creating tracking issues for some optimizations that RyuJit doesn't perform, so we'll have a place to reference/note when we see the lack of them affecting particular benchmarks)
This one actually has an implementation plumbed through, but it's triggered only by the unsupported COMPlus_JitAlignLoops flag. It should probably be enabled by default in at least some configurations/methods.
category:cq
theme:loop-opt
skill-level:expert
cost:medium