Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix LLC cache issue on Apple M1 #64576

Merged
merged 3 commits into from
Feb 1, 2022
Merged

Conversation

EgorBo
Copy link
Member

@EgorBo EgorBo commented Jan 31, 2022

Addresses the L3 cache issue we've found in #60166 but for macOS-arm64

It seems that since macOS 12.0, Apple added more entries for sysctl (see #62832 (comment)). And it turns out the keys we were using report cache size for the "efficiency cores", not the performance ones. As the result, PAL_GetLogicalProcessorCacheSizeFromOS used to return 4mb instead of expected 12mb.

This change shows nice improvements for GC-bound benchmarks and should address #60616 (comment)

using System;
using System.Runtime.CompilerServices;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class Test
{
    public static void Main(string[] args) => 
        BenchmarkSwitcher.FromAssembly(typeof(Test).Assembly).Run(args);

    [Benchmark]
    [Arguments(10000)]
    [Arguments(128)]
    public char[] AllocateUninit(int len) => GC.AllocateUninitializedArray<char>(len);

    [Benchmark]
    [Arguments(2000)]
    [Arguments(256)]
    public char[] AllocateArray(int len) => new char[len];
    
    [Benchmark]
    public object SmallAllocation() => new object[] { new(), new() };

    [Benchmark]
    public string IntToString() => 42.ToString();
}

Results (Apple M1 mac mini):

Method Toolchain len Mean Error StdDev Ratio
SmallAllocation /Core_Root/corerun ? 12.695 ns 0.0422 ns 0.0352 ns 1.10
SmallAllocation /Core_Root_PR/corerun ? 11.534 ns 0.0453 ns 0.0423 ns 1.00
IntToString /Core_Root/corerun ? 6.239 ns 0.0058 ns 0.0052 ns 1.08
IntToString /Core_Root_PR/corerun ? 5.782 ns 0.0074 ns 0.0069 ns 1.00
AllocateUninit /Core_Root/corerun 128 14.195 ns 0.1914 ns 0.1790 ns 1.43
AllocateUninit /Core_Root_PR/corerun 128 9.952 ns 0.0622 ns 0.0582 ns 1.00
AllocateArray /Core_Root/corerun 256 26.071 ns 0.3068 ns 0.2562 ns 1.46
AllocateArray /Core_Root_PR/corerun 256 17.843 ns 0.1272 ns 0.1190 ns 1.00
AllocateArray /Core_Root/corerun 2000 180.445 ns 3.6465 ns 3.4110 ns 1.66
AllocateArray /Core_Root_PR/corerun 2000 108.966 ns 0.9273 ns 0.8674 ns 1.00
AllocateUninit /Core_Root/corerun 10000 529.512 ns 3.2077 ns 3.0005 ns 2.51
AllocateUninit /Core_Root_PR/corerun 10000 211.415 ns 3.5344 ns 3.3060 ns 1.00

@EgorBo
Copy link
Member Author

EgorBo commented Jan 31, 2022

Unrelated to this PR, when I play with DOTNET_GCgen0size with bigger values I get even better results for these benchmarks.

@EgorBo
Copy link
Member Author

EgorBo commented Jan 31, 2022

cc @dotnet/gc @janvorli

@EgorBo
Copy link
Member Author

EgorBo commented Feb 1, 2022

Btw, here is what sysctrl -a reports on my M1: https://gist.github.com/EgorBo/df63dfe24f5463009fbf2490e406b39c

@janvorli
Copy link
Member

janvorli commented Feb 1, 2022

Thanks @EgorBo for sharing the output of the sysctrl -a! There is much more interesting information than I expected.

Copy link
Member

@janvorli janvorli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

@EgorBo EgorBo merged commit 69b9000 into dotnet:main Feb 1, 2022
@JulieLeeMSFT JulieLeeMSFT added this to the 7.0.0 milestone Feb 8, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Mar 10, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants