Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Removed] #1588

Closed
adamsitnik opened this issue Nov 10, 2020 · 9 comments
Closed

[Removed] #1588

adamsitnik opened this issue Nov 10, 2020 · 9 comments

Comments

@adamsitnik
Copy link
Member

adamsitnik commented Nov 10, 2020

The study is not finished yet, I am going to create this issue now and add a comment for every benchmark that has a significant difference and afterward update the description of this issue and provide a single summary. There are simply too many benchmarks to put them in one GH comment ;)

BDN change: dotnet/BenchmarkDotNet#1587
Since this change requires us to call [GlobalSetup] and [GlobalCleanup] more than once I had to:

List of benchmarks dependent on memory alignment for which enabling the feature helps:

List of benchmarks where it does not help but changes the reported time:

Notes: we might need more than 20 iterations and we should not be removing upper outliers
Note: we should not enable it by default for all benchmarks

@adamsitnik
Copy link
Member Author

As expected, reallocating the arrays after every iteration helps with CopyTo benchmarks. We are now getting non-flat distribution with a single run, while previously it was always flat depending on the initial alignment.

Sample distribution:

Before:
-------------------- Histogram --------------------
[87.431 ns ; 89.140 ns) | @@@@@@@@@@@@@
---------------------------------------------------

After:
-------------------- Histogram --------------------
[ 79.009 ns ; 183.239 ns) | @@@@@@@@@@
[183.239 ns ; 287.047 ns) | @
[287.047 ns ; 391.277 ns) | @@@@@@@@
[391.277 ns ; 501.821 ns) | @
---------------------------------------------------

Note: we might need to run more than 20 iterations to get the full distribution and we can't exclude outliers. Otherwise for the distribution below:

-------------------- Histogram --------------------
[ 70.333 ns ; 151.954 ns) | @@
[151.954 ns ; 224.021 ns) | @@@@@@@@@@@@@@
[224.021 ns ; 262.438 ns) | @@@
[262.438 ns ; 334.505 ns) | 
[334.505 ns ; 421.098 ns) | 
[421.098 ns ; 493.165 ns) | @
---------------------------------------------------

The 421.098 ns ; 493.165 ns bucket with just a single result might get recognized as an outlier and removed.

Type Method OutlierMode Random Size Mean Median Min Max Ratio
CopyTo<Int32> Array DontRemove True 2048 195.86 ns 188.54 ns 106.37 ns 457.13 ns 1.23
CopyTo<Int32> Array RemoveUpper False 2048 160.13 ns 158.17 ns 156.13 ns 167.71 ns 1.00
CopyTo<String> Array DontRemove True 2048 497.14 ns 493.98 ns 473.03 ns 521.19 ns 1.09
CopyTo<String> Array RemoveUpper False 2048 457.89 ns 458.05 ns 455.23 ns 460.50 ns 1.00
CopyTo<Int32> Span DontRemove True 2048 240.38 ns 187.61 ns 106.56 ns 449.71 ns 2.82
CopyTo<Int32> Span RemoveUpper False 2048 88.11 ns 88.06 ns 87.62 ns 88.95 ns 1.00
CopyTo<String> Span DontRemove True 2048 551.28 ns 553.19 ns 525.40 ns 561.59 ns 1.01
CopyTo<String> Span RemoveUpper False 2048 547.60 ns 548.00 ns 544.06 ns 550.11 ns 1.00
CopyTo<Int32> ReadOnlySpan DontRemove True 2048 304.46 ns 243.03 ns 107.91 ns 512.46 ns 3.52
CopyTo<Int32> ReadOnlySpan RemoveUpper False 2048 88.72 ns 88.83 ns 87.90 ns 90.13 ns 1.00
CopyTo<String> ReadOnlySpan DontRemove True 2048 542.40 ns 540.23 ns 529.51 ns 570.88 ns 1.06
CopyTo<String> ReadOnlySpan RemoveUpper False 2048 516.74 ns 517.69 ns 510.92 ns 522.16 ns 1.00
CopyTo<Int32> Memory DontRemove True 2048 320.58 ns 316.07 ns 110.51 ns 518.71 ns 3.41
CopyTo<Int32> Memory RemoveUpper False 2048 93.92 ns 92.91 ns 91.43 ns 100.19 ns 1.00
CopyTo<String> Memory DontRemove True 2048 564.98 ns 562.26 ns 546.27 ns 588.31 ns 1.00
CopyTo<String> Memory RemoveUpper False 2048 564.37 ns 563.69 ns 558.38 ns 572.91 ns 1.00
CopyTo<Int32> ReadOnlyMemory DontRemove True 2048 277.53 ns 285.12 ns 108.95 ns 415.08 ns 3.06
CopyTo<Int32> ReadOnlyMemory RemoveUpper False 2048 91.81 ns 91.81 ns 91.33 ns 92.37 ns 1.00
CopyTo<String> ReadOnlyMemory DontRemove True 2048 584.96 ns 584.08 ns 560.89 ns 600.57 ns 1.12
CopyTo<String> ReadOnlyMemory RemoveUpper False 2048 521.95 ns 522.85 ns 513.46 ns 526.16 ns 1.00
CopyTo<Int32> List DontRemove True 2048 394.03 ns 416.93 ns 223.23 ns 467.90 ns 0.84
CopyTo<Int32> List RemoveUpper False 2048 470.78 ns 471.84 ns 465.79 ns 474.09 ns 1.00
CopyTo<String> List DontRemove True 2048 503.69 ns 493.76 ns 477.75 ns 561.17 ns 1.02
CopyTo<String> List RemoveUpper False 2048 486.84 ns 486.59 ns 481.98 ns 491.72 ns 1.00
CopyTo<Int32> ImmutableArray DontRemove True 2048 338.59 ns 373.97 ns 220.69 ns 505.24 ns 1.57
CopyTo<Int32> ImmutableArray RemoveUpper False 2048 222.62 ns 222.42 ns 221.79 ns 224.19 ns 1.00
CopyTo<String> ImmutableArray DontRemove True 2048 494.27 ns 489.13 ns 471.58 ns 526.61 ns 1.04
CopyTo<String> ImmutableArray RemoveUpper False 2048 478.20 ns 477.09 ns 473.28 ns 484.69 ns 1.00

@adamsitnik
Copy link
Member Author

The solution is not a silver bullet, there are some unstable benchmarks like System.Collections.CreateAddAndClear.Stack
where randomizing memory does not help. We still have a flat distribution, but the reported time slightly changes (most probably due to not having a perfectly warmed up memory):

Before:

-------------------- Histogram --------------------
[1.846 us ; 1.960 us) | @@@@@@@@@@@@@@@
---------------------------------------------------

After:

-------------------- Histogram --------------------
[2.087 us ; 2.204 us) | @@@@@@@@@@@@@@@
---------------------------------------------------
Method OutlierMode MemoryRandomization Size Mean Median Min Max Ratio Allocated
Stack DontRemove True 512 2.130 us 2.122 us 2.093 us 2.188 us 1.13 4 KB
Stack RemoveUpper False 512 1.883 us 1.874 us 1.850 us 1.944 us 1.00 4 KB

While in the Reporting System we can see:

obraz

@adamsitnik
Copy link
Member Author

There are also benchmarks that were quite stable so far and enabling this feature turns them into unstable benchmarks.

Example:

Before:

-------------------- Histogram --------------------
[136.962 ns ; 144.806 ns) | @@@@@@@@@@@@@@
---------------------------------------------------

After:

-------------------- Histogram --------------------
[137.882 ns ; 153.224 ns) | @@@@@@@@@@@@@@@@
[153.224 ns ; 160.511 ns) |
[160.511 ns ; 175.853 ns) | @@@
[175.853 ns ; 195.154 ns) |
[195.154 ns ; 210.496 ns) | @
---------------------------------------------------

Historical Data shows that it's a stable benchmark:

obraz

However, we can't be realy sure if consider the historical outliers:

obraz

@adamsitnik
Copy link
Member Author

adamsitnik commented Nov 10, 2020

For System.Memory.Span.IndexOfAnyThreeValues which is quite unstable when looking at historical data:

obraz

The randomization brings the desired effect:

Before:

-------------------- Histogram --------------------
[19.563 ns ; 19.919 ns) | @@@@@@@@@@@@@@
---------------------------------------------------

After:

-------------------- Histogram --------------------
[18.258 ns ; 20.703 ns) | @@@@@@@@@@@@@@@@
[20.703 ns ; 22.180 ns) | @
[22.180 ns ; 25.094 ns) |
[25.094 ns ; 28.005 ns) | @@@
---------------------------------------------------

@adamsitnik
Copy link
Member Author

The same goes for System.Collections.ContainsTrue<Int32>.Span:

Before:

-------------------- Histogram --------------------
[36.393 us ; 37.218 us) | @@@@@@@@@@@@@@@
---------------------------------------------------

After:

-------------------- Histogram --------------------
[36.313 us ; 37.255 us) | @@@@@@@@@@@@@@@@@@
[37.255 us ; 38.198 us) |
[38.198 us ; 39.683 us) | @
[39.683 us ; 40.931 us) | @
---------------------------------------------------

obraz

@danmoseley
Copy link
Member

@adamsitnik I'm following along as an observer, but I'm confused by your last two histograms above. Those seem to be more multimodal after your changes, not less?

@adamsitnik
Copy link
Member Author

Those seem to be more multimodal after your changes, not less?

@danmosemsft You are right and this is a very good question.

Some benchmarks seem to be stable when you run them just once. Example:

-------------------- Histogram --------------------
[19.563 ns ; 19.919 ns) | @@@@@@@@@@@@@@
--------------------------------------------------

But when you run them many times, they turn to be unstable or multimodal. Example:

obraz

Very often it's caused by code|memory alignment changes. When it comes to memory, we recommend allocating whatever is needed for the benchmark run in a [GlobalSetup] method. In the case of this particular benchmark, we allocate the array once and use it for a given benchmark run. If it's aligned, then we get one mode, when it's not we get another.

[GlobalSetup]
public void Setup()
{
_found = ValuesGenerator.ArrayOfUniqueValues<T>(Size);
_array = _found.ToArray();

With the "memory randomization", BDN calls [GlobalSetup] after every benchmark iteration. By doing that we test the same code for different inputs (aligned and not aligned). We don't get a nice flat distribution, but the distribution that shows all possible scenarios (at least this is what I hope for).

The problem that we have encountered while working on the new bot is benchmarks that... tend to stay in one mode for a while (typically weeks) after they switch to 2nd mode for another few weeks.

With this "randomization" I hope that we can switch from using averages in the historical data to "Median" and by doing that flatten the charts.

So for example for this histogram:

-------------------- Histogram --------------------
[18.258 ns ; 20.703 ns) | @@@@@@@@@@@@@@@@
[20.703 ns ; 22.180 ns) | @
[22.180 ns ; 25.094 ns) |
[25.094 ns ; 28.005 ns) | @@@
---------------------------------------------------

we would show a value from the first bucket (18.258 ns ; 20.703) and most probably never anything from 25.094 ns ; 28.005 ns

@danmoseley
Copy link
Member

Got it - thanks for the explanation.

@adamsitnik adamsitnik changed the title Memory Randomization experiment study [Removed] Nov 18, 2020
@adamsitnik
Copy link
Member Author

I've decided to create a new issue with a final report: #1602

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants