[Removed] #1588
As expected, reallocating the arrays after every iteration helps with the sample distribution.

Before:
-------------------- Histogram --------------------
[87.431 ns ; 89.140 ns) | @@@@@@@@@@@@@
---------------------------------------------------
After:
-------------------- Histogram --------------------
[ 79.009 ns ; 183.239 ns) | @@@@@@@@@@
[183.239 ns ; 287.047 ns) | @
[287.047 ns ; 391.277 ns) | @@@@@@@@
[391.277 ns ; 501.821 ns) | @
---------------------------------------------------

Note: we might need to run more than 20 iterations to get the full distribution, and we can't exclude the outliers. Otherwise, for the distribution below:

-------------------- Histogram --------------------
[ 70.333 ns ; 151.954 ns) | @@
[151.954 ns ; 224.021 ns) | @@@@@@@@@@@@@@
[224.021 ns ; 262.438 ns) | @@@
[262.438 ns ; 334.505 ns) |
[334.505 ns ; 421.098 ns) |
[421.098 ns ; 493.165 ns) | @
---------------------------------------------------
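For anyone who wants to reproduce this locally, here is a minimal config sketch. It assumes a BenchmarkDotNet build that includes dotnet/BenchmarkDotNet#1587; the `WithMemoryRandomization` knob and the exact API shape below should be treated as assumptions, and the benchmark body is a placeholder. It also bumps the iteration count per the note above:

```csharp
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Jobs;

[Config(typeof(RandomizedConfig))]
public class MyBenchmarks
{
    private class RandomizedConfig : ManualConfig
    {
        public RandomizedConfig() => AddJob(
            Job.Default
               // assumption: the knob added by dotnet/BenchmarkDotNet#1587
               .WithMemoryRandomization(true)
               // run more than the usual 20 iterations to capture the full distribution
               .WithIterationCount(40));
    }

    [Benchmark]
    public int[] AllocateAndFill()
    {
        var array = new int[1024]; // placeholder benchmark body
        for (int i = 0; i < array.Length; i++) array[i] = i;
        return array;
    }
}
```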
The solution is not a silver bullet; there are some unstable benchmarks like System.Collections.CreateAddAndClear.Stack.

Before:

-------------------- Histogram --------------------
[1.846 us ; 1.960 us) | @@@@@@@@@@@@@@@
---------------------------------------------------

After:

-------------------- Histogram --------------------
[2.087 us ; 2.204 us) | @@@@@@@@@@@@@@@
---------------------------------------------------
While in the Reporting System we can see:
There are also benchmarks that were quite stable so far, and enabling this feature turns them into unstable ones. Example:

Before:

-------------------- Histogram --------------------
[136.962 ns ; 144.806 ns) | @@@@@@@@@@@@@@
---------------------------------------------------

After:

-------------------- Histogram --------------------
[137.882 ns ; 153.224 ns) | @@@@@@@@@@@@@@@@
[153.224 ns ; 160.511 ns) |
[160.511 ns ; 175.853 ns) | @@@
[175.853 ns ; 195.154 ns) |
[195.154 ns ; 210.496 ns) | @
---------------------------------------------------

Historical data shows that it's a stable benchmark. However, we can't be really sure if we consider the historical outliers:
For System.Memory.Span.IndexOfAnyThreeValues, which is quite unstable when looking at historical data, the randomization brings the desired effect.

Before:

-------------------- Histogram --------------------
[19.563 ns ; 19.919 ns) | @@@@@@@@@@@@@@
---------------------------------------------------

After:

-------------------- Histogram --------------------
[18.258 ns ; 20.703 ns) | @@@@@@@@@@@@@@@@
[20.703 ns ; 22.180 ns) | @
[22.180 ns ; 25.094 ns) |
[25.094 ns ; 28.005 ns) | @@@
---------------------------------------------------
The same goes for

Before:

-------------------- Histogram --------------------
[36.393 us ; 37.218 us) | @@@@@@@@@@@@@@@
--------------------------------------------------- After: -------------------- Histogram --------------------
[36.313 us ; 37.255 us) | @@@@@@@@@@@@@@@@@@
[37.255 us ; 38.198 us) |
[38.198 us ; 39.683 us) | @
[39.683 us ; 40.931 us) | @
---------------------------------------------------
@adamsitnik I'm following along as an observer, but I'm confused by your last two histograms above. Those seem to be more multimodal after your changes, not less?
@danmosemsft You are right, and this is a very good question. Some benchmarks seem to be stable when you run them just once. Example:

-------------------- Histogram --------------------
[19.563 ns ; 19.919 ns) | @@@@@@@@@@@@@@
---------------------------------------------------

But when you run them many times, they turn out to be unstable or multimodal. Very often this is caused by code or memory alignment changes. When it comes to memory, we recommend allocating whatever is needed for the benchmark run in a [GlobalSetup] method. Example: performance/src/benchmarks/micro/libraries/System.Collections/Contains/ContainsTrue.cs, lines 38 to 42 at ca80d8e.
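The linked snippet isn't inlined here; as a rough sketch of the pattern (made-up names and sizes, not the actual file contents), the point is that every allocation the benchmark depends on happens in [GlobalSetup] rather than in the benchmark method:

```csharp
using System.Collections.Generic;
using System.Linq;
using BenchmarkDotNet.Attributes;

public class ContainsTrueSketch
{
    private const int Size = 512;
    private int[] _values;
    private List<int> _collection;

    // all memory the benchmark needs is allocated here, so the engine decides
    // when (and how often) the allocation happens
    [GlobalSetup]
    public void Setup()
    {
        _values = Enumerable.Range(0, Size).ToArray();
        _collection = new List<int>(_values);
    }

    [Benchmark]
    public bool List() => _collection.Contains(_values[Size / 2]);
}
```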
With the "memory randomization", BDN calls The problem that we have encountered while working on the new bot is benchmarks that... tend to stay in one mode for a while (typically weeks) after they switch to 2nd mode for another few weeks. With this "randomization" I hope that we can switch from using averages in the historical data to "Median" and by doing that flatten the charts. So for example for this histogram: -------------------- Histogram --------------------
[18.258 ns ; 20.703 ns) | @@@@@@@@@@@@@@@@
[20.703 ns ; 22.180 ns) | @
[22.180 ns ; 25.094 ns) |
[25.094 ns ; 28.005 ns) | @@@
---------------------------------------------------

we would show a value from the first bucket.
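To make that concrete, here is a toy calculation with invented numbers shaped like the histogram above: the mean gets pulled toward the second mode, while the median stays inside the dominant first bucket.

```csharp
using System;
using System.Linq;

class MedianVsMean
{
    static void Main()
    {
        // invented samples: a dominant mode around 19-20 ns plus a 2nd mode around 25-28 ns
        double[] ns = { 19.1, 19.3, 19.4, 19.6, 19.8, 20.0, 20.2, 25.3, 26.1, 27.8 };

        double mean = ns.Average(); // ~21.7 ns, between the two modes
        double median = ns.OrderBy(x => x)
                          .Skip(ns.Length / 2 - 1).Take(2)
                          .Average(); // even-count median: average of the two middle values, ~19.9 ns

        Console.WriteLine($"mean = {mean:F1} ns, median = {median:F1} ns");
    }
}
```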
Got it - thanks for the explanation.
I've decided to create a new issue with a final report: #1602
The study is not finished yet. I am going to create this issue now and add a comment for every benchmark that has a significant difference, and afterward update the description of this issue and provide a single summary. There are simply too many benchmarks to put them all in one GH comment ;)
BDN change: dotnet/BenchmarkDotNet#1587
Since this change requires us to call [GlobalSetup] and [GlobalCleanup] more than once, I had to rework the [GlobalSetup] methods, split big setups into smaller targeted ones, and fix some bugs that occurred when invoking them more than once: Refactor initialization logic to allow for enabling Memory Randomization #1587 (see the sketch at the end of this description).

List of benchmarks dependent on memory alignment for which enabling the feature helps:
System.Collections.ContainsTrue<Int32>.Span
List of benchmarks where it does not help but changes the reported time:
Notes: we might need more than 20 iterations and we should not be removing upper outliers
Note: we should not enable it by default for all benchmarks
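As a hedged sketch of that refactoring pattern (hypothetical benchmark and field names; this is not code from the PR): one small, targeted setup per benchmark, each safe to invoke many times because it overwrites state instead of accumulating it.

```csharp
using BenchmarkDotNet.Attributes;

public class TargetedSetupSketch
{
    private byte[] _parseInput;
    private char[] _formatBuffer;

    // each benchmark gets its own small setup via Target, instead of one big
    // [GlobalSetup] that prepares everything for every benchmark
    [GlobalSetup(Target = nameof(Parse))]
    public void SetupParse() => _parseInput = new byte[16 * 1024];

    [GlobalSetup(Target = nameof(Format))]
    public void SetupFormat() => _formatBuffer = new char[256];

    // assigning fresh instances (rather than appending to shared state) keeps
    // the setups correct when BDN invokes them more than once
    [Benchmark]
    public int Parse() => _parseInput.Length;

    [Benchmark]
    public int Format() => _formatBuffer.Length;
}
```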