Culture sensitive comparison performance on Linux #26054
Comments
Trying to find our perf tests for these.. |
The tests are under Tests.System.Tests... @sebastienros do you have evidence that this ratio is any different to 2.0? From the benchmarks we have, I do see that string comparison on Linux is significantly slower than Windows, but much faster than 2.0, and the ratio has improved. [edit: that's comparing Linux with Windows for the same comparer, which was not what you flagged. Nevertheless, the question remains has this changed since 2.0?] |
I could find some culture tests on benchview, but none for string comparison.
So you are right that the gap is smaller on 2.1 than on 2.0, but not for a good reason. |
@sebastienros the link above shows CoreFX perf results. Of course the tests may be poor (they certainly run too few iterations) but they show improvements over 2.0 for Linux. Can you repeat your benchmark, without ASP.NET in the picture -- just a console app? |
Ideally with BenchmarkDotNet |
With BenchmarkDotNet we can also see regressions on Linux 2.0
Linux 2.1
Windows 2.0
Windows 2.1
|
Can you please share the benchmark? |
https://github.com/sebastienros/stringbenchmarks branch |
Snap, I pushed a change to remove the Sort just before your comment. One commit before in the 'console' branch then. |
@sebastienros, thanks for sharing. First, I'm surprised by some of the absolute values your benchmark shows. That's saying it took almost 5 microseconds to do that comparison on Linux? That must be a very slow machine, or something else is going on, as it's at least an order of magnitude more than I'd expect. I just tried your benchmarks:
    [Benchmark]
    public int CompareTo() => Fortune1.CompareTo(Fortune2);
    [Benchmark]
    public int CompareOrdinal() => String.CompareOrdinal(Fortune1, Fortune2);
    private const string Fortune1 = "fortune: No such file or directory";
    private const string Fortune2 = "A computer scientist is someone who fixes things that aren''t broken.";
by plugging them into the harness I described in https://blogs.msdn.microsoft.com/dotnet/2018/04/18/performance-improvements-in-net-core-2-1/, and I get these results on my Ubuntu 16.04 VM:
and this on my Windows 10 machine:
so significantly smaller numbers in magnitude than what you got.
Second, extrapolating from a single micro-benchmark can be misleading. The particular micro-benchmark you've chosen has the strings entirely different, which means it's really just testing the overhead involved in setting up the comparison, that'll end up failing on the very first character examined. For culture-based comparisons, there is a tiny bit more overhead there in 2.1 due to spans being used internally, converting from strings to spans, etc. Once the comparison gets going, though, 2.1's implementation is better, e.g. try making the beginning of your two strings equal, and you should see 2.1 outshine 2.0. For example, I just changed the above to the following that has some differences in the middle of the short strings being compared:
    private const string Fortune1 = "fortune: No such file or directory";
    private const string Fortune2 = "fortune: No such file is directory";
On Linux I got:
and on Windows:
showing 2.1 beating out 2.0. Then I further changed it to be:
    private const string Fortune1 = "A computer scientist is someone who fixes things that aren''t broken!";
    private const string Fortune2 = "A computer scientist is someone who fixes things that aren''t broken.";
so that the strings differ only by the last character and are slightly longer; on Linux I got:
and on Windows:
showing 2.1 being significantly better on these inputs than 2.0. |
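To make the point about input choice concrete, here is a minimal sketch (not part of either benchmark in this thread) that reports where each of the three string pairs first differs. The helper `IndexOfFirstDifference` and the class name are hypothetical names introduced for illustration; the index is only a rough proxy for how much work a comparison has to do before it can return, since culture-sensitive comparison does not necessarily proceed character by character.

```csharp
using System;

public static class FirstDifference
{
    // Hypothetical helper: index of the first UTF-16 code unit at which
    // the two strings differ, or -1 if the strings are identical.
    public static int IndexOfFirstDifference(string a, string b)
    {
        int common = Math.Min(a.Length, b.Length);
        for (int i = 0; i < common; i++)
        {
            if (a[i] != b[i]) return i;
        }
        // One string is a prefix of the other, or they are identical.
        return a.Length == b.Length ? -1 : common;
    }

    public static void Main()
    {
        // The three input pairs discussed above.
        var pairs = new (string Left, string Right)[]
        {
            ("fortune: No such file or directory",
             "A computer scientist is someone who fixes things that aren''t broken."),
            ("fortune: No such file or directory",
             "fortune: No such file is directory"),
            ("A computer scientist is someone who fixes things that aren''t broken!",
             "A computer scientist is someone who fixes things that aren''t broken."),
        };

        foreach (var (left, right) in pairs)
        {
            Console.WriteLine(
                $"first difference at index {IndexOfFirstDifference(left, right)} " +
                $"(lengths {left.Length} and {right.Length})");
        }
    }
}
```

The first pair differs at index 0, so a benchmark built on it mostly measures setup cost; the other two pairs force the comparison to walk part or nearly all of the string.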
As I mentioned in my previous comment, the results I pasted are not from the HEAD commit of the repository but from the commit before it. I then ran the same tests as you, but with a different outcome. That's problematic, but I can run it on a different set of machines to get numbers we can be confident in.
Compare
    private const string Fortune1 = "fortune: No such file or directory";
    private const string Fortune2 = "A computer scientist is someone who fixes things that aren''t broken.";
CompareSame
    private const string Fortune1 = "fortune: No such file or directory";
    private const string Fortune2 = "fortune: No such file is directory";
CompareSimilar
    private const string Fortune1 = "A computer scientist is someone who fixes things that aren''t broken!";
    private const string Fortune2 = "A computer scientist is someone who fixes things that aren''t broken.";
Linux 2.0
Linux 2.1
Windows 2.0
Windows 2.1
Note that I am running Linux and Windows on two identical physical machines, without a VM (docker in the case of Linux) so the comparisons between Linux and Windows are fair. Sample BenchmarkDotNet framework summary, to show the framework versions are correct:
and
I will update this thread with results from Azure VMs to exclude any environment specificity. |
More data: Linux - Azure - 2.0
Linux - Azure - 2.1
Linux - "Citrine (same hardware as TechEmpower)" - 2.0
Linux - "Citrine (same hardware as TechEmpower)" - 2.1
Interestingly, the Citrine machines, which are much more powerful, have the same results as the Azure VMs we are using. Note that the Citrine servers don't have Page Table Isolation disabled (Meltdown security vulnerability). |
The CompareToSame lines appear to be highlighting something else. Note that what you wrote above for the "CompareSame" case is not actually what your benchmark is testing: for "CompareSame" you copied what I had, which was almost the same text but swapping the word "or" for "is" so that there was a difference in the middle. But your benchmark CompareToSame is actually doing what its name says and is comparing not only identical strings, but identical references. As such, it should be hitting the same fast path in both 2.0: |
Correct, I didn't see this difference, I will add it. I won't add the results here unless you want me to; I think I have already drowned this thread in too much data. |
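The distinction raised above between identical references and merely equal contents can be sketched as follows. This is an illustration only: `DistinctCopy` and the class name are hypothetical, and the snippet does not show the runtime's own comparison code. It only shows how a benchmark can end up passing the very same object on both sides, and one way to build an equal string with a different identity so that a comparison cannot be answered by a reference check.

```csharp
using System;

public static class ReferenceVersusContent
{
    private const string Fortune1 = "fortune: No such file or directory";

    // Hypothetical helper: same characters, but a freshly allocated string object.
    public static string DistinctCopy(string s) => new string(s.ToCharArray());

    public static void Main()
    {
        string same = Fortune1;               // same object as the constant
        string copy = DistinctCopy(Fortune1); // equal contents, different object

        Console.WriteLine(object.ReferenceEquals(Fortune1, same)); // True
        Console.WriteLine(object.ReferenceEquals(Fortune1, copy)); // False
        Console.WriteLine(Fortune1.CompareTo(copy) == 0);          // True
    }
}
```

A benchmark meant to measure comparison of equal strings would use something like `copy` as its second operand rather than the same constant.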
@adamsitnik has just merged a fix which likely fixes this one too. Could you try it with your scenario and see whether there is any improvement? |
Ok, I have forked the example provided by @sebastienros in https://github.com/dotnet/corefx/issues/37691 and extended it with the benchmarks provided here (I could not use https://github.com/sebastienros/stringbenchmarks/blob/master/Startup.cs because it targets 2.0):
sample command:
    dotnet run -- --server http://asp-perf-lin:5001 --client http://asp-perf-load:5002 --repository https://github.com/adamsitnik/invariantcultureperf --project-file InvariantCulture.csproj --path /api/values/CompareOrdinal --warmup 1 --duration 5 --runtime 3.0.0-*
The results are RPS for asp-perf-lin and asp-perf-win machines:
The results are RPS for the Citrix machines:
I am going to take a look at the traces from Citrix machines |
Citrine (not citrix) |
I have run the StringComparer benchmarks from the performance repo using latest CoreCLR bits with my 3 fixes. (https://github.com/dotnet/performance/blob/master/src/benchmarks/micro/corefx/System.Runtime/Perf.StringComparer.cs) OS=Windows 10.0.17763.107 (1809/October2018Update/Redstone5) Intel Xeon CPU E5-1650 v3 3.50GHz, 1 CPU, 12 logical and 6 physical cores
Linux is on par for I am going to do some research and remove the gap for |
The PR #40910 is addressing the ordinal cases. |
After noticing that the choice of string comparison algorithm had a very significant impact while sorting a list of business objects, I decided to run a benchmark to analyze the differences between Linux and Windows.
The code is here: https://github.com/sebastienros/stringbenchmarks/blob/master/Startup.cs
Result:
CompareTo is expected to be slower than CompareOrdinal and I am not questioning that, but on Linux the ratio is 46% while on Windows it's 86%. This could have a significant impact on ASP.NET, which uses it extensively. In the TechEmpower Fortunes scenario, on our 12-core machine we noticed a performance improvement by a factor of 3 when sorting the results using ordinal comparison (70K RPS to 216K RPS), so the impact seems to be even bigger than these micro-benchmark differences.
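For readers unfamiliar with the two sort variants being contrasted, here is a minimal sketch. The `Fortune` type and the `FortuneSorting` helpers are hypothetical stand-ins, not the actual ASP.NET benchmark code; the snippet only shows the difference between a culture-sensitive sort and an ordinal one. Note that the two can produce different orderings for some inputs, so switching is not purely a performance decision.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical business object, loosely modelled on the Fortunes scenario.
public sealed class Fortune
{
    public int Id { get; }
    public string Message { get; }

    public Fortune(int id, string message)
    {
        Id = id;
        Message = message;
    }
}

public static class FortuneSorting
{
    // Culture-sensitive ordering, equivalent in spirit to Message.CompareTo(...).
    public static void SortCultureSensitive(List<Fortune> fortunes) =>
        fortunes.Sort((x, y) =>
            string.Compare(x.Message, y.Message, StringComparison.CurrentCulture));

    // Ordinal ordering: compares UTF-16 code units, no culture rules involved.
    public static void SortOrdinal(List<Fortune> fortunes) =>
        fortunes.Sort((x, y) => string.CompareOrdinal(x.Message, y.Message));

    public static void Main()
    {
        var fortunes = new List<Fortune>
        {
            new Fortune(1, "fortune: No such file or directory"),
            new Fortune(2, "A computer scientist is someone who fixes things that aren''t broken."),
        };

        SortOrdinal(fortunes);
        foreach (var fortune in fortunes)
        {
            Console.WriteLine($"{fortune.Id}: {fortune.Message}");
        }
    }
}
```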