String.StartsWith slower on Linux with some characters #30716
It seems to affect characters outside of the [9], [11-12], [32-38], [40-44], and [47-126] ranges.
@krwq that's an extraordinary multiplier. I am curious what this is about.
Yes, this is interesting. @kevingosse what's your current culture set to? Looking at the code, it behaves differently depending on that. @tarekgh do you know if we (or ICU) have any special optimizations for some ranges?
I believe Linux treats '-' with some sort weights, which can affect the operation with different cultures. On Windows, '-' is treated as normal ASCII and we optimize the call as ordinal at that time.
We do have a special optimization for specific ranges. Look for IsFastSort in https://github.com/dotnet/coreclr/blob/52651008a5fb72eb678467cd2eb42aac4e5334e8/src/System.Private.CoreLib/shared/System/Globalization/CompareInfo.Unix.cs#L242 . This "optimization" is used on Unix only. We used to have it on Windows as well, but removed it there because it was not very helpful. One way to fix this problem is to remove this "optimization" on Unix too (PR: dotnet/coreclr#24400) and replace it with a fast path and a slow path with smaller degradation.
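To make the range discussion concrete, here is a minimal sketch of a check based only on the character ranges reported earlier in this thread as unaffected. It is not the actual IsFastSort code; `FastPathSketch`, `IsInReportedFastRange`, and `LooksFast` are hypothetical names used for illustration.

```csharp
using System;

static class FastPathSketch
{
    // Hypothetical helper: true if c falls in one of the ranges reported
    // in this thread as NOT triggering the slowdown:
    // [9], [11-12], [32-38], [40-44], [47-126].
    public static bool IsInReportedFastRange(char c) =>
        c == 9 ||
        (c >= 11 && c <= 12) ||
        (c >= 32 && c <= 38) ||
        (c >= 40 && c <= 44) ||
        (c >= 47 && c <= 126);

    // Hypothetical helper: true if every character of s is in a reported fast range.
    public static bool LooksFast(string s)
    {
        foreach (char c in s)
        {
            if (!IsInReportedFastRange(c))
            {
                return false;
            }
        }
        return true;
    }

    static void Main()
    {
        Console.WriteLine(LooksFast("abc")); // True
        Console.WriteLine(LooksFast("a-b")); // False: '-' is code 45, outside the ranges
        Console.WriteLine(LooksFast("a'b")); // False: '\'' is code 39, outside the ranges
    }
}
```

A string for which `LooksFast` returns false would, per the reports above, be expected to take the slower linguistic path on Linux.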
We need to be careful here, as some characters in the ASCII range are treated differently in ICU than on Windows. Trying to change that would start to deviate us from the ICU sorting behavior, which may not be a good thing to do.
Also, in such corner cases, it is usually better to use the ordinal operations and not the linguistic operations.
en-US. With other cultures the performance is bad with any set of characters, but this has been a known issue since 1.0 (https://github.com/dotnet/coreclr/issues/5612)
Yes, we're currently adding
Note that this issue happens only when using some specific characters in the ASCII range while the whole string is ASCII too. These characters are mainly control characters, which are not really used in mainstream cases. The only interesting characters are - and '. So we are talking about a corner case here and not really mainstream scenarios. You can look at the special characters here https://github.com/dotnet/coreclr/blob/ab2f9caaf35c96d029b96aa171ee65d04253cf7c/src/utilcode/util_nodependencies.cpp#L856
@tarekgh It’s not truly helpful to simply label this as a corner case. People expect consistent performance, and if you have a method that is suddenly orders of magnitude slower because certain characters are used, it’s like dropping time bombs into people’s code.
How is this different from inserting a non-ASCII character in the string, even on Windows? You will get the same performance hit. Again, this is why we provide the Ordinal option to let users control the behavior. The issue here is correctness of the sort behavior versus perf. We can fix the perf by not specializing the hyphen on Linux, but we should be clear that this can certainly be a problem for some expected sort behavior. We can try to find a middle ground here which may work, but we need to detect the cases where we can allow ordinal even when a hyphen exists. This just needs some more investigation.
Looking more, it looks like on Windows we don't even try to optimize for ASCII cases and always call the underlying OS, so what @jkotas mentioned before looks reasonable to try.
On a related note, we are running dotnet/performance benchmarks on CoreCLR and Mono/netcore with LLVM JIT. I was investigating the spots where Mono performed especially poorly. One of them is the The difference in the benchmark between Mono (which does not have this code path) and CoreCLR (which has it) was quite enormous:
However, the numbers are actually lying because the IsFastSort/IsAscii optimization is computed during the benchmark initialization and not accounted for in the benchmark run. To accurately measure the impact we would likely have to get rid of the cache in the object header first and see what happens. Another question is whether people run
Unless I misunderstand, @jkotas's suggestion assumes that the perf disparity comes from the I've done a few more benchmarks to get more data. I'm measuring the cost of
We see that On the other hand, we see how the performance degrades with the length of the string. We have an initial 4µs cost whenever we take the slow path, no matter the length of the string. Then we see the cost increasing linearly (3µs for 80, 6µs for 160, 30µs for 800). So we really have two issues:
I am sure there are some edge cases I didn't consider but it should be relatively easy to implement
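I've profiled the following app with PerfCollect:
using System.Runtime.CompilerServices;

class Program
{
    static void Main()
    {
        // Endless loop so the profiler can sample the hot path.
        while (true)
        {
            // 512 'a' characters followed by a dash, checked against a non-matching prefix.
            Consume(string.Concat(new string('a', 512), "-").StartsWith("i"));
        }
    }

    // Not inlined, so the StartsWith result is passed to a real call and stays observable.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void Consume<T>(in T _) { }
}
The most expensive methods are: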
So
@tarekgh is there any reason why we should not implement
bool StartsWith(string source, string prefix, StringComparison stringComparison)
    => CompareString(source.AsSpan(start: 0, length: prefix.Length), prefix, stringComparison);
Edit: never mind, I've got an answer from @kevingosse in dotnet/coreclr#26481 ;)
@kevingosse is right, this will not work except for ordinal cases.
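A minimal sketch of why slicing the source to prefix.Length is only safe for ordinal comparisons: under linguistic rules, a prefix can match a run of source characters of a different length. The strings are illustrative assumptions (a precomposed é versus e followed by a combining accent), the class and method names are hypothetical, and the linguistic results depend on the globalization data available on the machine, so no output is claimed for them.

```csharp
using System;

static class PrefixLengthSketch
{
    // Full linguistic prefix check, letting the runtime decide how many
    // source characters the prefix corresponds to.
    public static bool LinguisticStartsWith(string source, string prefix) =>
        source.StartsWith(prefix, StringComparison.InvariantCulture);

    // The proposed shortcut: compare only the first prefix.Length characters.
    public static bool SlicedEquals(string source, string prefix) =>
        source.Length >= prefix.Length &&
        string.Equals(source.Substring(0, prefix.Length), prefix,
                      StringComparison.InvariantCulture);

    static void Main()
    {
        string source = "e\u0301abc"; // 'e' + combining acute accent (2 chars), then "abc"
        string prefix = "\u00E9";     // precomposed 'é' (1 char)

        // With full ICU/NLS data these two are expected to disagree, because
        // the 1-char prefix corresponds to 2 chars of the source.
        Console.WriteLine(LinguisticStartsWith(source, prefix));
        Console.WriteLine(SlicedEquals(source, prefix));

        // Ordinal compares UTF-16 code units, so slicing by length is safe there.
        Console.WriteLine(source.StartsWith(prefix, StringComparison.Ordinal)); // False
    }
}
```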
@kevingosse thanks for your measurements. I believe @adamsitnik's PR is going to help some with the StartsWith scenario.
dotnet/coreclr#26759 and dotnet/coreclr#26621 combined have fixed this problem. Fun fact: while working on improving the performance of StartsWith on Linux, we found and fixed an 18-year-old bug in ICU unicode-org/icu#840 ;)
string.StartsWith on Linux becomes 2 orders of magnitude slower when the string contains a dash (-).
On Linux:
On Windows (only for reference, the hardware is not the same):
Benchmark code:
The performance issue does not occur if using ordinal comparison.
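For reference, a minimal sketch of the ordinal form, reusing the test string from the profiling app in this thread. `OrdinalSketch` and `StartsWithOrdinal` are hypothetical names; `StringComparison.Ordinal` is the standard .NET option. Ordinal compares UTF-16 code units and skips culture-specific rules, so it is only a substitute when linguistic matching is not needed.

```csharp
using System;

static class OrdinalSketch
{
    // Hypothetical wrapper: prefix check that bypasses culture-sensitive comparison.
    public static bool StartsWithOrdinal(string source, string prefix) =>
        source.StartsWith(prefix, StringComparison.Ordinal);

    static void Main()
    {
        // 512 'a' characters followed by a dash, as in the profiled app.
        string source = string.Concat(new string('a', 512), "-");

        Console.WriteLine(StartsWithOrdinal(source, "i"));  // False
        Console.WriteLine(StartsWithOrdinal(source, "aa")); // True
    }
}
```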