Redesign benchmarks for culture-specific string operations #892

Merged on Sep 23, 2019 (7 commits)

adamsitnik (Member)

The culture-specific string benchmarks we had so far used very small, simple inputs and tested very few cultures and CompareOptions. This is why we missed issues like https://github.com/dotnet/corefx/issues/40674

This is my proposal for fixing it.

Instead of using made-up text, we now use part of the book "Alice's Adventures in Wonderland". It contains mostly simple ASCII characters, but also some "high" chars that get special treatment and hit the slow path.

The tested matrix is now:

(new CultureInfo("en-US"), CompareOptions.Ordinal)
(new CultureInfo("en-US"), CompareOptions.OrdinalIgnoreCase)
(new CultureInfo("en-US"), CompareOptions.None)
(new CultureInfo("en-US"), CompareOptions.IgnoreCase)
(new CultureInfo("en-US"), CompareOptions.IgnoreSymbols)
(CultureInfo.InvariantCulture, CompareOptions.None)
(CultureInfo.InvariantCulture, CompareOptions.IgnoreCase)
(new CultureInfo("pl-PL"), CompareOptions.None) // as an example of complex language hitting the slow path on Unix
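
As a rough illustration of what exercising this matrix might look like, here is a minimal sketch that runs `CompareInfo.Compare` over every (culture, options) pair listed above. The class name `CompareMatrixSketch` and the helper `CompareAll` are hypothetical and are not taken from the PR; the actual benchmarks presumably feed the pairs to the benchmark harness instead of looping by hand.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper, not part of the PR: enumerates the matrix and compares two strings.
public static class CompareMatrixSketch
{
    // The (culture, options) pairs from the list above.
    public static IEnumerable<(CultureInfo Culture, CompareOptions Options)> Matrix()
    {
        yield return (new CultureInfo("en-US"), CompareOptions.Ordinal);
        yield return (new CultureInfo("en-US"), CompareOptions.OrdinalIgnoreCase);
        yield return (new CultureInfo("en-US"), CompareOptions.None);
        yield return (new CultureInfo("en-US"), CompareOptions.IgnoreCase);
        yield return (new CultureInfo("en-US"), CompareOptions.IgnoreSymbols);
        yield return (CultureInfo.InvariantCulture, CompareOptions.None);
        yield return (CultureInfo.InvariantCulture, CompareOptions.IgnoreCase);
        yield return (new CultureInfo("pl-PL"), CompareOptions.None);
    }

    // Returns the sign (-1, 0, 1) of the comparison for each matrix entry, in order.
    public static List<int> CompareAll(string left, string right)
    {
        var results = new List<int>();
        foreach (var (culture, options) in Matrix())
        {
            results.Add(Math.Sign(culture.CompareInfo.Compare(left, right, options)));
        }
        return results;
    }
}
```

Keeping the pairs in one enumerable means that adding a culture or a `CompareOptions` value is a one-line change that every comparison picks up.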

I've removed the old benchmarks that would now be duplicated.

Moreover, I've realized that Perf_CompareInfo had a serious bug: the generated strings were always identical because the source argument was never used.

private static string GenerateInputString(char source, int count, char replaceChar, int replacePos)
{
    char[] str = new char[count];
    for (int i = 0; i < count; i++)
    {
        // Bug: fills with replaceChar, so `source` is never used.
        str[i] = replaceChar;
    }
    // No-op in practice: this position already holds replaceChar.
    str[replacePos] = replaceChar;
    return new string(str);
}
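
For comparison, here is a minimal sketch of what the helper was presumably meant to do. It assumes the intent was a string of `count` copies of `source` with a single `replaceChar` at `replacePos`. That intent is inferred from the parameter names, not confirmed by the PR, and the wrapper class `InputStringSketch` is hypothetical.

```csharp
using System;

// Hypothetical wrapper class; only the method body differs from the snippet above.
public static class InputStringSketch
{
    // Assumed intent: `count` copies of `source`, with `replaceChar` at index `replacePos`.
    public static string GenerateInputString(char source, int count, char replaceChar, int replacePos)
    {
        char[] str = new char[count];
        for (int i = 0; i < count; i++)
        {
            str[i] = source; // fill with the base character
        }
        str[replacePos] = replaceChar; // introduce exactly one differing character
        return new string(str);
    }
}
```

With this version, two calls that differ only in `replaceChar` or `replacePos` produce different strings, so a comparison benchmark has an actual mismatch to find.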

Fixes #885

@tarekgh (Member) left a comment

Modulo @jkotas comments, LGTM

@adamsitnik merged commit cc73a01 into dotnet:master on Sep 23, 2019
@adamsitnik deleted the stringBenchmarksRedesign branch on September 23, 2019 at 15:30
Linked issue that merging this pull request may close: Implement more realistic string benchmarks
4 participants