The result of full-width string comparison with InvariantCultureIgnoreCase has changed in .NET 5 #44789

Nukepayload2 · 2020-11-17T09:54:31Z

Description

The result of String.Equals("ＡＥ", "ａｅ", StringComparison.InvariantCultureIgnoreCase) has changed to False in .NET 5.

Steps:

Create a new .NET Core 3.1 or .NET Framework 4.8 VB console project.
Run the following code. The output is True.

Module Program
    Sub Main()
        Console.WriteLine(String.Equals("ＡＥ", "ａｅ", StringComparison.InvariantCultureIgnoreCase))
    End Sub
End Module

Set the target framework to .NET 5.0 and run the project again.

Expected behavior
The output is True

Actual behavior
The output is False

Configuration

.NET SDK (reflecting any global.json):
Version: 5.0.100
Commit: 5044b93829

Runtime Environment:
OS Name: Windows
OS Version: 10.0.19041
OS Platform: Windows
RID: win10-x64
Base Path: C:\Program Files\dotnet\sdk\5.0.100\

Host (useful for support):
Version: 5.0.0
Commit: cf258a1

Regression?

Yes. This problem doesn't exist in .NET Framework 4.8 or .NET Core 3.1.

Other information

Probably caused by the switch to ICU, but this documentation doesn't mention any change in string comparison behavior for full-width strings.

We switched back to NLS because of this problem.

<ItemGroup>
  <RuntimeHostConfigurationOption Include="System.Globalization.UseNls" Value="true" />
</ItemGroup>
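
For readers who prefer not to edit the project file, the same switch is also documented as a runtime configuration property. A minimal sketch, assuming the setting is supplied through the app's runtimeconfig.json (the file placement is an assumption here; only the property name comes from the project snippet above):

```json
{
  "runtimeOptions": {
    "configProperties": {
      "System.Globalization.UseNls": true
    }
  }
}
```

Either form should have the same effect as the RuntimeHostConfigurationOption item, since the build writes that item into the generated runtimeconfig.json.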

ghost · 2020-11-17T09:54:35Z

Tagging subscribers to this area: @tarekgh, @safern, @krwq
See info in area-owners.md if you want to be subscribed.

tarekgh · 2020-11-18T00:08:09Z

@Nukepayload2 thanks for reporting the issue. I think your analysis is correct that this happens because of the switch to ICU, but the issue is not ICU itself; it is how we internally handle this full-width range. We'll work on fixing this issue.

tarekgh · 2020-11-20T00:21:21Z

@Nukepayload2 just to mention, a workaround for this issue is to do something like:

string.Compare("ＡＥ", "ａｅ", CultureInfo.InvariantCulture, CompareOptions.IgnoreWidth | CompareOptions.IgnoreCase) == 0

This should give you the desired behavior, and you don't have to switch back to NLS.

Nukepayload2 · 2020-11-23T06:02:15Z

> @Nukepayload2 just to mention, a workaround for this issue is to do something like:
> string.Compare("ＡＥ", "ａｅ", CultureInfo.InvariantCulture, CompareOptions.IgnoreWidth | CompareOptions.IgnoreCase) == 0
> This should give you the desired behavior, and you don't have to switch back to NLS.

@tarekgh
Thanks for providing the workaround. However, we still need to switch back to NLS, because the comparison result of Japanese Hiragana and Katakana strings has also changed.

Console.WriteLine(String.Compare("まりお", "マリオ", StringComparison.InvariantCultureIgnoreCase))

.NET Framework output: 1
.NET 5 output: -1

tarekgh · 2020-11-23T19:15:40Z

@Nukepayload2 thanks again for the feedback.

> However, we still need to switch back to NLS, because the comparison result of Japanese Hiragana and Katakana strings has also changed.

Could you please elaborate on your scenario? I mean, why do the new results of comparing まりお and マリオ break you? I understand it will change the sort order, but why does this break you?

Nukepayload2 · 2020-11-24T06:06:15Z

@tarekgh

> I understand it will change the sort order, but why does this break you?

Because our Japanese end users expect the same order as Excel when sorting Japanese strings.

ewfian · 2020-11-24T11:48:08Z

@tarekgh

> > I understand it will change the sort order, but why does this break you?
>
> Because our Japanese end users expect the same order as Excel when sorting Japanese strings.

FYI.

Refs:

tarekgh · 2020-11-24T16:45:48Z

> Because our Japanese end users expect the same order as Excel when sorting Japanese strings.

@Nukepayload2 do you expect this order with all cultures, or with Japanese cultures only?

@ewfian thanks for the references. The second one is from Unicode, which is what ICU implements, and I guess that is the behavior the complaint is about, no? Could you elaborate on what you are trying to point out with these references?

ewfian · 2020-12-01T03:18:05Z

@tarekgh The new behavior based on ICU is what I expected. I just wanted to provide some references about Japanese sort order.

tarekgh · 2021-01-04T19:44:03Z

Closing this one per PR #45079.

Dotnet-GitSync-Bot added the area-System.Globalization and untriaged labels Nov 17, 2020

tarekgh removed the untriaged label Nov 17, 2020

tarekgh added the bug label Nov 18, 2020

tarekgh added this to the 6.0.0 milestone Nov 18, 2020

tarekgh mentioned this issue Nov 22, 2020 in Fix Full Width Chars Casing #45079 (merged)

tarekgh closed this as completed Jan 4, 2021

ghost locked as resolved and limited conversation to collaborators Feb 3, 2021
