StringComparer.CurrentCultureIgnoreCase does not ignore case when LANG=C.UTF-8 #27376
Comments
The collation behavior of the "C" and "POSIX" locales is case sensitive. We encourage not using these locales if you are performing any sorting operations. See https://github.com/dotnet/corefx/issues/28611 for more info. See also the comment in the code at https://github.com/dotnet/coreclr/blob/master/src/corefx/System.Globalization.Native/locale.cpp#L128, which is written for POSIX but applies to C too. https://www.postgresql.org/docs/10/static/collation.html says the same thing.
This still seems wrong, @tarekgh. I can see that LANG could affect the default comparison, but I would expect functions that are explicitly asked to "IgnoreCase" to do a case-insensitive comparison.
@wfurt did you read the comment https://github.com/dotnet/corefx/issues/32250#issuecomment-420749205? It has the explanation. If you don't like the behavior, switch away from POSIX and C.
Yes, I did read through it, and I think the argument is not correct.
https://wiki.musl-libc.org/functional-differences-from-glibc.html
@tarekgh what are your thoughts on @wfurt's comment above? I'm guessing most .NET code out there relies on case-insensitive comparison somewhere, so it seems quite odd if WSL is really configured in a way that makes that impossible. Should they set a different LANG? (Or maybe this is different in the new WSL2.)
I see @tarekgh already updated the docs, which is good 😄 dotnet/docs#8179. Just curious about @wfurt's point.
To summarize/clarify: when the culture is set to "C", ICU maps it to en_US_POSIX, and the POSIX locale does not support case-insensitive comparisons. We just follow ICU, so .NET Core's behavior matches ICU 100%, and ICU is the main globalization component on Linux. In short, we are not defining how the "C" locale works; we do whatever ICU does. If we need to do something here, we could change the mapping of "C" to something other than the POSIX locale. That would be something of a breaking change, since we have no idea whether people already depend on the current behavior. The POSIX locale behaves this way because people wanted to compare strings as binary (regardless of whether the case-sensitive option is used or not). Let me know what you think.
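For anyone who needs case-insensitive matching that does not depend on LANG, here is a minimal sketch of the difference between the culture-sensitive comparer and the ordinal one. It is not the repro from the issue, and the `CaseCheck` class and its method names are hypothetical, made up for illustration. The only assumption about behavior is the documented one: `StringComparer.OrdinalIgnoreCase` does not consult the current culture's collation, while `StringComparer.CurrentCultureIgnoreCase` does, so its result under LANG=C.UTF-8 is whatever ICU's mapped locale gives.

```csharp
using System;
using System.Globalization;

// Hypothetical helper class, for illustration only.
public static class CaseCheck
{
    // Culture-independent: ordinal comparison that ignores case.
    // Does not use the collation rules of the current culture.
    public static bool EqualsIgnoringCaseOrdinal(string left, string right) =>
        StringComparer.OrdinalIgnoreCase.Equals(left, right);

    // Culture-sensitive: the result depends on the collation rules of
    // CultureInfo.CurrentCulture, which on Linux come from ICU.
    public static bool EqualsIgnoringCaseCurrentCulture(string left, string right) =>
        StringComparer.CurrentCultureIgnoreCase.Equals(left, right);
}

public static class Program
{
    public static void Main()
    {
        // Print the culture the process picked up from the environment.
        Console.WriteLine($"CurrentCulture: '{CultureInfo.CurrentCulture.Name}'");

        // Run once normally and once with LANG=C.UTF-8 to compare the two lines.
        Console.WriteLine($"CurrentCultureIgnoreCase: {CaseCheck.EqualsIgnoringCaseCurrentCulture("readme", "README")}");
        Console.WriteLine($"OrdinalIgnoreCase:        {CaseCheck.EqualsIgnoringCaseOrdinal("readme", "README")}");
    }
}
```

If the strings are identifiers, file names, or protocol tokens rather than user-facing text, the ordinal comparer is usually the better fit anyway, since it sidesteps locale configuration entirely.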
I see. Do regular distros/shells typically use a different value for LANG? I wonder why WSL does this; it seems like a large gotcha.
I can follow up with the WSL team about that. But in the Linux world, think of "C" as the invariant culture.
Alpine is also like this. There really is no locale other than C. (see #962)
I am happy to close this and defer to the experts. It would be great to circle back with the WSL team's response. Thanks guys.
See also: PowerShell/PowerShell#7761

For a small repro:

dotnet new console
dotnet run

and get output:

LANG=C.UTF-8 dotnet run

and get output:

Expected Behaviour (with LANG=C.UTF-8):

Actual Behaviour (with LANG=C.UTF-8):

Runtime information:

Given that C.UTF-8 is a Debian locale, this is probably also the case on other Debian-based Linux distros.