Test failure System.Text.RegularExpressions.Tests.RegexCultureTests.TestIgnoreCaseRelationBorderCasesInNonBacktracking(options: NonBacktracking) #60753
Comments
Tagging subscribers to this area: @eerhardt, @dotnet/area-system-text-regularexpressions
Issue Details
Run: runtime-libraries-coreclr outerloop 20211021.5
Failed test:
One of the error messages:
or
|
I believe the issue here is different data coming from either NLS or older ICUs. I can repro the same failures by setting the environment variable:

@veanes, @olsaarik, I think this highlights we'll need to revisit the pre-generation of the relation data (for reasons beyond the size ones we've already discussed). The data can vary based on the machine on which pregeneration was performed, but we ship a single S.T.RegularExpressions.dll binary that's used regardless of OS version and regardless of what globalization library is in use for a given process. For both size and correctness/consistency then, we'll likely need to remove the pregenerated data and come up with a scheme for computing it lazily at run-time.

cc: @danmoseley |
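To make the "compute it lazily at run-time" idea concrete, here is a minimal sketch. It is not the scheme the maintainers adopted: `LazyCaseEquivalence` and its members are hypothetical names, and the equivalence relation is deliberately simplified to "characters that share a lowercase form under `char.ToLower`". The point is only that the table is built from whatever NLS or ICU data the current process sees, on first request, rather than being baked into the shipped binary.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Globalization;

// Usage: the table for a culture is built the first time it is requested.
char[] k = LazyCaseEquivalence.GetEquivalents('k', CultureInfo.InvariantCulture);
Console.WriteLine(string.Join(", ", Array.ConvertAll(k, c => $"U+{(int)c:X4}")));

// Hypothetical helper; not part of System.Text.RegularExpressions.
public static class LazyCaseEquivalence
{
    // One lazily built table per culture name, computed from the casing data
    // available in this process (NLS or whichever ICU version is loaded).
    private static readonly ConcurrentDictionary<string, Lazy<Dictionary<char, char[]>>> s_tables =
        new ConcurrentDictionary<string, Lazy<Dictionary<char, char[]>>>();

    public static char[] GetEquivalents(char c, CultureInfo culture)
    {
        Dictionary<char, char[]> table = s_tables.GetOrAdd(
            culture.Name,
            _ => new Lazy<Dictionary<char, char[]>>(() => Build(culture))).Value;

        // Characters with no case partner are equivalent only to themselves.
        return table.TryGetValue(c, out var equivalents) ? equivalents : new[] { c };
    }

    private static Dictionary<char, char[]> Build(CultureInfo culture)
    {
        // Simplifying assumption: two chars are equivalent if they lowercase to the same char.
        var groups = new Dictionary<char, List<char>>();
        for (int i = 0; i <= char.MaxValue; i++)
        {
            char ch = (char)i;
            char lower = char.ToLower(ch, culture);
            if (!groups.TryGetValue(lower, out var members))
            {
                groups[lower] = members = new List<char>();
            }
            members.Add(ch);
        }

        // Keep only groups with more than one member to keep the table small.
        var result = new Dictionary<char, char[]>();
        foreach (List<char> group in groups.Values)
        {
            if (group.Count < 2) continue;
            char[] array = group.ToArray();
            foreach (char member in array) result[member] = array;
        }
        return result;
    }
}
```

The one-time scan over all 65,536 UTF-16 code units is the cost of this approach; a real implementation would presumably want something cheaper or more incremental.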
Same for the source generator? Basically our position on culture is always: we respect the thread's culture at the time the regex object is first used. |
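A small probe can show why the ambient culture matters here. This is only a sketch: `CultureProbe` is a hypothetical helper, it constructs and uses the regex under the same culture so it does not distinguish construction time from first use, and the results depend on the OS and globalization library in play, so no expected output is given. It also assumes the named cultures are available (they may not be in invariant-globalization mode).

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Run the same IgnoreCase pattern under two ambient cultures and compare.
foreach (string name in new[] { "en-US", "tr-TR" })
{
    bool matched = CultureProbe.MatchesUnder(name, "i", "I");
    Console.WriteLine($"{name}: pattern 'i' matches input 'I' with IgnoreCase = {matched}");
}

// Hypothetical helper for illustration only.
public static class CultureProbe
{
    public static bool MatchesUnder(string cultureName, string pattern, string input)
    {
        CultureInfo saved = CultureInfo.CurrentCulture;
        try
        {
            // Both construction and matching happen under the requested culture.
            CultureInfo.CurrentCulture = CultureInfo.GetCultureInfo(cultureName);
            var regex = new Regex(pattern, RegexOptions.IgnoreCase);
            return regex.IsMatch(input);
        }
        finally
        {
            // Always restore the caller's culture.
            CultureInfo.CurrentCulture = saved;
        }
    }
}
```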
It's a question of where we are baking the assumptions in. I think it's reasonable to bake them into the user assembly being built. I think it's unreasonable to bake them into S.T.RE.dll. |
The generated tables really depend on the definitions/semantics of the Unicode standard categories and, in this case, the IgnoreCase rules for 'i'. I was under the impression that these are fixed for a given system version. If there are differences in the handling of Turkish 'i', the 'i' part can also be tested and generated locally for a given runtime -- I'm sure that's what this is about. |
Also, if it is ever possible that the Unicode Categories actually change during the lifetime of one .NET version (thus actually affecting the semantics of Regex matching, which I think I've seen happen once), then the generated BDD tables need to be updated simultaneously -- or generated dynamically. |
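One way to notice that baked-in data has drifted from the running system is to compare it against the live casing APIs. The sketch below assumes a hypothetical pregenerated table of char-to-lowercase mappings (the two entries are made up for illustration and are not taken from the actual BDD tables) and reports every entry that disagrees with `char.ToLower` in the current process.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical baked-in data, standing in for tables generated on some other machine.
var baked = new Dictionary<char, char>
{
    ['A'] = 'a',
    ['\u0130'] = 'i', // LATIN CAPITAL LETTER I WITH DOT ABOVE; the runtime mapping may differ
};

foreach (char c in StaleTableCheck.FindMismatches(baked, CultureInfo.InvariantCulture))
{
    Console.WriteLine($"U+{(int)c:X4}: baked data disagrees with this process's casing data");
}

// Hypothetical helper for illustration only.
public static class StaleTableCheck
{
    // Returns the keys whose baked lowercase form differs from what the runtime reports.
    public static List<char> FindMismatches(IReadOnlyDictionary<char, char> bakedLower, CultureInfo culture)
    {
        var mismatches = new List<char>();
        foreach (KeyValuePair<char, char> pair in bakedLower)
        {
            if (char.ToLower(pair.Key, culture) != pair.Value)
            {
                mismatches.Add(pair.Key);
            }
        }
        return mismatches;
    }
}
```

A check like this only detects staleness; it does not fix it, which is why generating the data dynamically is the alternative raised above.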
We get data from the underlying OS and/or ICU library in play. You can see the effect of this by running this on .NET 6:

    using System;
    using System.Globalization;

    var enUs = new CultureInfo("en-US");
    var invariant = CultureInfo.InvariantCulture;

    // Print every UTF-16 code unit whose casing differs between en-US and the invariant culture.
    for (int i = 0; i <= char.MaxValue; i++)
    {
        if (char.ToUpper((char)i, enUs) != char.ToUpper((char)i, invariant))
        {
            Console.WriteLine($"ToUpper: {i}");
        }
        if (char.ToLower((char)i, enUs) != char.ToLower((char)i, invariant))
        {
            Console.WriteLine($"ToLower: {i}");
        }
    }

You should see output like:
Now run it again, but this time with this environment variable set:
Now you should see output more like this:
|
To clarify what Stephen said: it's not just that flag that affects the behavior, but also whichever libicu version is present on that OS or distro. Each version has slightly different data, although casing is probably relatively stable. |
Failed again in: runtime-libraries-coreclr outerloop 20211031.1 Test failure:
Error message:
Failed test:
Error message:
|
Failed again in: runtime-libraries-coreclr outerloop 20211108.1 Failed test:
Error message:
Failed test:
Error message:
|
Failed again in: runtime-libraries-coreclr outerloop 20211116.1 Failed test:
Error message:
Failed test:
Error message:
|
Failed again in: runtime-libraries-coreclr outerloop 20211124.6 Failed test:
Error message:
Failed test:
Error message:
|
Failed again https://github.com/dotnet/runtime/pull/62095/checks?check_run_id=4401635475
https://github.com/dotnet/runtime/pull/62095/checks?check_run_id=4401635312
|
Failed again in: runtime-libraries-coreclr outerloop 20211202.6 Failed test:
Error message:
Failed test:
Error message:
|
Failed again in: runtime-libraries-coreclr outerloop 20220410.7 Failed test:
Error message:
|
Failed again in: runtime-libraries-coreclr outerloop 20220418.8 Failed test:
Error message:
|
Failed again in: runtime-libraries-coreclr outerloop 20220426.3 Failed test:
Error message:
|