[wasm] Compiled regexes in interpreter ~25-50x regression #99553
Comments
Tagging subscribers to this area: @BrzVlad, @kotlarmilos |
Tagging subscribers to 'arch-wasm': @lewing |
Maybe #98791 ? It would suggest that the interpreter doesn't currently play nicely with |
It's possible, I wouldn't expect that to cause this big of a regression but iirc this regression set also included some SearchValues-related regressions. |
cc @vitek-karas |
This performance issue is not related to mono interpreter, it is indeed caused by the Regex change linked above: commit. @stephentoub The issue is related to support for |
Per my comment in #99553 (comment), you don't see |
I think I benchmarked that API some time ago and I didn't notice perf problems. Anyhow, as I mentioned above, the same regression would still happen on CoreCLR if dynamic code is disabled, so this should rule out potential problems with missing interpreter optimizations. The hottest methods detected by interpreter in this benchmark are the following, starting with the hottest:
|
Thanks, @BrzVlad. I still need to validate this, but I expect the problem is that optimizations which used to kick in for such situations no longer do. This particular test has the regex pattern "tempus|magna|semper". Previously, that would have resulted in the regex looking for the next match by doing an IndexOfAny for, e.g., the third character of each alternative. Now it selects a multi-string search instead (lines 145 to 150 in 16506b7).
After choosing to do that, it then actually creates the SearchValues<string> (lines 152 to 155 in 16506b7).
But it only does so if it thinks it's going to need it, and it would only need it for non-compiled, because for compiled this instance would be created elsewhere. That logic doesn't take into account the possibility, however, that something which claims to be compiled isn't actually compiled. If dynamic code isn't compiled, then later on when it goes to actually do the compilation, it'll instead fall back to using the regex interpreter, which means we will have selected this multi-string search but won't actually have the SearchValues<string> instance to do it. When the regex interpreter then goes to find the next location to consider for a match, it'll try to do the multi-string search, find it unexpectedly doesn't have the SearchValues, and bail (lines 699 to 707 in 16506b7),
which means it'll end up trying the full match at every position. So whereas previously in this situation it would have been doing the full match only at places where IndexOfAny('m', 'g') matched, now it's doing it everywhere. Assuming I'm right, the fix is likely to just always run line 154 in 16506b7
rather than guarding it with an if, and accept the small extra overhead in the compiled case where it'll be throwaway. (We could pass around more state to avoid that, but it's probably not worth the churn.) |
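To make the cost difference described above concrete, here is a minimal C# sketch. It is not the actual System.Text.RegularExpressions code: `PrefilterSketch`, `FullMatchAt`, `ScanWithPrefilter`, and `ScanEverywhere` are hypothetical names, and the "full match" is reduced to a simple prefix check. It only counts how many full-match attempts happen with an `IndexOfAny('m', 'g')` prefilter on the third character versus attempting a match at every position, which is what the bail-out degenerates to.

```csharp
using System;

static class PrefilterSketch
{
    // Alternatives from the pattern "tempus|magna|semper".
    static readonly string[] Words = { "tempus", "magna", "semper" };

    // Hypothetical stand-in for the engine's full match attempt at one position.
    static bool FullMatchAt(ReadOnlySpan<char> input, int pos)
    {
        foreach (string w in Words)
        {
            if (input.Slice(pos).StartsWith(w.AsSpan(), StringComparison.Ordinal))
                return true;
        }
        return false;
    }

    // Attempt a full match only where the third character is 'm' or 'g'.
    public static (int Matches, int Attempts) ScanWithPrefilter(string text)
    {
        ReadOnlySpan<char> input = text.AsSpan();
        int matches = 0, attempts = 0;
        int scan = 2; // offset of the third character of a candidate starting at 0
        while (scan < input.Length)
        {
            int i = input.Slice(scan).IndexOfAny('m', 'g');
            if (i < 0) break;
            attempts++;
            if (FullMatchAt(input, scan + i - 2)) matches++;
            scan += i + 1;
        }
        return (matches, attempts);
    }

    // No prefilter available: attempt a full match at every position.
    public static (int Matches, int Attempts) ScanEverywhere(string text)
    {
        ReadOnlySpan<char> input = text.AsSpan();
        int matches = 0, attempts = 0;
        for (int pos = 0; pos < input.Length; pos++)
        {
            attempts++;
            if (FullMatchAt(input, pos)) matches++;
        }
        return (matches, attempts);
    }

    static void Main()
    {
        string text = "lorem magna sit";
        Console.WriteLine(ScanWithPrefilter(text)); // (1, 3)
        Console.WriteLine(ScanEverywhere(text));    // (1, 15)
    }
}
```

Both scans find the same match, but the number of full-match attempts grows with input length when no prefilter is available, and in the real engine each attempt is far more expensive than this prefix check.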
Calling this closed; we can track improvements in .NET 10 separately.
At some point recently, performance for BDN's compiled regex tests under the interpreter on wasm regressed by around 25-50x. See https://pvscmdupload.blob.core.windows.net/reports/allTestHistory/refs/heads/main_x64_ubuntu%2022.04_CompilationMode=wasm_RunKind=micro/System.Text.RegularExpressions.Tests.Perf_Regex_Common.MatchWord(Options%3a%20IgnoreCase%2c%20Compiled).html for one example.
I tested this in isolation and was able to reproduce bad performance, but I couldn't identify the cause, and the code involved is very complex so I couldn't make sense of what was going on. I didn't notice any misbehavior on the jiterpreter side of things for this scenario.
See dotnet/perf-autofiling-issues#29881 for diff range.
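For anyone trying to reproduce this outside BDN, a rough sketch along these lines may help. It is an assumption-laden stand-in, not the benchmark itself: the pattern is the "tempus|magna|semper" one discussed in the comments, while the input text, repetition count, and the `CompiledRegexRepro` and `Measure` names are made up for illustration. It prints the `RuntimeFeature` flags so you can see whether `RegexOptions.Compiled` can actually compile in the current environment, then times the same pattern with and without that option.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Runtime.CompilerServices;
using System.Text.RegularExpressions;

static class CompiledRegexRepro
{
    const string Pattern = "tempus|magna|semper";

    // Illustrative input only; the real benchmark uses its own corpus.
    static readonly string Input =
        string.Concat(Enumerable.Repeat("lorem ipsum dolor sit amet magna semper ", 10_000));

    // Counts all matches once and reports how long that took.
    public static (int Count, TimeSpan Elapsed) Measure(RegexOptions options)
    {
        var regex = new Regex(Pattern, options);
        _ = regex.Matches(Input).Count; // warm-up pass

        var sw = Stopwatch.StartNew();
        int count = regex.Matches(Input).Count;
        sw.Stop();
        return (count, sw.Elapsed);
    }

    static void Main()
    {
        // When dynamic code is not compiled, RegexOptions.Compiled is expected
        // to fall back to the regex interpreter.
        Console.WriteLine($"IsDynamicCodeSupported: {RuntimeFeature.IsDynamicCodeSupported}");
        Console.WriteLine($"IsDynamicCodeCompiled:  {RuntimeFeature.IsDynamicCodeCompiled}");

        var interpreted = Measure(RegexOptions.None);
        var compiled = Measure(RegexOptions.Compiled);

        Console.WriteLine($"None:     {interpreted.Count} matches in {interpreted.Elapsed.TotalMilliseconds:F1} ms");
        Console.WriteLine($"Compiled: {compiled.Count} matches in {compiled.Elapsed.TotalMilliseconds:F1} ms");
    }
}
```

A single Stopwatch measurement is noisy, so treat the numbers as a smoke test only. If the diagnosis in the comments is right, the Compiled case should be dramatically slower than None only when dynamic code is not compiled.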