Enable constant CSE for ARM #100054
Also, extract out the code that determines if constant CSE or shared constant CSE are enabled, and rewrite it to be easier to understand.
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
/azp run runtime-coreclr outerloop, runtime-coreclr jitstress, runtime-coreclr libraries-jitstress
Azure Pipelines successfully started running 3 pipeline(s).
Only diffs on arm32, as expected. Overall PerfScore wins and a large code size improvement, with both regressions and improvements. The throughput (TP) cost (a regression) is somewhat large on some collections.
Failures are all known issues.
@EgorBo @AndyAyersMS PTAL. This seems like a win. Should we enable it? I was curious about the result because I was surprised to see that shared constant CSE was enabled for arm32 (#70580) but non-shared constant CSE was not.
My understanding is that CSE for constants is a bit less useful on ARM32 because every 32-bit constant can be materialized in no more than 2 instructions (and it's up to 4 instructions on ARM64). The diffs seem to be dominated by the coreclr_tests.run collection.
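To make the instruction counts above concrete, here is a minimal sketch that counts how many 16-bit move-wide instructions a constant would need if it were built one 16-bit chunk at a time (`movw`/`movt` on ARM32, `movz`/`movk` on ARM64). The function name `CountMoveWideInstructions` is hypothetical and is not part of the JIT. The model is deliberately simplified: it ignores cheaper encodings such as modified immediates, inverted moves, and bitmask immediates, so it only illustrates the worst-case bounds of 2 and 4.

```cpp
#include <cstdint>

// Hypothetical helper (not a JIT API): counts the 16-bit chunks of 'value'
// that are non-zero within the low 'bits' bits. Each non-zero chunk is
// assumed to cost one move-wide instruction; a value of zero still needs one.
constexpr int CountMoveWideInstructions(uint64_t value, int bits)
{
    int count = 0;
    for (int shift = 0; shift < bits; shift += 16)
    {
        if (((value >> shift) & 0xFFFF) != 0)
        {
            count++;
        }
    }
    return (count == 0) ? 1 : count;
}

// A 32-bit constant has two 16-bit chunks, so at most 2 instructions.
static_assert(CountMoveWideInstructions(0x12345678u, 32) == 2, "two halves");
static_assert(CountMoveWideInstructions(0x0000FFFFu, 32) == 1, "low half only");

// A 64-bit constant has four 16-bit chunks, so at most 4 instructions.
static_assert(CountMoveWideInstructions(0x123456789ABCDEF0ull, 64) == 4, "four chunks");
static_assert(CountMoveWideInstructions(0x0000000100000000ull, 64) == 1, "one chunk");
```

Under this model, a constant reused several times costs its full chunk count at every use unless it is CSE'd into a register, which is the trade-off being discussed.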
The same is true for Arm64, any 32-bit constant is no more than 2 instructions since Notably, however, some of the C/C++ compilers still opt to emit a load for such constants from memory rather than emitting the 2-instruction sequence. For Arm32, clang emits It seems there's no real consensus for what's best and I imagine it likely depends partially on whether the underlying hardware allows for fusing of the |
I meant that a good chunk of the constants are handles, and they're 32-bit on arm32 and 64-bit on arm64.
Failures are all known.
/ba-g Failure is #100047 -- should have been marked ok here