-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Couple optimization to MultiRegStoreLoc #64857
Couple optimization to MultiRegStoreLoc #64857
Conversation
/azp run runtime-coreclr outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
Tagging subscribers to this area: @JulieLeeMSFT Issue Detailsnull
|
/azp run runtime-coreclr outerloop |
…ters as src in src/coreclr/jit/lsrabuild.cpp
…nearScan::BuildStoreLocDef() in src/coreclr/jit/lsrabuild.cpp
79d3bb6
to
d94f2b8
Compare
@dotnet/jit-contrib PTAL |
/azp run runtime-coreclr jitstressregs |
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM as long as jitstressregs is clean.
Taking a look at the linux-arm failures. |
The failures on linux-arm are known #65395 and macos-x64 ones look like an infrastructure hiccup. |
macos-x64 is clean after re-run (https://dev.azure.com/dnceng/public/_build/results?buildId=1612824&view=logs&j=4621ec7e-3d9b-5fdb-df5a-cf7177b660c1) |
Improvements on arm64: dotnet/perf-autofiling-issues#3565 |
Change dotnet#64857 exposed an existing problem where when generating code for a multi-reg GT_STORE_LCL_VAR, if the first register slot was not enregistered, but the second or subsequent slots was, and those non-first slots contained GC pointers, we wouldn't properly add those GC pointers to the GC tracking sets. This led to cases where the register lifetimes would be killed in the GC info before the actual lifetime was complete. The primary fix is to make `gtHasReg()` handle the `IsMultiRegLclVar()` case. As a side-effect, this fixes some LSRA dumps that weren't displaying multiple registers properly. There are about 50 SPMI asm diffs on win-arm64 where register lifetimes get extended, fixing GC holes. I also made `GetMultiRegCount()` handle the `IsMultiRegLclVar()` case. I made a number of cleanup changes along the way: 1. Fixed two cases of calling `gcInfo.gcMarkRegSetNpt` with regNumber, not regMaskTP 2. Marked some functions `const` 3. Improved some comments 4. Changed "ith" to "i'th" in comments which still doesn't read great, but at least I'm not left trying to parse "ith" as an English word. 5. Use `OperIsScalarLocal()` more broadly 6. Renamed `gtDispRegCount` to `gtDispMultiRegCount` to make it clear it only applies to the multi-reg case. Fixes dotnet#65476.
* Fix GC hole with multi-reg local var stores Change #64857 exposed an existing problem where when generating code for a multi-reg GT_STORE_LCL_VAR, if the first register slot was not enregistered, but the second or subsequent slots was, and those non-first slots contained GC pointers, we wouldn't properly add those GC pointers to the GC tracking sets. This led to cases where the register lifetimes would be killed in the GC info before the actual lifetime was complete. The primary fix is to make `gtHasReg()` handle the `IsMultiRegLclVar()` case. As a side-effect, this fixes some LSRA dumps that weren't displaying multiple registers properly. There are about 50 SPMI asm diffs on win-arm64 where register lifetimes get extended, fixing GC holes. I also made `GetMultiRegCount()` handle the `IsMultiRegLclVar()` case. I made a number of cleanup changes along the way: 1. Fixed two cases of calling `gcInfo.gcMarkRegSetNpt` with regNumber, not regMaskTP 2. Marked some functions `const` 3. Improved some comments 4. Changed "ith" to "i'th" in comments which still doesn't read great, but at least I'm not left trying to parse "ith" as an English word. 5. Use `OperIsScalarLocal()` more broadly 6. Renamed `gtDispRegCount` to `gtDispMultiRegCount` to make it clear it only applies to the multi-reg case. Fixes #65476. * Update src/coreclr/jit/gentree.cpp Co-authored-by: Kunal Pathak <[email protected]> Co-authored-by: Kunal Pathak <[email protected]>
The diffs are net positive. There are some regressions. Looked at couple of them - do not see how they can be mitigated - they are due to re-shuffling of the registers in LSRA.
Here is a description of the change:
GT_STORE_LCL_VAR
if the source is last-use local or not a local (i.e. a multi-reg call or hardware intrinsic). The similar strategy is already done for single regGT_STORE_LCL_VAR
.LinearScan::BuildStoreLocDef()
.The second issue can be demonstrated by examples.
Consider the following program:
The code diffs for
Main
(on win-arm64):Note that in the base case a variable that corresponds to promoted
val._1
was added (unnecessarily) tocurrentLiveVars
and, as a consequence,x19
was used to keep a dead value of the variable.Now, if you consider another program:
the diffs are opposite in some sense
val._1
was NOT added tocurrentLiveVars
and instead of assigning a callee-saved register toval._1
(to be able to survive a call toTrashRegs()
) the value had to be spilled on the stack.