Remove redundant load that is immediately after the store in same src/dst #35613

Closed
kunalspathak opened this issue Apr 29, 2020 · 2 comments · Fixed by #39222
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI optimization

@kunalspathak
Member

kunalspathak commented Apr 29, 2020

I have seen the pattern below in the code generated for the framework libraries. We can eliminate the ldr when it immediately follows a str to the same memory location and loads into the same register that supplied the stored value.

Example of ARM64 code:

B90023A0          str     w0, [fp,#32]	// [V04 loc1]
B94023A0          ldr     w0, [fp,#32]	// [V04 loc1]

Example of x64 code:

48894C2430           mov      gword ptr [rsp+30H], rcx
488B4C2430           mov      rcx, gword ptr [rsp+30H]

There are approx. 2500 such patterns in 1500 methods. Details in str-ldr.txt.
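
The pattern described above could be caught by a peephole check that looks back one instruction. The following is only a minimal sketch of that idea, not the JIT's actual implementation: the `Instr` record and the `remove_redundant_loads` function are hypothetical names invented for illustration, and the model assumes straight-line code with no branch target between the two instructions, no volatile accesses, and it ignores operand size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical, simplified instruction record (not a JIT data structure)."""
    op: str      # "str" or "ldr" in this toy model
    reg: str     # register stored from / loaded into, e.g. "x0"
    base: str    # base register of the address, e.g. "fp"
    offset: int  # immediate offset from the base register


def remove_redundant_loads(instrs):
    """Drop a load that directly follows a store of the same register
    to the same [base, #offset] address."""
    out = []
    for ins in instrs:
        prev = out[-1] if out else None
        if (
            prev is not None
            and prev.op == "str"
            and ins.op == "ldr"
            and ins.reg == prev.reg
            and ins.base == prev.base
            and ins.offset == prev.offset
        ):
            # The register already holds the value that was just stored.
            continue
        out.append(ins)
    return out


program = [
    Instr("str", "x0", "fp", 32),
    Instr("ldr", "x0", "fp", 32),  # redundant: same register, same address
    Instr("ldr", "x1", "fp", 32),  # kept: different destination register
]
optimized = remove_redundant_loads(program)
```

In this sketch the second instruction is dropped and the third is kept, because only a load into the same register that was just stored is redundant.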

category:cq
theme:basic-cq
skill-level:intermediate
cost:medium

@kunalspathak kunalspathak added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI optimization labels Apr 29, 2020
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Apr 29, 2020
@kunalspathak kunalspathak changed the title from "Remove redundant load that is followed by store" to "Remove redundant load that is immediately after the store in same src/dst" Apr 29, 2020
@BruceForstall BruceForstall added this to the Future milestone May 3, 2020
@BruceForstall BruceForstall removed the untriaged New issue has not been triaged by the area owner label May 3, 2020
@BruceForstall
Member

Related: #35614

@kunalspathak
Member Author

It turns out that we shouldn't do this optimization for 4-byte registers. In the example below, the second instruction (the ldr) zero-extends its result into the upper 32 bits of x0. If the second instruction is eliminated, the upper bits of x0 are no longer guaranteed to be zero, which would cause a functional bug.

B90023A0          str     w0, [fp,#32]	// [V04 loc1]
B94023A0          ldr     w0, [fp,#32]	// [V04 loc1]

With that restriction, I recounted the patterns in the framework libraries, and it looks like there are 1442 occurrences in 830 methods.

Details:
str-ldr-x.txt
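
The zero-extension concern described above can be illustrated with a small model. This is a hedged sketch rather than an emulator: `str_w` and `ldr_w` are hypothetical helpers that treat x0 as a 64-bit integer, with w0 as its low 32 bits, and the starting value of x0 is an arbitrary assumption chosen so that its upper half is non-zero.

```python
MASK32 = 0xFFFF_FFFF


def str_w(x_reg, memory, addr):
    """Model 'str wN, [addr]': only the low 32 bits of the register are stored."""
    memory[addr] = x_reg & MASK32


def ldr_w(memory, addr):
    """Model 'ldr wN, [addr]': the 32-bit value is loaded and the upper
    32 bits of the full 64-bit register become zero."""
    return memory[addr] & MASK32


# Assumed starting state: x0 has non-zero upper bits left by earlier 64-bit code.
x0 = 0xDEADBEEF_00000007
memory = {}

str_w(x0, memory, 32)
x0_with_load = ldr_w(memory, 32)  # str followed by ldr: upper bits cleared
x0_without_load = x0              # ldr eliminated: upper bits unchanged
```

Under these assumptions the two final values of x0 differ, so code that later reads the full 64-bit register would observe different results once the ldr is removed.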

@ghost ghost locked as resolved and limited conversation to collaborators Dec 9, 2020