Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor codegen and emit to centralize mov reg, reg handling #52661

Merged
merged 2 commits into from
May 17, 2021

Conversation

tannergooding
Copy link
Member

@tannergooding tannergooding commented May 12, 2021

This centralizes the handling of moves so that move elision can be centrally managed.

This results in zero diffs for the x64, arm32, and arm64 frameworks, benchmarks, and tests PMI disassembly.

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label May 12, 2021
@tannergooding
Copy link
Member Author

CC. @AndyAyersMS, @BruceForstall

@tannergooding
Copy link
Member Author

This might also make it easier to do optimizations for certain scenarios and to add more interesting kinds of move elimination (such as we have basic support for on Arm64 via IsRedundantMov)

@tannergooding tannergooding force-pushed the centralize-move-elision branch from 838b22e to 02606db Compare May 13, 2021 01:00
@@ -203,15 +203,15 @@ void CodeGen::instGen_Set_Reg_To_Imm(emitAttr size,

if (GetEmitter()->isLowRegister(reg) && (imm_hi16 == 0xffff) && ((imm_lo16 & 0x8000) == 0x8000))
{
GetEmitter()->emitIns_R_R(INS_sxth, EA_4BYTE, reg, reg);
GetEmitter()->emitIns_Mov(INS_sxth, EA_4BYTE, reg, reg, /* canSkip */ false);
Copy link
Member Author

@tannergooding tannergooding May 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of these emitIns_Mov calls should "probably" be inst_Mov or inst_Move_Extend calls instead, where the instruction is picked based on the register and type.
I tried, for the most part, to minimize the diff from this change however, and so I kept existing calls directly to emitIns_* where possible.

@@ -550,7 +550,7 @@ void CodeGen::genLclHeap(GenTree* tree)
inst_JMP(EJ_lo, done);

// Update SP to be at the next page of stack that we will tickle
GetEmitter()->emitIns_R_R(INS_mov, EA_PTRSIZE, REG_SPBASE, regTmp);
GetEmitter()->emitIns_Mov(INS_mov, EA_PTRSIZE, REG_SPBASE, regTmp, /* canSkip */ false);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many of these are currently /* canSkip */ false because that's how the existing code handled it.

However, almost all of them except for genIntToIntCast are technically elidable (assuming the registers are the same) as they don't do have any special semantics.

@tannergooding tannergooding force-pushed the centralize-move-elision branch 4 times, most recently from 254653b to 82ac186 Compare May 13, 2021 04:01
Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I've said elsewhere I don't think this really addresses the heart of the issue.

However, it makes the jit's behavior more consistent, and the centralizing of the move elision logic is a nice improvement. It will be interesting to see if removing some of the unneded canSkips leads to interesting diffs.

Can you verify that when the problematic assert is restored, it no longer fires..?

src/coreclr/jit/emitxarch.cpp Outdated Show resolved Hide resolved
@tannergooding tannergooding force-pushed the centralize-move-elision branch from 82ac186 to 4c1b097 Compare May 13, 2021 23:05
@tannergooding
Copy link
Member Author

Can you verify that when the problematic assert is restored, it no longer fires..?

@AndyAyersMS, validated that there are no diffs with the first commit and the second commit allows the assert to no longer fire.
The second commit does cause a few textual diffs since it requires instrDesc to be allocated and therefore changes the instruction groups.
I updated some of the metrics to track how many zero size instrDesc are allocated.

As I've said elsewhere I don't think this really addresses the heart of the issue.

I'm certain we can also do this higher in the JIT (possibly by retyping things), but the "root" issue today is that we have a pointer to byref assignment and the assignment is elided so the emitter never marks the register as becoming a live byref. Preserving the instrDesc for these elided moves and setting its size to zero allows us to correctly perform the liveness update while still never emitting any code. This "resolves" the issue and avoids any theoretical GC hole we have (#52462 (comment)).

However, it makes the jit's behavior more consistent, and the centralizing of the move elision logic is a nice improvement. It will be interesting to see if removing some of the unneded canSkips leads to interesting diffs.

Right, I think the first commit is worth taking even if we decide the second is not. This at least allows us to centrally manage the logic and allows easier introduction of other kinds of move elision. ARM64 for example has IsRedundantMov which avoids the mov dst, src; mov src, dst shuffle while x64 does not.

Likewise, it looks like almost all cases where /* canSkip */ false is specified "could" be /* canSkip */ true. The exception is cases like mov eax, eax from genIntToIntCast since this is how you encode zero-extend 32-bit to 64-bit

@tannergooding
Copy link
Member Author

tannergooding commented May 13, 2021

ARM64 isn't quite zero diffs. It looks like there is some scenario where sxtw was being elided before and where it isn't now.
This one in particular seems a bit strange to me since sxtw has side effects, so I'm going to investigate why...

Edit: This is an existing elided sign extension on ARM64: https://github.com/dotnet/runtime/pull/52661/files#diff-abc75a0357f9afd3dbad2d0f5ed03e13aa04c4e6c6c69118014558f3c7f61f61R2356-R2357

@tannergooding tannergooding force-pushed the centralize-move-elision branch from 31e3298 to 519a7d0 Compare May 14, 2021 07:05
@tannergooding tannergooding changed the title [PROTOTYPE] Refactor codegen and emit to centralize mov reg, reg handling Refactor codegen and emit to centralize mov reg, reg handling May 14, 2021
@tannergooding
Copy link
Member Author

@AndyAyersMS, to simplify things (and since there is still some issue with the second commit I'm working on), I've limited this to just centralizing the move elision support.

I got PMI diffs for --frameworks, --benchmarks, and --tests and saw zero diffs for x64, arm, and arm64.

@tannergooding tannergooding marked this pull request as ready for review May 14, 2021 20:13
@AndyAyersMS
Copy link
Member

Thanks. I appreciate all the work you're putting into this.

I'll approve, but I'd like to get one other approver from @dotnet/jit-contrib

@kunalspathak
Copy link
Member

I will review it later today too.

Copy link
Member

@BruceForstall BruceForstall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the cleanup!

Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments. Overall, good approach and thanks for doing this.

src/coreclr/jit/emitxarch.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/emitxarch.cpp Show resolved Hide resolved
assert(size <= EA_32BYTE);
noway_assert(emitVerifyEncodable(ins, size, dstReg, srcReg));

if (canSkip && (dstReg == srcReg))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what are some examples where canSkip == false but dstReg == srcReg?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mostly comes down to inst_Mov_Extend where extensions should not be elided.

The point of this initial PR is to be a refactoring with zero-diffs. We need to audit/cleanup the callers in a secondary PR so that all copy moves can be elided when the registers are the same and all extension moves are not elided (outside the logic that checks if the value was already extended).

// Ensure we aren't overwriting op2
assert(op2Reg != targetReg);
// Ensure we aren't overwriting op2
assert((op2Reg != targetReg) || (op1Reg == targetReg));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the assert be just assert(op2 != targetReg)? It doesn't matter if op1Reg == targetReg or not, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assert is validating that the move won't overwrite op2.

This requires that either op2Reg != targetReg and so therefore mov tgtReg, op1Reg is always safe or op1Reg == tgtReg in which case mov tgtReg, op1Reg is a "nop" and it doesn't matter if op2Reg is tgtReg.

src/coreclr/jit/emitxarch.cpp Show resolved Hide resolved
src/coreclr/jit/emitarm64.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/codegencommon.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/codegenarmarch.cpp Outdated Show resolved Hide resolved
@ghost
Copy link

ghost commented May 16, 2021

Hello @tannergooding!

Because this pull request has the auto-merge label, I will be glad to assist with helping to merge this pull request once all check-in policies pass.

p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (@msftbot) and give me an instruction to get started! Learn more here.

@sandreenko
Copy link
Contributor

/azp run runtime-coreclr jitstress, runtime-coreclr outerloop

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@tannergooding
Copy link
Member Author

Only failure is readytorun/determinism/crossgen2determinism/crossgen2determinism on OSX ARM64, tracked by #50466

@tannergooding tannergooding merged commit 2f5f1d5 into dotnet:main May 17, 2021
@karelz karelz added this to the 6.0.0 milestone May 20, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Jun 19, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants