-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JIT ARM64: Add IF_SVE_DT_3A, IF_SVE_DU_3A, IF_SVE_DX_3A, IF_SVE_DY_3A #96201
Conversation
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch Issue DetailsPart of #94549. cc @dotnet/arm64-contrib, @a74nh. JitDisasm output:
cstool output:
Notice that for the last few instructions with vector length specifiers, cstool prints the predicate number slightly differently from how we currently do it in the JIT (e.g.
|
I don't think cstool output is correct. They are printing the register name in hex. @TIHan something to fix. |
I can update my changes to use the same aliases as cstool. I'm guessing we'd have to split some of the |
Generally we should be using the preferred alias, as it's intended as easier to read. What we've been doing for other instructions with aliases is just to change For runtime/src/coreclr/jit/emitarm64.cpp Line 8842 in 8bd23f0
For runtime/src/coreclr/jit/emitarm64.cpp Line 9987 in 8bd23f0
|
However, having said that about aliases, I don't think that's the issue here. The difference in the output is, for example, Looking at Which suggests something is going wrong in the coreclr encoding. Looking at
/*static*/ emitter::code_t emitter::insEncodeReg_P_2_to_0(regNumber reg)
{
assert(isPredicateRegister(reg));
emitter::code_t ureg = (emitter::code_t)reg - (emitter::code_t)REG_P0;
assert((ureg >= 8) && (ureg <= 15));
return (ureg - 8 ) << 0;
} Looking at the autogenerated code, |
@a74nh thank you for the fix! Changing I'm trying to determine the best way to differentiate these instruction formats in Another option is to not split up the instructions, but to (mis-)use existing |
I hit the same issue with a previous PR, but managed to reduce the insopts value for that one. I was a little concerned we'd hit it again.
Ideally I'd rather avoid this. But we might have to eventually.
Reducing One way would be to turn it from an enum into flags:
But that's a lot of refactoring code and we might still run out. So, probably not. There should be no overlap the non-SVE and SVE values. So we should simply be able to restart the enum at 1:
So, |
Good point, thank you for the suggestion! I had to refactor
and the (sanitized) cstool output:
The only diffs I see locally are the different predicate register formats at the end (e.g. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM now. Thanks!
(note: I'm away until new year now).
Thank you for the review, and enjoy your holiday! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added few questions and suggestions.
src/coreclr/jit/instr.h
Outdated
INS_OPTS_SCALABLE_B, | ||
// There should be no overlap between non-SVE and SVE values, | ||
// so reset value to 1 here | ||
INS_OPTS_SCALABLE_B = 1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should move the INS_OPTS_SCALABLE*
below INS_OPTS_ALIGN
, so there is no gap between INS_OPTS_2D
and INS_OPTS_MSL
.
src/coreclr/jit/codegenarm64test.cpp
Outdated
@@ -5371,7 +5371,111 @@ void CodeGen::genArm64EmitterUnitTestsSve() | |||
theEmitter->emitIns_R_R_R(INS_sve_frecpx, EA_SCALABLE, REG_V5, REG_P5, REG_V5, | |||
INS_OPTS_SCALABLE_H); // FRECPX <Zd>.<T>, <Pg>/M, <Zn>.<T> | |||
theEmitter->emitIns_R_R_R(INS_sve_fsqrt, EA_SCALABLE, REG_V6, REG_P6, REG_V6, | |||
INS_OPTS_SCALABLE_S); // FSQRT <Zd>.<T>, <Pg>/M, <Zn>.<T> | |||
INS_OPTS_SCALABLE_S); /* FSQRT <Zd>.<T>, <Pg>/M, <Zn>.<T> */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
INS_OPTS_SCALABLE_S); /* FSQRT <Zd>.<T>, <Pg>/M, <Zn>.<T> */ | |
INS_OPTS_SCALABLE_S); // FSQRT <Zd>.<T>, <Pg>/M, <Zn>.<T> |
src/coreclr/jit/emitarm64.h
Outdated
|
||
inline static bool isHighPredicateRegister(regNumber reg) | ||
{ | ||
return (reg > REG_PREDICATE_LOW_LAST) && (reg <= REG_PREDICATE_LAST); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably good to introduce REG_PREDICATE_HIGH_FIRST
and REG_PREDICATE_HIGH_LAST
and use it here. It is more explicit that way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and a static_assert_no_msg(REG_PREDICATE_HIGH_LAST == REG_PREDICATE_LAST)
.
src/coreclr/jit/emitarm64.cpp
Outdated
@@ -15658,6 +15844,44 @@ void emitter::emitDispArrangement(insOpts opt) | |||
str = "8b"; | |||
break; | |||
case INS_OPTS_16B: | |||
str = "16b"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason why these were added in emitDispArrangement
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those cases were already in emitDispArrangement
, I think the diff just makes it look like they were added in. I moved all the SVE-specific cases to emitDispSveArrangement
, and left the remaining cases in emitDispArrangement
as-is.
src/coreclr/jit/emitarm64.cpp
Outdated
@@ -15668,10 +15892,6 @@ void emitter::emitDispArrangement(insOpts opt) | |||
case INS_OPTS_SCALABLE_B_WITH_PREDICATE_MERGE: | |||
str = "b"; | |||
break; | |||
case INS_OPTS_4H: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does removing these have any implications on existing instructions that we added so far? Can you try to compare the output of all instructions jitdisasm vs. cstool with this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you try to compare the output of all instructions jitdisasm vs. cstool with this change?
Sure thing. I'll post a comment with diffable outputs from both.
I created ugly but diffable JitDisasm/cstool outputs for the SVE instructions we have so far. I don't see any diffs locally, except for the different predicate register encodings we noted above. |
Thanks for doing this. |
Looks like there are several test failures because of moving around the enum ordering.
|
Thank you for pointing that out. Those asserts I added -- e.g. |
I must say that although the case INS_OPTS_SCALABLE_B:
case INS_OPTS_SCALABLE_WIDE_B:
case INS_OPTS_SCALABLE_B_WITH_SIMD_SCALAR:
case INS_OPTS_SCALABLE_B_WITH_SIMD_VECTOR:
case INS_OPTS_SCALABLE_B_WITH_SCALAR:
case INS_OPTS_SCALABLE_B_WITH_PREDICATE_MERGE:
return EA_1BYTE; Of course, the alternative to increase the size of |
I like this idea of separating SVE-specific logic out into separate functions. All the new SVE-specific
Sure thing. I initially tried this strategy, though I ran into some weird build issues from the static |
This is the crucial part - in some cases, once the group is decided, this information in the I'll write a quick PR that introduces a |
See #96692 |
I will wait for this to merge before reviewing this PR. |
This is now merged. All your |
src/coreclr/jit/instr.h
Outdated
|
||
// There should be no overlap between non-SVE and SVE values, | ||
// so reset value to 1 here | ||
INS_OPTS_SCALABLE_B = 1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't need resetting anymore. also, you can pretty much revert changes done in enum insOpts
@@ -1012,6 +1022,17 @@ inline static bool insScalableOptsNone(insScalableOpts sopt) | |||
return sopt == INS_SCALABLE_OPTS_NONE; | |||
} | |||
|
|||
inline static bool insScalableOptsWithPredicatePair(insScalableOpts sopt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like we missed the summary header for newer SVE methods we added recently. do you mind adding those?
src/coreclr/jit/emitarm64.cpp
Outdated
@@ -16362,7 +16499,6 @@ size_t emitter::emitOutputInstr(insGroup* ig, instrDesc* id, BYTE** dp) | |||
{ | |||
code = insEncodeSveElemsize_dtype(ins, optGetSveElemsize(id->idInsOpt()), code); | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extra line deletion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM (once other review comments are fixed up)
Thank you for the reviews! @kunalspathak I've addressed your feedback. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Part of #94549. cc @dotnet/arm64-contrib, @a74nh.
JitDisasm output:
cstool output:
Notice that for the last few instructions with vector length specifiers, cstool prints the predicate number slightly differently from how we currently do it in the JIT (e.g.
pn0xA.h
vsp10.h
). Should we match cstool's formatting exactly, or is this unimportant in this case?