-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JIT: Added SVE LoadVector*NonFaultingZeroExtendTo* APIs #102860
Conversation
Note regarding the
|
Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics |
@dotnet/arm64-contrib @kunalspathak this is ready. All tests pass with no assertions.
|
|
||
// Validates calling via reflection works | ||
// TODO-SVE: Enable once register allocation exists for predicates. | ||
// test.RunReflectionScenario_Load(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we have a scenario for ConditionalSelect(mask, LoadVectorByteNonFaultingZeroExtendToInt64(address), mergeValue)
? @a74nh ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if that statement is still true. It may have been fixed by the better handling we now have of predicates. Does this test work if you enable it?
If so, could you also enable the same test in SveLoadMaskedUnOpTest.template
too please.
@@ -496,7 +496,37 @@ void CodeGen::genHWIntrinsic(GenTreeHWIntrinsic* node) | |||
} | |||
} | |||
|
|||
GetEmitter()->emitIns_R_R_R(insEmbMask, emitSize, targetReg, maskReg, embMaskOp1Reg, opt); | |||
if (intrinEmbMask.codeGenIsTableDriven()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will codeGenIsTableDriven
always hold going forwards for additional intrinsics?
I wonder if it needs to be:
switch (intrinEmbMask.id)
{
case NI_Sve_LoadVectorByteNonFaultingZeroExtendToInt16:
case NI_Sve_LoadVectorByteNonFaultingZeroExtendToInt32:
....
case NI_Sve_LoadVectorUInt32NonFaultingZeroExtendToInt64:
case NI_Sve_LoadVectorUInt32NonFaultingZeroExtendToUInt64:
GetEmitter()->emitIns_R_R_R_I ....
default:
GetEmitter()->emitIns_R_R_R .....
break;
}
Maybe for now it's fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will hold for additional intrinsics that do not have the HW_Flag_SpecialCodeGen
flag, but have the HW_Flag_EmbeddedMaskedOperation
flag.
@@ -95,6 +95,18 @@ HARDWARE_INTRINSIC(Sve, LoadVectorUInt16ZeroExtendToUInt32, | |||
HARDWARE_INTRINSIC(Sve, LoadVectorUInt16ZeroExtendToUInt64, -1, 2, false, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_sve_ld1h, INS_invalid, INS_invalid}, HW_Category_MemoryLoad, HW_Flag_Scalable|HW_Flag_ExplicitMaskedOperation|HW_Flag_LowMaskedOperation) | |||
HARDWARE_INTRINSIC(Sve, LoadVectorUInt32ZeroExtendToInt64, -1, 2, false, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_sve_ld1w, INS_invalid, INS_invalid, INS_invalid}, HW_Category_MemoryLoad, HW_Flag_Scalable|HW_Flag_ExplicitMaskedOperation|HW_Flag_LowMaskedOperation) | |||
HARDWARE_INTRINSIC(Sve, LoadVectorUInt32ZeroExtendToUInt64, -1, 2, false, {INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_sve_ld1w, INS_invalid, INS_invalid}, HW_Category_MemoryLoad, HW_Flag_Scalable|HW_Flag_ExplicitMaskedOperation|HW_Flag_LowMaskedOperation) | |||
HARDWARE_INTRINSIC(Sve, LoadVectorByteNonFaultingZeroExtendToInt16, -1, 1, false, {INS_invalid, INS_invalid, INS_sve_ldnf1b, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid, INS_invalid}, HW_Category_MemoryLoad, HW_Flag_Scalable|HW_Flag_EmbeddedMaskedOperation|HW_Flag_LowMaskedOperation|HW_Flag_SpecialCodeGen) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I'm missing something....
HW_Flag_SpecialCodeGen
is set for these.
During codegen, genHWIntrinsic()
is called with LoadVectorByteNonFaultingZeroExtendToInt16
.
intrin.codeGenIsTableDriven()
check fails (due to HW_Flag_SpecialCodeGen
).
Code falls into the switch (intrin.id)
at the end of the function.
Switch hits default: unreached()
due to no LoadVectorByteNonFaultingZeroExtendToInt16
case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because of the HW_Flag_EmbeddedMaskedOperation
flag, at the point of genHWIntrinsic
, intrin
will never be Sve_LoadVectorByteNonFaultingZeroExtendToInt16
and instead be Sve_ConditionalSelect
that wraps Sve_LoadVectorByteNonFaultingZeroExtendToInt16
. This is why I had to handle intrinEmbMask.codeGenIsTableDriven()
like you saw.
@kunalspathak this is ready again, thanks for updating the test. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Contributes to #99957
Adds SVE APIs: