(WIP) Arm64/SVE: Implemented AddRotateComplex and AddSequentialAcross #104258
updating branch due to assert issue.
…when Op1 and Op2 have different types.
Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics
AddRotateAcross
and AddSequentialAcross
@@ -8391,13 +8391,13 @@ void CodeGen::genArm64EmitterUnitTestsSve()
                              INS_OPTS_SCALABLE_D); // ST1B {<Zt>.D }, <Pg>, [<Xn|SP>, <Zm>.D]

     // IF_SVE_GP_3A
-    theEmitter->emitIns_R_R_R_I(INS_sve_fcadd, EA_SCALABLE, REG_V0, REG_P1, REG_V2, 90,
+    theEmitter->emitIns_R_R_R_I(INS_sve_fcadd, EA_SCALABLE, REG_V0, REG_P1, REG_V2, 0,
@kunalspathak at the API level, do we want users to pass the actual angle value (90, 180, etc.) for the immediate? If so, we might have to do some awkward transformations throughout the JIT's phases to get this to work:
- If we need to generate a switch table of all immediate values (in case the user doesn't pass a constant), `HWIntrinsicImmOpHelper` expects the immediates to be contiguous, like `[0, 3]`. If the possible values are 90, 180, etc., we'll need some special handling there to pass the correct immediates to `emitIns_R_R_R_I`.
- For `FCADD`, `emitIns_R_R_R_I` expects us to pass the immediate as an angle value; it then converts the value to its bitwise representation `[0, 3]` internally. If we streamline this so we can just pass the immediate in its bitwise form to `emitIns_R_R_R_I`, that might simplify the logic elsewhere in the JIT. For example, the bounds for this intrinsic in `lookupImmBounds` would be `[0, 3]`, and we'd just have to transform the user's input to this form somewhere in the JIT -- perhaps during importation.
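To make the contiguity concern concrete, here is a minimal sketch of a table that has one case per value in a contiguous range, with a mapping step that turns each case index into the angle the emitter expects. `BuildTableCases` and `FcaddIndexToAngle` are hypothetical names used only for illustration; they are not JIT functions, and the index-to-angle mapping (0 for 90, 1 for 270) is an assumption taken from the discussion in this thread.

```cpp
#include <functional>
#include <vector>

// Hypothetical model (not JIT code): a switch table has one case per value in
// the contiguous range [lower, upper]. Each case "emits" the instruction with
// the immediate that toEmitterImm produces for that case index.
std::vector<int> BuildTableCases(int lower, int upper, const std::function<int(int)>& toEmitterImm)
{
    std::vector<int> emitted;
    for (int index = lower; index <= upper; index++)
    {
        emitted.push_back(toEmitterImm(index));
    }
    return emitted;
}

// Assumed mapping for FCADD: case 0 stands for 90 degrees, case 1 for 270.
int FcaddIndexToAngle(int index)
{
    return (index == 0) ? 90 : 270;
}
```

Calling `BuildTableCases(0, 1, FcaddIndexToAngle)` yields the angles 90 and 270, which is the kind of special handling a table over non-contiguous angle values would need.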
This one is a bit tricky. The only valid values for FCADD are `90` or `270`, and that's what we expect the user to pass. All other values are invalid and we should probably throw `ArgumentOutOfRangeException`. Also, since the values are not contiguous, we might not be able to use the generic table generation logic, which is meant for contiguous values. For this API, we want something like this:
    if (IsConstant(rot) && ((rot == 90) || (rot == 270)))
    {
        fcadd ... // here we will embed 0 or 1, depending on if the rot is 90 or 270
    }
    else
    {
        // generate fallback
    }

    // fallback codegen
    rot = ... // either constant or from variable
    if (rot == 90)
    {
        fcadd ...0 // '0' to specify rotation is 90
    }
    else if (rot == 270)
    {
        fcadd ...1 // '1' to specify rotation is 270
    }
    else
    {
        throw ArgumentOutOfRangeException();
    }
@tannergooding - I don't believe we have an API with such a restriction on the input value, do we? For example, I don't see that we have implemented AdvSimd's FcAdd.
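The validate-then-encode step in the pseudocode above can be sketched as a small standalone function. `EncodeFcaddRotation` is a hypothetical helper, not an existing JIT or emitter function, and `std::out_of_range` stands in for the managed `ArgumentOutOfRangeException`.

```cpp
#include <stdexcept>

// Hypothetical helper (not part of the JIT): maps the rotation a user passes
// to the value embedded in the instruction, following the pseudocode above.
int EncodeFcaddRotation(int rot)
{
    if (rot == 90)
    {
        return 0; // '0' specifies a rotation of 90
    }
    if (rot == 270)
    {
        return 1; // '1' specifies a rotation of 270
    }
    // Any other value is invalid; this stands in for ArgumentOutOfRangeException.
    throw std::out_of_range("rotation must be 90 or 270");
}
```

Because only two non-contiguous values are accepted, an explicit comparison like this is simpler than a generic table lookup.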
Sorry, I meant to put a comment here, but I spoke with @tannergooding offline and the right thing to do here is to handle the fallback at the C# level, something like:
    if (cns == 90) { AddRotateComplex(..., 90) }
    else if (cns == 270) { .... }
    else { throw }
and then in the rationalizer, make sure the argument is indeed a constant and is "in bounds" before we rewrite it back to the call.
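The shape of that managed fallback can be illustrated with a C++ analogue, where a template parameter plays the role of the required constant. This is only a scalar model under stated assumptions: `AddRotateComplexConst` and the runtime `AddRotateComplex` wrapper are hypothetical, they operate on a single `std::complex<float>` rather than an SVE vector, and the arithmetic assumes the operation adds the second operand rotated by the given angle.

```cpp
#include <complex>
#include <stdexcept>

// Hypothetical stand-in for the intrinsic: the rotation is a template
// parameter, so every instantiation sees a compile-time constant.
template <int Rotation>
std::complex<float> AddRotateComplexConst(std::complex<float> left, std::complex<float> right)
{
    static_assert(Rotation == 90 || Rotation == 270, "rotation must be 90 or 270");
    if (Rotation == 90)
    {
        // Rotating right by 90 degrees multiplies it by i: (re, im) -> (-im, re).
        return {left.real() - right.imag(), left.imag() + right.real()};
    }
    // Rotating right by 270 degrees multiplies it by -i: (re, im) -> (im, -re).
    return {left.real() + right.imag(), left.imag() - right.real()};
}

// Runtime wrapper: dispatches a possibly non-constant rotation to a call site
// that passes a constant, and rejects everything else.
std::complex<float> AddRotateComplex(std::complex<float> left, std::complex<float> right, int rotation)
{
    if (rotation == 90)
    {
        return AddRotateComplexConst<90>(left, right);
    }
    else if (rotation == 270)
    {
        return AddRotateComplexConst<270>(left, right);
    }
    throw std::out_of_range("rotation must be 90 or 270");
}
```

With this structure, each inner call carries a literal rotation, so a later phase only has to confirm that the argument is a constant and within bounds.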
@ebepho @amanasifkhalid - let me know if you need anything else to move this further.
@ebepho try syncing your fork with dotnet/runtime, then running
AddSequentialAcross is implemented here: #104640
Contributes to #99957
Stress test output:
cc @dotnet/arm64-contrib