-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SVE: Added Load2xVectorAndUnzip
, Load3xVectorAndUnzip
, Load4xVectorAndUnzip
APIs
#102180
Conversation
Note regarding the
|
Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you also follow the instructions at #99957 (comment) and confirm if all the newly added tests pass? I can work with you on machine where you can run these tests.
src/tests/JIT/HardwareIntrinsics/Arm/Shared/SveLoadVectorx2Test.template
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Arm/Sve.cs
Show resolved
Hide resolved
src/tests/Common/GenerateHWIntrinsicTests/GenerateHWIntrinsicTests_Arm.cs
Outdated
Show resolved
Hide resolved
The tests don't pass right now, but I think the fix your suggesting will work. |
src/libraries/System.Private.CoreLib/src/System/Runtime/Intrinsics/Arm/Sve.cs
Outdated
Show resolved
Hide resolved
Tests pass now:
|
It is running just the |
Gotcha. Will need to find an arm64 machine that supports it then. |
@dotnet/arm64-contrib @kunalspathak this is ready. Had to play whack-a-mole. Tests are passing and running :)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update the API names to Load2xVectorAndUnzip
.
src/coreclr/jit/lsraarm64.cpp
Outdated
@@ -1758,6 +1758,18 @@ int LinearScan::BuildHWIntrinsic(GenTreeHWIntrinsic* intrinsicTree, int* pDstCou | |||
break; | |||
} | |||
|
|||
case NI_Sve_LoadVectorx2: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can be combined with above cases:
case NI_Sve_LoadVectorx2:
case NI_Sve_LoadVectorx3:
case NI_Sve_LoadVectorx4:
assert(intrin.op2 != nullptr);
srcCount += BuildOperandUses(intrin.op2);
FALLTHROUGH;
case NI_AdvSimd_LoadVector64x2AndUnzip:
case NI_AdvSimd_LoadVector64x3AndUnzip:
case NI_AdvSimd_LoadVector64x4AndUnzip:
case NI_AdvSimd_Arm64_LoadVector128x2AndUnzip:
case NI_AdvSimd_Arm64_LoadVector128x3AndUnzip:
case NI_AdvSimd_Arm64_LoadVector128x4AndUnzip:
case NI_AdvSimd_LoadVector64x2:
case NI_AdvSimd_LoadVector64x3:
case NI_AdvSimd_LoadVector64x4:
case NI_AdvSimd_Arm64_LoadVector128x2:
case NI_AdvSimd_Arm64_LoadVector128x3:
case NI_AdvSimd_Arm64_LoadVector128x4:
case NI_AdvSimd_LoadAndReplicateToVector64x2:
case NI_AdvSimd_LoadAndReplicateToVector64x3:
case NI_AdvSimd_LoadAndReplicateToVector64x4:
case NI_AdvSimd_Arm64_LoadAndReplicateToVector128x2:
case NI_AdvSimd_Arm64_LoadAndReplicateToVector128x3:
case NI_AdvSimd_Arm64_LoadAndReplicateToVector128x4:
{
assert(intrin.op1 != nullptr);
BuildConsecutiveRegistersForDef(intrinsicTree, dstCount);
*pDstCount = dstCount;
break;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still not resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't do this as a fallthrough because there is another fallthrough above it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, never mind.
src/tests/JIT/HardwareIntrinsics/Arm/Shared/SveLoadVectorx2Test.template
Outdated
Show resolved
Hide resolved
src/tests/Common/GenerateHWIntrinsicTests/GenerateHWIntrinsicTests_Arm.cs
Outdated
Show resolved
Hide resolved
LoadVectorx2
, LoadVectorx3
, LoadVectorx4
APIsLoad2xVectorAndUnzip
, Load3xVectorAndUnzip
, Load4xVectorAndUnzip
APIs
Stress modes are failing badly for the |
Having some failures with Load2xVector and Load3xVector in stress testing; the results are coming back as zeroes for public void RunClassFldScenario()
{
TestLibrary.TestFramework.BeginScenario(nameof(RunClassFldScenario));
//TODO-SVE: Once register allocation exists for predicates, move loadMask into DataTable
{Op1VectorType}<{Op1BaseType}> loadMask = Sve.CreateTrueMask{RetBaseType}(SveMaskPattern.All);
(_fld1, _fld2) = {Isa}.{Method}(loadMask, ({Op1BaseType}*)_dataTable.inArrayPtr);
ValidateResult(_fld1, _fld2, _dataTable.inArrayPtr);
} |
@kunalspathak this is ready. I do get an assertion in the non_faulting cases under stress modes, but it appears to be related to
Where the assertion appears: case IF_SVE_ID_2A: // ..........iiiiii ...iiinnnnn.TTTT -- SVE load predicate register
case IF_SVE_JG_2A: // ..........iiiiii ...iiinnnnn.TTTT -- SVE store predicate register
assert(insOptsNone(id->idInsOpt()));
assert(isScalableVectorSize(id->idOpSize()));
assert(isPredicateRegister(id->idReg1())); // TTTT
assert(isGeneralRegisterOrZR(id->idReg2())); // nnnnn
assert(isValidSimm<9>(emitGetInsSC(id))); // iii <------ ASSERTION FAILS
break; |
src/tests/JIT/HardwareIntrinsics/Arm/Shared/SveLoad2xVectorAndUnzipTest.template
Outdated
Show resolved
Hide resolved
@kunalspathak this is ready again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks!
…torAndUnzip` APIs (dotnet#102180)
Contributes to #99957
Adds:
Load2xVectorAndUnzip
Load3xVectorAndUnzip
Load4xVectorAndUnzip