Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT ARM64-SVE: Add AbsoluteCompare* APIs #102611

Closed
wants to merge 10 commits into from

Conversation

mikabl-arm
Copy link
Contributor

Tests results:

BEGIN EXECUTION
/home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/linux.arm64.Checked/Tests/Core_Root/corerun -p System.Reflection.Metadata.MetadataUpdater.IsSupported=false -p System.Runtime.Serialization.EnableUnsafeBinaryFormatterSerialization=true HardwareIntrinsics_Arm_ro.dll 'Sve_AbsoluteCompare'
17:04:42.307 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareGreaterThan_float()
Supported ISAs:
  AdvSimd:   True
  Aes:       True
  ArmBase:   True
  Crc32:     True
  Dp:        True
  Rdm:       True
  Sha1:      True
  Sha256:    True
  Sve:       True

Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunBasicScenario_Load
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
Beginning scenario: ConditionalSelect_Op1_mask
Beginning scenario: ConditionalSelect_Op1_zero
Beginning scenario: ConditionalSelect_Op1_all
Beginning scenario: ConditionalSelect_Op2_mask
Beginning scenario: ConditionalSelect_Op2_zero
Beginning scenario: ConditionalSelect_Op2_all
Beginning scenario: ConditionalSelect_FalseOp_mask
Beginning scenario: ConditionalSelect_FalseOp_zero
Beginning scenario: ConditionalSelect_FalseOp_all
Beginning scenario: ConditionalSelect_ZeroOp_mask
Beginning scenario: ConditionalSelect_ZeroOp_zero
Beginning scenario: ConditionalSelect_ZeroOp_all
17:04:42.420 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareGreaterThan_float()
17:04:42.434 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareGreaterThan_double()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunBasicScenario_Load
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
Beginning scenario: ConditionalSelect_Op1_mask
Beginning scenario: ConditionalSelect_Op1_zero
Beginning scenario: ConditionalSelect_Op1_all
Beginning scenario: ConditionalSelect_Op2_mask
Beginning scenario: ConditionalSelect_Op2_zero
Beginning scenario: ConditionalSelect_Op2_all
Beginning scenario: ConditionalSelect_FalseOp_mask
Beginning scenario: ConditionalSelect_FalseOp_zero
Beginning scenario: ConditionalSelect_FalseOp_all
Beginning scenario: ConditionalSelect_ZeroOp_mask
Beginning scenario: ConditionalSelect_ZeroOp_zero
Beginning scenario: ConditionalSelect_ZeroOp_all
17:04:42.454 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareGreaterThan_double()
17:04:42.456 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareGreaterThanOrEqual_float()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunBasicScenario_Load
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
Beginning scenario: ConditionalSelect_Op1_mask
Beginning scenario: ConditionalSelect_Op1_zero
Beginning scenario: ConditionalSelect_Op1_all
Beginning scenario: ConditionalSelect_Op2_mask
Beginning scenario: ConditionalSelect_Op2_zero
Beginning scenario: ConditionalSelect_Op2_all
Beginning scenario: ConditionalSelect_FalseOp_mask
Beginning scenario: ConditionalSelect_FalseOp_zero
Beginning scenario: ConditionalSelect_FalseOp_all
Beginning scenario: ConditionalSelect_ZeroOp_mask
Beginning scenario: ConditionalSelect_ZeroOp_zero
Beginning scenario: ConditionalSelect_ZeroOp_all
17:04:42.474 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareGreaterThanOrEqual_float()
17:04:42.475 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareGreaterThanOrEqual_double()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunBasicScenario_Load
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
Beginning scenario: ConditionalSelect_Op1_mask
Beginning scenario: ConditionalSelect_Op1_zero
Beginning scenario: ConditionalSelect_Op1_all
Beginning scenario: ConditionalSelect_Op2_mask
Beginning scenario: ConditionalSelect_Op2_zero
Beginning scenario: ConditionalSelect_Op2_all
Beginning scenario: ConditionalSelect_FalseOp_mask
Beginning scenario: ConditionalSelect_FalseOp_zero
Beginning scenario: ConditionalSelect_FalseOp_all
Beginning scenario: ConditionalSelect_ZeroOp_mask
Beginning scenario: ConditionalSelect_ZeroOp_zero
Beginning scenario: ConditionalSelect_ZeroOp_all
17:04:42.494 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareGreaterThanOrEqual_double()
17:04:42.496 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareLessThan_float()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunBasicScenario_Load
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
Beginning scenario: ConditionalSelect_Op1_mask
Beginning scenario: ConditionalSelect_Op1_zero
Beginning scenario: ConditionalSelect_Op1_all
Beginning scenario: ConditionalSelect_Op2_mask
Beginning scenario: ConditionalSelect_Op2_zero
Beginning scenario: ConditionalSelect_Op2_all
Beginning scenario: ConditionalSelect_FalseOp_mask
Beginning scenario: ConditionalSelect_FalseOp_zero
Beginning scenario: ConditionalSelect_FalseOp_all
Beginning scenario: ConditionalSelect_ZeroOp_mask
Beginning scenario: ConditionalSelect_ZeroOp_zero
Beginning scenario: ConditionalSelect_ZeroOp_all
17:04:42.513 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareLessThan_float()
17:04:42.514 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareLessThan_double()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunBasicScenario_Load
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
Beginning scenario: ConditionalSelect_Op1_mask
Beginning scenario: ConditionalSelect_Op1_zero
Beginning scenario: ConditionalSelect_Op1_all
Beginning scenario: ConditionalSelect_Op2_mask
Beginning scenario: ConditionalSelect_Op2_zero
Beginning scenario: ConditionalSelect_Op2_all
Beginning scenario: ConditionalSelect_FalseOp_mask
Beginning scenario: ConditionalSelect_FalseOp_zero
Beginning scenario: ConditionalSelect_FalseOp_all
Beginning scenario: ConditionalSelect_ZeroOp_mask
Beginning scenario: ConditionalSelect_ZeroOp_zero
Beginning scenario: ConditionalSelect_ZeroOp_all
17:04:42.530 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareLessThan_double()
17:04:42.532 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareLessThanOrEqual_float()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunBasicScenario_Load
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
Beginning scenario: ConditionalSelect_Op1_mask
Beginning scenario: ConditionalSelect_Op1_zero
Beginning scenario: ConditionalSelect_Op1_all
Beginning scenario: ConditionalSelect_Op2_mask
Beginning scenario: ConditionalSelect_Op2_zero
Beginning scenario: ConditionalSelect_Op2_all
Beginning scenario: ConditionalSelect_FalseOp_mask
Beginning scenario: ConditionalSelect_FalseOp_zero
Beginning scenario: ConditionalSelect_FalseOp_all
Beginning scenario: ConditionalSelect_ZeroOp_mask
Beginning scenario: ConditionalSelect_ZeroOp_zero
Beginning scenario: ConditionalSelect_ZeroOp_all
17:04:42.549 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareLessThanOrEqual_float()
17:04:42.550 Running test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareLessThanOrEqual_double()
Beginning scenario: RunBasicScenario_UnsafeRead
Beginning scenario: RunBasicScenario_Load
Beginning scenario: RunReflectionScenario_UnsafeRead
Beginning scenario: RunLclVarScenario_UnsafeRead
Beginning scenario: RunClassFldScenario
Beginning scenario: RunStructLclFldScenario
Beginning scenario: RunStructFldScenario
Beginning scenario: ConditionalSelect_Op1_mask
Beginning scenario: ConditionalSelect_Op1_zero
Beginning scenario: ConditionalSelect_Op1_all
Beginning scenario: ConditionalSelect_Op2_mask
Beginning scenario: ConditionalSelect_Op2_zero
Beginning scenario: ConditionalSelect_Op2_all
Beginning scenario: ConditionalSelect_FalseOp_mask
Beginning scenario: ConditionalSelect_FalseOp_zero
Beginning scenario: ConditionalSelect_FalseOp_all
Beginning scenario: ConditionalSelect_ZeroOp_mask
Beginning scenario: ConditionalSelect_ZeroOp_zero
Beginning scenario: ConditionalSelect_ZeroOp_all
17:04:42.567 Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareLessThanOrEqual_double()
Expected: 100
Actual: 100
END EXECUTION - PASSED

@mikabl-arm
Copy link
Contributor Author

@kunalspathak @dotnet/arm64-contrib @a74nh

Copy link

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label May 23, 2024
Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics
See info in area-owners.md if you want to be subscribed.

@kunalspathak kunalspathak added the arm-sve Work related to arm64 SVE/SVE2 support label May 23, 2024
@kunalspathak
Copy link
Member

Tests results:

Thanks. What about stress test results?

if (intrin.op3->IsVectorZero())
if (!instrIsRMW)
{
assert(intrin.op3->IsVectorZero());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

intrin.op3 should be nullptr because there are just 2 operands involved.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this context intrin is Sve_ConditionalSelect, not Sve_AbsoluteCompare*.

@@ -3004,6 +3004,16 @@
("SveSimpleVecOpTest.template", new Dictionary<string, string> { ["TestName"] = "Sve_BooleanNot_uint", ["Isa"] = "Sve", ["LoadIsa"] = "Sve", ["Method"] = "BooleanNot", ["RetVectorType"] = "Vector", ["RetBaseType"] = "UInt32", ["Op1VectorType"] = "Vector", ["Op1BaseType"] = "UInt32", ["LargestVectorSize"] = "64", ["NextValueOp1"] = "TestLibrary.Generator.GetUInt32()", ["ValidateIterResult"] = "Helpers.BooleanNot(firstOp[i]) != result[i]", ["GetIterResult"] = "Helpers.BooleanNot(leftOp[i])"}),
("SveSimpleVecOpTest.template", new Dictionary<string, string> { ["TestName"] = "Sve_BooleanNot_ulong", ["Isa"] = "Sve", ["LoadIsa"] = "Sve", ["Method"] = "BooleanNot", ["RetVectorType"] = "Vector", ["RetBaseType"] = "UInt64", ["Op1VectorType"] = "Vector", ["Op1BaseType"] = "UInt64", ["LargestVectorSize"] = "64", ["NextValueOp1"] = "TestLibrary.Generator.GetUInt64()", ["ValidateIterResult"] = "Helpers.BooleanNot(firstOp[i]) != result[i]", ["GetIterResult"] = "Helpers.BooleanNot(leftOp[i])"}),

// Sve mask
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete the comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Sve mask

@mikabl-arm
Copy link
Contributor Author

Thanks. What about stress test results?

Thanks, missed those. There are several errors actually. I'll take a look.

dotnet/runtime$ ./stress_tester.py ./artifacts/tests/coreclr/linux.arm64.Checked/Tests/Core_Root/corerun -p System.Reflection.Metadata.MetadataUpdater.IsSupported=false -p System.Runtime.Serialization.EnableUnsafeBinaryFormatterSerialization=true ./artifacts/tests/coreclr/linux.arm64.Checked/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/HardwareIntrinsics_Arm_ro.dll AbsoluteCompare

Starting test: /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/linux.arm64.Checked/Tests/Core_Root/corerun -p System.Reflection.Metadata.MetadataUpdater.IsSupported=false -p System.Runtime.Serialization.EnableUnsafeBinaryFormatterSerialization=true ./artifacts/tests/coreclr/linux.arm64.Checked/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/HardwareIntrinsics_Arm_ro.dll AbsoluteCompare
===================Running default===================
------------------- {} -------------------
Passed test: _AdvSimd_Arm64_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Arm64.Program.AbsoluteCompareGreaterThan_Vector128_Double() : 7
Passed test: _AdvSimd_Arm64_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Arm64.Program.AbsoluteCompareGreaterThanScalar_Vector64_Double() : 7
Passed test: _AdvSimd_Arm64_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Arm64.Program.AbsoluteCompareGreaterThanScalar_Vector64_Single() : 7
Passed test: _AdvSimd_Arm64_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Arm64.Program.AbsoluteCompareGreaterThanOrEqual_Vector128_Double() : 7
Passed test: _AdvSimd_Arm64_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Arm64.Program.AbsoluteCompareGreaterThanOrEqualScalar_Vector64_Double() : 7
Passed test: _AdvSimd_Arm64_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Arm64.Program.AbsoluteCompareGreaterThanOrEqualScalar_Vector64_Single() : 7
Passed test: _AdvSimd_Arm64_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Arm64.Program.AbsoluteCompareLessThan_Vector128_Double() : 7
Passed test: _AdvSimd_Arm64_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Arm64.Program.AbsoluteCompareLessThanScalar_Vector64_Double() : 7
Passed test: _AdvSimd_Arm64_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Arm64.Program.AbsoluteCompareLessThanScalar_Vector64_Single() : 7
Passed test: _AdvSimd_Arm64_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Arm64.Program.AbsoluteCompareLessThanOrEqual_Vector128_Double() : 7
Passed test: _AdvSimd_Arm64_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Arm64.Program.AbsoluteCompareLessThanOrEqualScalar_Vector64_Double() : 7
Passed test: _AdvSimd_Arm64_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Arm64.Program.AbsoluteCompareLessThanOrEqualScalar_Vector64_Single() : 7
Passed test: _AdvSimd_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AbsoluteCompareGreaterThan_Vector64_Single() : 7
Passed test: _AdvSimd_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AbsoluteCompareGreaterThan_Vector128_Single() : 7
Passed test: _AdvSimd_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AbsoluteCompareGreaterThanOrEqual_Vector64_Single() : 7
Passed test: _AdvSimd_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AbsoluteCompareGreaterThanOrEqual_Vector128_Single() : 7
Passed test: _AdvSimd_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AbsoluteCompareLessThan_Vector64_Single() : 7
Passed test: _AdvSimd_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AbsoluteCompareLessThan_Vector128_Single() : 7
Passed test: _AdvSimd_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AbsoluteCompareLessThanOrEqual_Vector64_Single() : 7
Passed test: _AdvSimd_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AbsoluteCompareLessThanOrEqual_Vector128_Single() : 7
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareGreaterThan_float() : 19
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareGreaterThan_double() : 19
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareGreaterThanOrEqual_float() : 19
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareGreaterThanOrEqual_double() : 19
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareLessThan_float() : 19
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareLessThan_double() : 19
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareLessThanOrEqual_float() : 19
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareLessThanOrEqual_double() : 19
===================Running jitstress===================
------------------- {'JitMinOpts': '1'} -------------------
------------------- {'JitStress': '1'} -------------------
------------------- {'JitStress': '2'} -------------------
------------------- {'JitStress': '1', 'TieredCompilation': '1'} -------------------
Errors:

Assert failure(PID 432647 [0x00069a07], Thread: 432647 [0x69a07]): Assertion failed 'isPredicateRegister(reg1)' in 'JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_AbsoluteCompareGreaterThan_float:RunLclVarScenario_UnsafeRead():this' during 'Generate code' (IL size 94; hash 0x9fa735e9; Tier0)

    File: /home/mikabl01/dotnet/runtime/src/coreclr/jit/emitarm64sve.cpp:5993
    Image: /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/linux.arm64.Checked/Tests/Core_Root/corerun


------------------- {'JitStress': '2', 'TieredCompilation': '1'} -------------------
Errors:

Assert failure(PID 432658 [0x00069a12], Thread: 432658 [0x69a12]): Assertion failed 'isPredicateRegister(reg1)' in 'JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_AbsoluteCompareGreaterThan_float:RunLclVarScenario_UnsafeRead():this' during 'Generate code' (IL size 94; hash 0x9fa735e9; Tier0)

    File: /home/mikabl01/dotnet/runtime/src/coreclr/jit/emitarm64sve.cpp:5993
    Image: /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/linux.arm64.Checked/Tests/Core_Root/corerun


------------------- {'TailcallStress': '1'} -------------------
------------------- {'ReadyToRun': '0'} -------------------
===================Running jitstressregs===================
------------------- {'JitStressRegs': '1'} -------------------
------------------- {'JitStressRegs': '2'} -------------------
------------------- {'JitStressRegs': '3'} -------------------
Test failed:
..........................................
..........................................
..........................................
System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at System.String.Concat(String str0, String str1)
   at JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_AbsoluteCompareGreaterThan_float.ConditionalSelect_Op1() in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.AbsoluteCompareGreaterThan.float.cs:line 286
   at JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareGreaterThan_float() in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.AbsoluteCompareGreaterThan.float.cs:line 55
   at Program.<<Main>$>g__TestExecutor2781|0_2782(StreamWriter tempLogSw, StreamWriter statsCsvSw, <>c__DisplayClass0_0&) in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/generated/XUnitWrapperGenerator/XUnitWrapperGenerator.XUnitWrapperGenerator/FullRunner.g.cs:line 69965
System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at System.String.Concat(String str0, String str1)
   at JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_AbsoluteCompareGreaterThan_double.ConditionalSelect_Op1() in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.AbsoluteCompareGreaterThan.double.cs:line 286
   at JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareGreaterThan_double() in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.AbsoluteCompareGreaterThan.double.cs:line 55
   at Program.<<Main>$>g__TestExecutor2782|0_2783(StreamWriter tempLogSw, StreamWriter statsCsvSw, <>c__DisplayClass0_0&) in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/generated/XUnitWrapperGenerator/XUnitWrapperGenerator.XUnitWrapperGenerator/FullRunner.g.cs:line 69989
System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at System.String.Concat(String str0, String str1)
   at JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_AbsoluteCompareGreaterThanOrEqual_float.ConditionalSelect_Op1() in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.AbsoluteCompareGreaterThanOrEqual.float.cs:line 286
   at JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareGreaterThanOrEqual_float() in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.AbsoluteCompareGreaterThanOrEqual.float.cs:line 55
   at Program.<<Main>$>g__TestExecutor2783|0_2784(StreamWriter tempLogSw, StreamWriter statsCsvSw, <>c__DisplayClass0_0&) in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/generated/XUnitWrapperGenerator/XUnitWrapperGenerator.XUnitWrapperGenerator/FullRunner.g.cs:line 70013
System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at System.String.Concat(String str0, String str1)
   at JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_AbsoluteCompareGreaterThanOrEqual_double.ConditionalSelect_Op1() in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.AbsoluteCompareGreaterThanOrEqual.double.cs:line 286
   at JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareGreaterThanOrEqual_double() in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.AbsoluteCompareGreaterThanOrEqual.double.cs:line 55
   at Program.<<Main>$>g__TestExecutor2784|0_2785(StreamWriter tempLogSw, StreamWriter statsCsvSw, <>c__DisplayClass0_0&) in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/generated/XUnitWrapperGenerator/XUnitWrapperGenerator.XUnitWrapperGenerator/FullRunner.g.cs:line 70037
System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at System.String.Concat(String str0, String str1)
   at JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_AbsoluteCompareLessThan_float.ConditionalSelect_Op1() in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.AbsoluteCompareLessThan.float.cs:line 286
   at JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareLessThan_float() in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.AbsoluteCompareLessThan.float.cs:line 55
   at Program.<<Main>$>g__TestExecutor2785|0_2786(StreamWriter tempLogSw, StreamWriter statsCsvSw, <>c__DisplayClass0_0&) in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/generated/XUnitWrapperGenerator/XUnitWrapperGenerator.XUnitWrapperGenerator/FullRunner.g.cs:line 70061
System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at System.String.Concat(String str0, String str1)
   at JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_AbsoluteCompareLessThan_double.ConditionalSelect_Op1() in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.AbsoluteCompareLessThan.double.cs:line 286
   at JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareLessThan_double() in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.AbsoluteCompareLessThan.double.cs:line 55
   at Program.<<Main>$>g__TestExecutor2786|0_2787(StreamWriter tempLogSw, StreamWriter statsCsvSw, <>c__DisplayClass0_0&) in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/generated/XUnitWrapperGenerator/XUnitWrapperGenerator.XUnitWrapperGenerator/FullRunner.g.cs:line 70085
System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at System.String.Concat(String str0, String str1)
   at JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_AbsoluteCompareLessThanOrEqual_float.ConditionalSelect_Op1() in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.AbsoluteCompareLessThanOrEqual.float.cs:line 286
   at JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareLessThanOrEqual_float() in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.AbsoluteCompareLessThanOrEqual.float.cs:line 55
   at Program.<<Main>$>g__TestExecutor2787|0_2788(StreamWriter tempLogSw, StreamWriter statsCsvSw, <>c__DisplayClass0_0&) in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/generated/XUnitWrapperGenerator/XUnitWrapperGenerator.XUnitWrapperGenerator/FullRunner.g.cs:line 70109
System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at System.String.Concat(String str0, String str1)
   at JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_AbsoluteCompareLessThanOrEqual_double.ConditionalSelect_Op1() in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.AbsoluteCompareLessThanOrEqual.double.cs:line 286
   at JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareLessThanOrEqual_double() in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.AbsoluteCompareLessThanOrEqual.double.cs:line 55
   at Program.<<Main>$>g__TestExecutor2788|0_2789(StreamWriter tempLogSw, StreamWriter statsCsvSw, <>c__DisplayClass0_0&) in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/generated/XUnitWrapperGenerator/XUnitWrapperGenerator.XUnitWrapperGenerator/FullRunner.g.cs:line 70133
------------------- {'JitStressRegs': '4'} -------------------
------------------- {'JitStressRegs': '8'} -------------------
------------------- {'JitStressRegs': '0x10'} -------------------
------------------- {'JitStressRegs': '0x80'} -------------------
------------------- {'JitStressRegs': '0x1000'} -------------------
------------------- {'JitStressRegs': '0x2000'} -------------------
===================Running jitstress2-jitstressregs===================
------------------- {'JitStress': '2', 'JitStressRegs': '1'} -------------------
Test failed:
..........................................
..........................................
..........................................
..........................................
Sve.AbsoluteCompareGreaterThanOrEqual<Single>(Vector<Single>, Vector<Single>): ConditionalSelectScenario failed:
    mask: (0.8315919, 0.88797516, 0.23845755, 0.19791472)
    left: (0.10137935, 0.9799378, 0.935655, 0.30617934)
   right: (6.122E-42, 0.30880353, 0.45343152, 0.75392306)
 falseOp: (0.45845357, 0.30880353, 0.45343152, 0.75392306)
  result: (0, 1E-45, 1E-45, 0)
..........................................
System.Exception: One or more scenarios did not complete as expected.
  
   at JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareGreaterThanOrEqual_float() in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.AbsoluteCompareGreaterThanOrEqual.float.cs:line 74
   at Program.<<Main>$>g__TestExecutor2783|0_2784(StreamWriter tempLogSw, StreamWriter statsCsvSw, <>c__DisplayClass0_0&) in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/generated/XUnitWrapperGenerator/XUnitWrapperGenerator.XUnitWrapperGenerator/FullRunner.g.cs:line 70013
------------------- {'JitStress': '2', 'JitStressRegs': '2'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '3'} -------------------
Test failed:
..........................................
..........................................
..........................................
System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at System.String.Concat(String str0, String str1)
   at JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_AbsoluteCompareGreaterThan_double.ConditionalSelect_FalseOp() in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.AbsoluteCompareGreaterThan.double.cs:line 310
   at JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareGreaterThan_double() in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/Arm/Sve/Sve_ro/Sve_ro/gen/Sve.AbsoluteCompareGreaterThan.double.cs:line 61
   at Program.<<Main>$>g__TestExecutor2782|0_2783(StreamWriter tempLogSw, StreamWriter statsCsvSw, <>c__DisplayClass0_0&) in /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/obj/linux.arm64.Checked/Managed/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/generated/XUnitWrapperGenerator/XUnitWrapperGenerator.XUnitWrapperGenerator/FullRunner.g.cs:line 69989
------------------- {'JitStress': '2', 'JitStressRegs': '4'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '8'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x10'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x80'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x1000'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x2000'} -------------------

@mikabl-arm
Copy link
Contributor Author

After 4b5a0f8 the stress tests now pass successfully.

The issue arises when emitting FAC<cc> instructions from CodeGen::genHWIntrinsic(). In this case, the targetReg of the ConditionalSelect node is passed as a destination vector register for the emitted instruction. However, for FAC<cc> instructions, the destination register must be a predicate register. This discrepancy causes an assertion failure in emitIns_R_R_R_R(). Previously, this was not an issue because we didn't have embedded masked operations that return per-element masks (that have both HW_Flag_EmbeddedMaskedOperation and HW_Flag_ReturnsPerElementMask flags set).

To address this issue, the commit 4b5a0f8 conditionally changes the type of the ConditionalSelect node to the predicate type if necessary. However, I'm unsure if this solution is appropriate. @kunalspathak, could you please review this change? I'm concerned that having ConditionalSelect nodes that can be either vector or predicate types might be a bad approach.

We've also discussed two other approaches with @a74nh and @SwapnilGaikwad . At the moment Lowering::LowerHWIntrinsic() transforms <user>(embMaskedOp()) into <user>(ConditionalSelect(trueMask, embMaskedOp(), zero)). Alternatively, we could transform it into either:

  1. ConditionalSelect(trueMask, <user>(embMaskedOp()), zero)). However, I'm unsure if modifying parent nodes in this way at that stage is allowed.
  2. <user>(V2M(CondtitionalSelect(trueMask, M2V(embMaskedOp()), zero))) where V2M and M2V are vector<->predicate conversion nodes. However, I'm concerned that introducing too many conversion nodes might result in suboptimal emitted code.

I haven't tested these approaches yet.

Stress tests' results:

dotnet/runtime$ ./stress_tester.py ./artifacts/tests/coreclr/linux.arm64.Checked/Tests/Core_Root/corerun -p System.Reflection.Metadata.MetadataUpdater.IsSupported=false -p System.Runtime.Serialization.EnableUnsafeBinaryFormatterSerialization=true ./artifacts/tests/coreclr/linux.arm64.Checked/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/HardwareIntrinsics_Arm_ro.dll AbsoluteCompare

Starting test: /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/linux.arm64.Checked/Tests/Core_Root/corerun -p System.Reflection.Metadata.MetadataUpdater.IsSupported=false -p System.Runtime.Serialization.EnableUnsafeBinaryFormatterSerialization=true ./artifacts/tests/coreclr/linux.arm64.Checked/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/HardwareIntrinsics_Arm_ro.dll Sve_AbsoluteCompare
===================Running default===================
------------------- {} -------------------
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareGreaterThan_float() : 19
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareGreaterThan_double() : 19
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareGreaterThanOrEqual_float() : 19
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareGreaterThanOrEqual_double() : 19
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareLessThan_float() : 19
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareLessThan_double() : 19
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareLessThanOrEqual_float() : 19
Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareLessThanOrEqual_double() : 19
===================Running jitstress===================
------------------- {'JitMinOpts': '1'} -------------------
------------------- {'JitStress': '1'} -------------------
------------------- {'JitStress': '2'} -------------------
------------------- {'JitStress': '1', 'TieredCompilation': '1'} -------------------
------------------- {'JitStress': '2', 'TieredCompilation': '1'} -------------------
------------------- {'TailcallStress': '1'} -------------------
------------------- {'ReadyToRun': '0'} -------------------
===================Running jitstressregs===================
------------------- {'JitStressRegs': '1'} -------------------
------------------- {'JitStressRegs': '2'} -------------------
------------------- {'JitStressRegs': '3'} -------------------
------------------- {'JitStressRegs': '4'} -------------------
------------------- {'JitStressRegs': '8'} -------------------
------------------- {'JitStressRegs': '0x10'} -------------------
------------------- {'JitStressRegs': '0x80'} -------------------
------------------- {'JitStressRegs': '0x1000'} -------------------
------------------- {'JitStressRegs': '0x2000'} -------------------
===================Running jitstress2-jitstressregs===================
------------------- {'JitStress': '2', 'JitStressRegs': '1'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '2'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '3'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '4'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '8'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x10'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x80'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x1000'} -------------------
------------------- {'JitStress': '2', 'JitStressRegs': '0x2000'} -------------------

@kunalspathak
Copy link
Member

@kunalspathak, could you please review this change?

This sounds reasonable to me. By doing this, can we add extra test coverage? For e.g. can we land up with https://docsmirror.github.io/A64/2023-06/sel_p_p_pp.html . Or what happens when we use something like this:

ConditionalSelect(AbsoluteCompare(left, right), left, right));

LoadVector(AbsoluteCompare(left, right), address));

@mikabl-arm
Copy link
Contributor Author

@kunalspathak , thank you for the review. I'm happy to add additional test for the APIs.

For e.g. can we land up with https://docsmirror.github.io/A64/2023-06/sel_p_p_pp.html

Are you asking for a tests that checks the emitted instructions for SEL (predicates) ? If so, is there a way to do something similar to LLVM regression tests in DotNet? https://llvm.org/docs/TestingGuide.html#platform-specific-tests
If that's not what your are asking for, please elaborate on this.

Or what happens when we use something like this:
ConditionalSelect(AbsoluteCompare(left, right), left, right));
LoadVector(AbsoluteCompare(left, right), address));

Do you want these two to be added for all APIs that use _SveBinaryOpTestTemplate.template or Sve_AbsoluteCompare* only? If the latter, I'll duplicate and modify _SveBinaryOpTestTemplate.template to accommodate for you request, are you okay with that?

@kunalspathak
Copy link
Member

Are you asking for a tests that checks the emitted instructions for SEL (predicates) ?

I think this falls in the category of #103078, so we can probably skip doing it.

Do you want these two to be added for all APIs that use _SveBinaryOpTestTemplate.template

I think we should do it at least for one of the Compare() because that way we will at least have a test other than using CreateTrue() for creating mask.

@mikabl-arm
Copy link
Contributor Author

Hello @kunalspathak , I've added the requested tests. Please let me know if I interpreted something incorrectly.

Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to do some validation around a bit to make sure that it is handled correctly.

src/coreclr/jit/hwintrinsiccodegenarm64.cpp Outdated Show resolved Hide resolved
@@ -1311,9 +1311,14 @@ GenTree* Lowering::LowerHWIntrinsic(GenTreeHWIntrinsic* node)
GenTree* trueMask = comp->gtNewSimdAllTrueMaskNode(simdBaseJitType, simdSize);
GenTree* trueVal = node;
GenTree* falseVal = comp->gtNewZeroConNode(simdType);
var_types nodeType = simdType;
if (HWIntrinsicInfo::ReturnsPerElementMask(node->GetHWIntrinsicId()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to test this out to make sure this is correct. I will pull your branch once #103288 is merged.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am seeing the test DOTNET_TieredCompilation=0:

Assert failure(PID 18916 [0x000049e4], Thread: 28152 [0x6df8]): Assertion failed 'unreached' in 'JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_AbsoluteCompareGreaterThan_float:ConditionalSelect_MethodMask():this' during 'Generate code' (IL size 109; hash 0x5f5a6ecd; FullOpts)

    File: D:\git\runtime\src\coreclr\jit\emitarm64sve.cpp:4396
    Image: D:\kpathak\Core_Root_absolutecompare\Core_Root\corerun.exe

This is missing a piece where if the cndSelNode is of type TYP_MASK, then the falseValue should also be TYP_MASK. Today, we do not support ZeroConNode of TYP_MASK, so I was thinking of wrapping falseVal with ConvertVectorToMask(), but for some reason that was failing in LIR validation that I didn't get a chance to verify. I will look tomorrow.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@a74nh, @mikabl-arm - I have fixed code to handle falseValue of TYP_MASK. PTAL.

Stress tests are passing: https://gist.github.com/kunalspathak/95ddd4913a6642bcbda7ddff9c33f752

The failure case RunLoadMask is because we are not preserving p0 across call (known problem).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kunalspathak
Copy link
Member

Did you verify the test after latest merge? I see this failure on DOTNET_TieredCompilation=0:


Assert failure(PID 21544 [0x00005428], Thread: 22000 [0x55f0]): Assertion failed 'unreached' in 'JIT.HardwareIntrinsics.Arm._Sve.SimpleBinaryOpTest__Sve_AbsoluteCompareGreaterThan_float:ConditionalSelect_MethodMask():this' during 'Generate code' (IL size 109; hash 0x5f5a6ecd; FullOpts)

    File: D:\git\runtime\src\coreclr\jit\emitarm64sve.cpp:4396
    Image: D:\kpathak\Core_Root_absolutecompare\Core_Root\corerun.exe

…nalSelect in another ConditionalSelect if required
@mikabl-arm
Copy link
Contributor Author

Thank you for catching this, @kunalspathak ! Indeed, looks like I have missed something after adding the two requested new tests. I'll do my best to be more thorough next time 😄

Pushed a fix to the branch: 1a2329e. All stress tests for AbsoluteCompare pass successfully now:

  Starting test: /home/mikabl01/dotnet/runtime/artifacts/tests/coreclr/linux.arm64.Debug/Tests/Core_Root/corerun -p System.Reflection.Metadata.MetadataUpdater.IsSupported=false -p System.Runtime.Serialization.EnableUnsafeBinaryFormatterSerialization=true ./artifacts/tests/coreclr/linux.arm64.Debug/JIT/HardwareIntrinsics/HardwareIntrinsics_Arm_ro/HardwareIntrinsics_Arm_ro.dll AbsoluteCompare                                                                                                                                                                                                                                                 
  ===================Running default===================                                                                                                                                                                                                                                                                       
  ------------------- {} -------------------                                                                                                                                                                                                                                                                                  
  Passed test: _AdvSimd_Arm64_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Arm64.Program.AbsoluteCompareGreaterThan_Vector128_Double() : 7                                                                                                                                                                                         
  Passed test: _AdvSimd_Arm64_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Arm64.Program.AbsoluteCompareGreaterThanScalar_Vector64_Double() : 7                                                                                                                                                                                    
  Passed test: _AdvSimd_Arm64_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Arm64.Program.AbsoluteCompareGreaterThanScalar_Vector64_Single() : 7                                                                                                                                                                                    
  Passed test: _AdvSimd_Arm64_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Arm64.Program.AbsoluteCompareGreaterThanOrEqual_Vector128_Double() : 7                                                                                                                                                                                  
  Passed test: _AdvSimd_Arm64_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Arm64.Program.AbsoluteCompareGreaterThanOrEqualScalar_Vector64_Double() : 7                                                                                                                                                                             
  Passed test: _AdvSimd_Arm64_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Arm64.Program.AbsoluteCompareGreaterThanOrEqualScalar_Vector64_Single() : 7                                                                                                                                                                             
  Passed test: _AdvSimd_Arm64_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Arm64.Program.AbsoluteCompareLessThan_Vector128_Double() : 7                                                                                                                                                                                            
  Passed test: _AdvSimd_Arm64_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Arm64.Program.AbsoluteCompareLessThanScalar_Vector64_Double() : 7                                                                                                                                                                                       
  Passed test: _AdvSimd_Arm64_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Arm64.Program.AbsoluteCompareLessThanScalar_Vector64_Single() : 7                                                                                                                                                                                       
  Passed test: _AdvSimd_Arm64_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Arm64.Program.AbsoluteCompareLessThanOrEqual_Vector128_Double() : 7                                                                                                                                                                                     
  Passed test: _AdvSimd_Arm64_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Arm64.Program.AbsoluteCompareLessThanOrEqualScalar_Vector64_Double() : 7                                                                                                                                                                                
  Passed test: _AdvSimd_Arm64_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Arm64.Program.AbsoluteCompareLessThanOrEqualScalar_Vector64_Single() : 7                                                                                                                                                                                
  Passed test: _AdvSimd_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AbsoluteCompareGreaterThan_Vector64_Single() : 7                                                                                                                                                                                                      
  Passed test: _AdvSimd_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AbsoluteCompareGreaterThan_Vector128_Single() : 7                                                                                                                                                                                                     
  Passed test: _AdvSimd_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AbsoluteCompareGreaterThanOrEqual_Vector64_Single() : 7                                                                                                                                                                                               
  Passed test: _AdvSimd_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AbsoluteCompareGreaterThanOrEqual_Vector128_Single() : 7                                                                                                                                                                                              
  Passed test: _AdvSimd_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AbsoluteCompareLessThan_Vector64_Single() : 7                                                                                                                                                                                                         
  Passed test: _AdvSimd_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AbsoluteCompareLessThan_Vector128_Single() : 7                                                                                                                                                                                                        
  Passed test: _AdvSimd_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AbsoluteCompareLessThanOrEqual_Vector64_Single() : 7                                                                                                                                                                                                  
  Passed test: _AdvSimd_ro::JIT.HardwareIntrinsics.Arm._AdvSimd.Program.AbsoluteCompareLessThanOrEqual_Vector128_Single() : 7                                                                                                                                                                                                 
  Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareGreaterThan_float() : 21                                                                                                                                                                                                                   
  Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareGreaterThan_double() : 21                                                                                                                                                                                                                  
  Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareGreaterThanOrEqual_float() : 21                                                                                                                                                                                                            
  Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareGreaterThanOrEqual_double() : 21                                                                                                                                                                                                           
  Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareLessThan_float() : 21                                                                                                                                                                                                                      
  Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareLessThan_double() : 21                                                                                                                                                                                                                     
  Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareLessThanOrEqual_float() : 21                                                                                                                                                                                                               
  Passed test: _Sve_ro::JIT.HardwareIntrinsics.Arm._Sve.Program.Sve_AbsoluteCompareLessThanOrEqual_double() : 21                                                                                                                                                                                                              
  ===================Running jitstress===================                                                                                                                                                                                                                                                                     
  ------------------- {'JitMinOpts': '1'} -------------------                                                                                                                                                                                                                                                                 
  ------------------- {'JitStress': '1'} -------------------                                                                                                                                                                                                                                                                  
  ------------------- {'JitStress': '2'} -------------------                                                                                                                                                                                                                                                                  
  ------------------- {'JitStress': '1', 'TieredCompilation': '1'} -------------------                                                                                                                                                                                                                                        
  ------------------- {'JitStress': '2', 'TieredCompilation': '1'} -------------------                                                                                                                                                                                                                                        
  ------------------- {'TailcallStress': '1'} -------------------                                                                                                                                                                                                                                                             
  ------------------- {'ReadyToRun': '0'} -------------------                                                                                                                                                                                                                                                                 
  ===================Running jitstressregs===================                                                                                                                                                                                                                                                                 
  ------------------- {'JitStressRegs': '1'} -------------------                                                                                                                                                                                                                                                              
  ------------------- {'JitStressRegs': '2'} -------------------                                                                                                                                                                                                                                                              
  ------------------- {'JitStressRegs': '3'} -------------------                                                                                                                                                                                                                                                              
  ------------------- {'JitStressRegs': '4'} -------------------                                                                                                                                                                                                                                                              
  ------------------- {'JitStressRegs': '8'} -------------------                                                                                                                                                                                                                                                              
  ------------------- {'JitStressRegs': '0x10'} -------------------                                                                                                                                                                                                                                                           
  ------------------- {'JitStressRegs': '0x80'} -------------------                                                                                                                                                                                                                                                           
  ------------------- {'JitStressRegs': '0x1000'} -------------------                                                                                                                                                                                                                                                         
  ------------------- {'JitStressRegs': '0x2000'} -------------------                                                                                                                                                                                                                                                         
  ===================Running jitstress2-jitstressregs===================                                                                                                                                                                                                                                                      
  ------------------- {'JitStress': '2', 'JitStressRegs': '1'} -------------------                                                                                                                                                                                                                                            
  ------------------- {'JitStress': '2', 'JitStressRegs': '2'} -------------------                                                                                                                                                                                                                                            
  ------------------- {'JitStress': '2', 'JitStressRegs': '3'} -------------------                                                                                                                                                                                                                                            
  ------------------- {'JitStress': '2', 'JitStressRegs': '4'} -------------------                                                                                                                                                                                                                                            
  ------------------- {'JitStress': '2', 'JitStressRegs': '8'} -------------------                                                                                                                                                                                                                                            
  ------------------- {'JitStress': '2', 'JitStressRegs': '0x10'} -------------------                                                                                                                                                                                                                                         
  ------------------- {'JitStress': '2', 'JitStressRegs': '0x80'} -------------------                                                                                                                                                                                                                                         
  ------------------- {'JitStress': '2', 'JitStressRegs': '0x1000'} -------------------                                                                                                                                                                                                                                       
  ------------------- {'JitStress': '2', 'JitStressRegs': '0x2000'} -------------------

Comment on lines 2000 to 2004
if (intrin.op2->OperIs(GT_HWINTRINSIC) &&
(intrin.op2->AsHWIntrinsic()->GetHWIntrinsicId() == NI_Sve_ConvertVectorToMask))
{
// Have RBM_ALLMASK candidates only if op2 is VectorToMask
assert(intrin.op2->gtType == TYP_MASK);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason for this? It's not clear why TYP_MASK would be restricted but VectorToMask would not...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this, we end up assigning RBM_ALLMASK candidates for operands of AbsoluateCompare(), but they should be assigned RBM_ALLFLOAT candidates.

Comment on lines 1305 to +1306
// Wrap the intrinsic in ConditionalSelect only if it is not already inside another ConditionalSelect
if (!user->OperIsHWIntrinsic() || (user->AsHWIntrinsic()->GetHWIntrinsicId() != NI_Sve_ConditionalSelect))
// If it is inside ConditionalSelect, then make sure that it is the `mask` operation of it.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I follow this entirely...

This is stating we need to wrap in a CndSel if the user isn't a CndSel or if it is but we're not the mask operand.

The !user->OperIsHWIntrinsic(NI_Sve_ConditionalSelect) seems fine with that regard but the HWIntrinsicInfo::ReturnsPerElementMask(node->GetHWIntrinsicId()) doesn't seem to fit.

Most notably we can have mask = CndSel(mask1, mask2, mask3) or vector CndSel(mask, vector1, vector2), but also I would have expected that intrinsics like Abs(x) which must be emitted with a mask to need CndSel regardless of where it appears and only that we could optimize it away in some cases like vector = CndSel(mask, vector1, CndSel(mask, vector2, vector3)) (where the outer and inner cndsel use identical masks that mean the latter CndSel becomes redundant as the computed operands would never be selected anyways).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I would have expected the latter condition to basically be conditioned differently, basically ensuring that intrinsics that require a CndSel wrapper always, optionally trying to fold in the case the mask usages can be detected as redundant

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to rewrite the comment to avoid confusion I guess:

          // Wrap the intrinsic in ConditionalSelect only if it is not already inside another ConditionalSelect
          // or if it is but it is the `mask` operand.

The goal of this block of code is to wrap HWIntrinsics which satisfy HWIntrinsicInfo::IsEmbeddedMaskedOperation(intrinsicId)) into ConditionalSelect. Usually this only needs to be done when the intrinsic is not already wrapped. But this logic doesn't hold when such intrinsic is used as the mask operand for ConditionalSelect - user->AsHWIntrinsic()->GetHWIntrinsicId() != NI_Sve_ConditionalSelect doesn't hold true any more, but the intrinsic still has to be wrapped ConditionalSelect (all that satisfy HWIntrinsicInfo::IsEmbeddedMaskedOperation(intrinsicId) have to).

@kunalspathak
Copy link
Member

Replaced by #104464

@github-actions github-actions bot locked and limited conversation to collaborators Aug 4, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Runtime.Intrinsics arm-sve Work related to arm64 SVE/SVE2 support community-contribution Indicates that the PR has been added by a community member new-api-needs-documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants