[Arm64] MultiplyHigh #43106
Labels
api-approved
API was approved in API review, it can be implemented
arch-arm64
area-System.Runtime.Intrinsics
echesakov added the arch-arm64, area-System.Runtime.Intrinsics, and api-ready-for-review labels on Oct 6, 2020
Tagging subscribers to this area: @tannergooding, @jeffhandley
Dotnet-GitSync-Bot added the untriaged label on Oct 6, 2020
jeffschwMSFT removed the untriaged label on Oct 6, 2020
Looks good as proposed:

    namespace System.Runtime.Intrinsics.Arm
    {
        public abstract class ArmBase
        {
            public abstract class Arm64
            {
                public static long MultiplyHigh(long left, long right);
                public static ulong MultiplyHigh(ulong left, ulong right);
            }
        }
    }
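To make the approved shape concrete, here is a minimal usage sketch. It assumes a runtime where `ArmBase.Arm64.MultiplyHigh` is available; the helper names `HighSigned` and `HighUnsigned` and the `BigInteger` fallback are illustrative assumptions, not part of the proposal.

```csharp
using System;
using System.Numerics;
using System.Runtime.Intrinsics.Arm;

// Hypothetical helper: upper 64 bits of the signed 128-bit product.
// Uses the intrinsic only where the hardware reports support for it.
static long HighSigned(long left, long right) =>
    ArmBase.Arm64.IsSupported
        ? ArmBase.Arm64.MultiplyHigh(left, right)
        : (long)(((BigInteger)left * right) >> 64);   // reference fallback

// Hypothetical helper: upper 64 bits of the unsigned 128-bit product.
static ulong HighUnsigned(ulong left, ulong right) =>
    ArmBase.Arm64.IsSupported
        ? ArmBase.Arm64.MultiplyHigh(left, right)
        : (ulong)(((BigInteger)left * right) >> 64);  // reference fallback

// The same bit pattern (all ones) gives different high halves depending
// on whether the operands are treated as signed or unsigned.
Console.WriteLine(HighSigned(-1L, 1L));               // -1
Console.WriteLine(HighUnsigned(ulong.MaxValue, 1UL)); // 0
```

The two overloads exist because the high half of the product depends on signedness, while the low half (a plain multiply) does not.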
terrajobst added the api-approved label and removed the api-ready-for-review label on Oct 20, 2020
ghost added the in-pr label on Jan 23, 2021
monojenkins pushed a commit to monojenkins/mono that referenced this issue on Jan 27, 2021
Closes dotnet/runtime#43106

In addition to implementing the intrinsics I have updated `System.Math:BigMul(long,long,byref):long` implementation in System.Private.CoreLib. The following is the codegen of the methods:

```asm
; Assembly listing for method System.Math:BigMul(long,long,byref):long
; Emitting BLENDED_CODE for generic ARM64 CPU - Windows
; ReadyToRun compilation
; optimized code
; fp based frame
; partially interruptible
; Final local variable assignments
;
;  V00 arg0         [V00,T00] (  4,  4   )    long  ->   x0
;  V01 arg1         [V01,T01] (  4,  4   )    long  ->   x1
;  V02 arg2         [V02,T02] (  3,  3   )   byref  ->   x2
;# V03 OutArgs      [V03    ] (  1,  1   )  lclBlk ( 0) [sp+0x00]   "OutgoingArgSpace"
;
; Lcl frame size = 0

G_M18264_IG01:              ;; offset=0000H
        A9BF7BFD          stp     fp, lr, [sp,#-16]!
        910003FD          mov     fp, sp
                                                ;; bbWeight=1    PerfScore 1.50
G_M18264_IG02:              ;; offset=0008H
        9B017C03          mul     x3, x0, x1
        F9000043          str     x3, [x2]
        9BC17C00          umulh   x0, x0, x1
                                                ;; bbWeight=1    PerfScore 8.00
G_M18264_IG03:              ;; offset=0014H
        A8C17BFD          ldp     fp, lr, [sp],#16
        D65F03C0          ret     lr
                                                ;; bbWeight=1    PerfScore 2.00

; Total bytes of code 28, prolog size 8, PerfScore 14.30, instruction count 7, allocated bytes for code 28 (MethodHash=96edb8a7) for method System.Math:BigMul(long,long,byref):long
; ============================================================

; Assembly listing for method System.Math:BigMul(long,long,byref):long
; Emitting BLENDED_CODE for generic ARM64 CPU - Windows
; ReadyToRun compilation
; optimized code
; fp based frame
; partially interruptible
; Final local variable assignments
;
;  V00 arg0         [V00,T00] (  4,  4   )    long  ->   x0
;  V01 arg1         [V01,T01] (  4,  4   )    long  ->   x1
;  V02 arg2         [V02,T02] (  3,  3   )   byref  ->   x2
;* V03 loc0         [V03    ] (  0,  0   )    long  ->  zero-ref
;* V04 loc1         [V04    ] (  0,  0   )    long  ->  zero-ref    ld-addr-op
;# V05 OutArgs      [V05    ] (  1,  1   )  lclBlk ( 0) [sp+0x00]   "OutgoingArgSpace"
;
; Lcl frame size = 0

G_M18264_IG01:              ;; offset=0000H
        A9BF7BFD          stp     fp, lr, [sp,#-16]!
        910003FD          mov     fp, sp
                                                ;; bbWeight=1    PerfScore 1.50
G_M18264_IG02:              ;; offset=0008H
        9B017C03          mul     x3, x0, x1
        F9000043          str     x3, [x2]
        9B417C00          smulh   x0, x0, x1
                                                ;; bbWeight=1    PerfScore 8.00
G_M18264_IG03:              ;; offset=0014H
        A8C17BFD          ldp     fp, lr, [sp],#16
        D65F03C0          ret     lr
                                                ;; bbWeight=1    PerfScore 2.00

; Total bytes of code 28, prolog size 8, PerfScore 14.30, instruction count 7, allocated bytes for code 28 (MethodHash=96edb8a7) for method System.Math:BigMul(long,long,byref):long
; ============================================================
```
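The `mul` followed by `umulh` or `smulh` in the listings above corresponds to computing the low and high halves of the 128-bit product separately. The following is a minimal C# sketch of that shape, not the actual System.Private.CoreLib source; `BigMulUnsignedSketch` and `BigMulSignedSketch` are hypothetical names, and the non-Arm64 path simply defers to the existing `Math.BigMul` overloads.

```csharp
using System;
using System.Runtime.Intrinsics.Arm;

// Hypothetical stand-in for the unsigned BigMul overload.
static ulong BigMulUnsignedSketch(ulong a, ulong b, out ulong low)
{
    if (ArmBase.Arm64.IsSupported)
    {
        low = unchecked(a * b);                   // mul: low 64 bits, wraps
        return ArmBase.Arm64.MultiplyHigh(a, b);  // umulh: high 64 bits
    }
    return Math.BigMul(a, b, out low);            // existing library path
}

// Hypothetical stand-in for the signed BigMul overload.
static long BigMulSignedSketch(long a, long b, out long low)
{
    if (ArmBase.Arm64.IsSupported)
    {
        low = unchecked(a * b);                   // mul: low 64 bits, wraps
        return ArmBase.Arm64.MultiplyHigh(a, b);  // smulh: high 64 bits
    }
    return Math.BigMul(a, b, out low);            // existing library path
}

ulong hi = BigMulUnsignedSketch(ulong.MaxValue, ulong.MaxValue, out ulong lo);
Console.WriteLine($"{hi:X16}{lo:X16}"); // FFFFFFFFFFFFFFFE0000000000000001
```

The low half is identical for signed and unsigned operands, which is why both listings share the same `mul` instruction and differ only in the high-half instruction.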
ghost removed the in-pr label on Jan 28, 2021
imhameed pushed a commit to mono/mono that referenced this issue on Jan 28, 2021
Closes dotnet/runtime#43106

In addition to implementing the intrinsics I have updated `System.Math:BigMul(long,long,byref):long` implementation in System.Private.CoreLib. The following is the codegen of the methods:

```asm
; Assembly listing for method System.Math:BigMul(long,long,byref):long
; Emitting BLENDED_CODE for generic ARM64 CPU - Windows
; ReadyToRun compilation
; optimized code
; fp based frame
; partially interruptible
; Final local variable assignments
;
;  V00 arg0         [V00,T00] (  4,  4   )    long  ->   x0
;  V01 arg1         [V01,T01] (  4,  4   )    long  ->   x1
;  V02 arg2         [V02,T02] (  3,  3   )   byref  ->   x2
;# V03 OutArgs      [V03    ] (  1,  1   )  lclBlk ( 0) [sp+0x00]   "OutgoingArgSpace"
;
; Lcl frame size = 0

G_M18264_IG01:              ;; offset=0000H
        A9BF7BFD          stp     fp, lr, [sp,#-16]!
        910003FD          mov     fp, sp
                                                ;; bbWeight=1    PerfScore 1.50
G_M18264_IG02:              ;; offset=0008H
        9B017C03          mul     x3, x0, x1
        F9000043          str     x3, [x2]
        9BC17C00          umulh   x0, x0, x1
                                                ;; bbWeight=1    PerfScore 8.00
G_M18264_IG03:              ;; offset=0014H
        A8C17BFD          ldp     fp, lr, [sp],#16
        D65F03C0          ret     lr
                                                ;; bbWeight=1    PerfScore 2.00

; Total bytes of code 28, prolog size 8, PerfScore 14.30, instruction count 7, allocated bytes for code 28 (MethodHash=96edb8a7) for method System.Math:BigMul(long,long,byref):long
; ============================================================

; Assembly listing for method System.Math:BigMul(long,long,byref):long
; Emitting BLENDED_CODE for generic ARM64 CPU - Windows
; ReadyToRun compilation
; optimized code
; fp based frame
; partially interruptible
; Final local variable assignments
;
;  V00 arg0         [V00,T00] (  4,  4   )    long  ->   x0
;  V01 arg1         [V01,T01] (  4,  4   )    long  ->   x1
;  V02 arg2         [V02,T02] (  3,  3   )   byref  ->   x2
;* V03 loc0         [V03    ] (  0,  0   )    long  ->  zero-ref
;* V04 loc1         [V04    ] (  0,  0   )    long  ->  zero-ref    ld-addr-op
;# V05 OutArgs      [V05    ] (  1,  1   )  lclBlk ( 0) [sp+0x00]   "OutgoingArgSpace"
;
; Lcl frame size = 0

G_M18264_IG01:              ;; offset=0000H
        A9BF7BFD          stp     fp, lr, [sp,#-16]!
        910003FD          mov     fp, sp
                                                ;; bbWeight=1    PerfScore 1.50
G_M18264_IG02:              ;; offset=0008H
        9B017C03          mul     x3, x0, x1
        F9000043          str     x3, [x2]
        9B417C00          smulh   x0, x0, x1
                                                ;; bbWeight=1    PerfScore 8.00
G_M18264_IG03:              ;; offset=0014H
        A8C17BFD          ldp     fp, lr, [sp],#16
        D65F03C0          ret     lr
                                                ;; bbWeight=1    PerfScore 2.00

; Total bytes of code 28, prolog size 8, PerfScore 14.30, instruction count 7, allocated bytes for code 28 (MethodHash=96edb8a7) for method System.Math:BigMul(long,long,byref):long
; ============================================================
```

Co-authored-by: echesakovMSFT <[email protected]>
ghost locked as resolved and limited conversation to collaborators on Feb 27, 2021
Exposing `smulh`/`umulh` as intrinsics on Arm64 would allow `System.Math.BigMul` to be implemented with them.

cc @CarolEidt @tannergooding @TamarChristinaArm
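For context on what a single `umulh` or `smulh` stands in for, here is a portable sketch of computing the high 64 bits of a 64x64 product from 32-bit partial products. This is a textbook technique shown for illustration; it is an assumption that a software path would look like this, and the names `UMulHSoftware` and `SMulHSoftware` are hypothetical.

```csharp
using System;

// High 64 bits of the unsigned product, built from four 32x32 partial products.
static ulong UMulHSoftware(ulong a, ulong b)
{
    unchecked
    {
        ulong aLo = (uint)a, aHi = a >> 32;
        ulong bLo = (uint)b, bHi = b >> 32;

        ulong ll = aLo * bLo;
        ulong lh = aLo * bHi;
        ulong hl = aHi * bLo;
        ulong hh = aHi * bHi;

        // Sum the terms that land in bits 32..63; the sum's upper half
        // is the carry into the high word.
        ulong mid = (ll >> 32) + (uint)lh + (uint)hl;
        return hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
    }
}

// Signed variant: take the unsigned result and correct for negative operands.
static long SMulHSoftware(long a, long b)
{
    unchecked
    {
        ulong high = UMulHSoftware((ulong)a, (ulong)b);
        high -= (ulong)((a >> 63) & b);  // subtract b when a is negative
        high -= (ulong)((b >> 63) & a);  // subtract a when b is negative
        return (long)high;
    }
}

Console.WriteLine(UMulHSoftware(1UL << 32, 1UL << 32)); // 1
Console.WriteLine(SMulHSoftware(-1L, 1L));              // -1
```

On Arm64 each of these helpers collapses to one instruction, which is the motivation for exposing the intrinsics.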