-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CL/aarch64: implement the wasm SIMD i32x4.dot_i16x8_s
instruction
#2327
CL/aarch64: implement the wasm SIMD i32x4.dot_i16x8_s
instruction
#2327
Conversation
Subscribe to Label Actioncc @fitzgen, @peterhuene
This issue or pull request has been labeled: "cranelift", "cranelift:area:aarch64", "cranelift:area:peepmatic", "cranelift:meta", "cranelift:wasm", "fuzzing", "lightbeam", "wasmtime:api"
Thus the following users have been cc'd because of the following labels:
To subscribe or unsubscribe from this label, edit the |
@@ -290,6 +290,10 @@ pub enum VecALUOp { | |||
Umlal, | |||
/// Zip vectors (primary) [meaning, high halves] | |||
Zip1, | |||
/// Signed multiply long (low halves) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a comment (no need to take action) - it's probably time to introduce an Inst
variant for widening binary operations, so that the handling of the high and the low halves can be streamlined, for example, but we can do this separately. We previously took a shortcut here because Umlal
was the only odd man out.
cc @jgouly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that sounds like a nice cleanup for a followup. I'm curious though .. when you say "handling of the high and the low halves can be streamlined", what did you have in mind, roughly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have a look at Inst::VecMiscNarrow
, which organizes things similarly, but for narrowing operations. TBH I don't have an exact code change in mind (as I said, neither Joey, nor I did anything yet because there was no real need), but having a separate boolean parameter to specify the half is probably going to be the way to do it.
I dislike the ISA's approach (via the vector shape), but it has constraints (encoding space) that are not applicable to the Inst
enum. Anyway, the Inst
variants don't necessarily map 1:1 to machine instructions, so we can use that to improve ergonomics (e.g. we support bitwise operations on arbitrary vector shapes, not just 8-bit elements).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, with proper type check/assertion.
f007e8b
to
4883749
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
This patch implements, for aarch64, the following wasm SIMD extensions i32x4.dot_i16x8_s instruction WebAssembly/simd#127 It also updates dependencies as follows, in order that the new instruction can be parsed, decoded, etc: wat to 1.0.27 wast to 26.0.1 wasmparser to 0.65.0 wasmprinter to 0.2.12 The changes are straightforward: * new CLIF instruction `widening_pairwise_dot_product_s` * translation from wasm into `widening_pairwise_dot_product_s` * new AArch64 instructions `smull`, `smull2` (part of the `VecRRR` group) * translation from `widening_pairwise_dot_product_s` to `smull ; smull2 ; addv` There is no testcase in this commit, because that is a separate repo. The implementation has been tested, nevertheless.
4883749
to
c5374ec
Compare
This patch implements, for aarch64, the following wasm SIMD extensions
i32x4.dot_i16x8_s instruction
WebAssembly/simd#127
It also updates dependencies as follows, in order that the new instruction can
be parsed, decoded, etc:
wat to 1.0.27
wast to 26.0.1
wasmparser to 0.65.0
wasmprinter to 0.2.12
The changes are straightforward:
new CLIF instruction
widening_pairwise_dot_product_s
translation from wasm into
widening_pairwise_dot_product_s
new AArch64 instructions
smull
,smull2
(part of theVecRRR
group)translation from
widening_pairwise_dot_product_s
tosmull ; smull2 ; addv
There is no testcase in this commit, because that is a separate repo. The
implementation has been tested, nevertheless.