Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CL/aarch64: implement the wasm SIMD i32x4.dot_i16x8_s instruction #2327

Merged

Conversation

julian-seward1
Copy link
Contributor

This patch implements, for aarch64, the following wasm SIMD extensions

i32x4.dot_i16x8_s instruction
WebAssembly/simd#127

It also updates dependencies as follows, in order that the new instruction can
be parsed, decoded, etc:

wat to 1.0.27
wast to 26.0.1
wasmparser to 0.65.0
wasmprinter to 0.2.12

The changes are straightforward:

  • new CLIF instruction widening_pairwise_dot_product_s

  • translation from wasm into widening_pairwise_dot_product_s

  • new AArch64 instructions smull, smull2 (part of the VecRRR group)

  • translation from widening_pairwise_dot_product_s to smull ; smull2 ; addv

There is no testcase in this commit, because that is a separate repo. The
implementation has been tested, nevertheless.

@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:aarch64 Issues related to AArch64 backend. cranelift:area:peepmatic cranelift:meta Everything related to the meta-language. cranelift:wasm fuzzing Issues related to our fuzzing infrastructure lightbeam Issues related to the Lightbeam compiler wasmtime:api Related to the API of the `wasmtime` crate itself labels Oct 27, 2020
@github-actions
Copy link

Subscribe to Label Action

cc @fitzgen, @peterhuene

This issue or pull request has been labeled: "cranelift", "cranelift:area:aarch64", "cranelift:area:peepmatic", "cranelift:meta", "cranelift:wasm", "fuzzing", "lightbeam", "wasmtime:api"

Thus the following users have been cc'd because of the following labels:

  • fitzgen: cranelift:area:peepmatic, fuzzing
  • peterhuene: wasmtime:api

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

@@ -290,6 +290,10 @@ pub enum VecALUOp {
Umlal,
/// Zip vectors (primary) [meaning, high halves]
Zip1,
/// Signed multiply long (low halves)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a comment (no need to take action) - it's probably time to introduce an Inst variant for widening binary operations, so that the handling of the high and the low halves can be streamlined, for example, but we can do this separately. We previously took a shortcut here because Umlal was the only odd man out.

cc @jgouly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that sounds like a nice cleanup for a followup. I'm curious though .. when you say "handling of the high and the low halves can be streamlined", what did you have in mind, roughly?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have a look at Inst::VecMiscNarrow, which organizes things similarly, but for narrowing operations. TBH I don't have an exact code change in mind (as I said, neither Joey, nor I did anything yet because there was no real need), but having a separate boolean parameter to specify the half is probably going to be the way to do it.

I dislike the ISA's approach (via the vector shape), but it has constraints (encoding space) that are not applicable to the Inst enum. Anyway, the Inst variants don't necessarily map 1:1 to machine instructions, so we can use that to improve ergonomics (e.g. we support bitwise operations on arbitrary vector shapes, not just 8-bit elements).

@julian-seward1
Copy link
Contributor Author

cc @yurydelendik

Copy link
Contributor

@yurydelendik yurydelendik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, with proper type check/assertion.

@julian-seward1 julian-seward1 force-pushed the arm64-simd-dotmul branch 2 times, most recently from f007e8b to 4883749 Compare November 3, 2020 11:37
Copy link
Contributor

@akirilov-arm akirilov-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

This patch implements, for aarch64, the following wasm SIMD extensions

  i32x4.dot_i16x8_s instruction
  WebAssembly/simd#127

It also updates dependencies as follows, in order that the new instruction can
be parsed, decoded, etc:

  wat          to  1.0.27
  wast         to  26.0.1
  wasmparser   to  0.65.0
  wasmprinter  to  0.2.12

The changes are straightforward:

* new CLIF instruction `widening_pairwise_dot_product_s`

* translation from wasm into `widening_pairwise_dot_product_s`

* new AArch64 instructions `smull`, `smull2` (part of the `VecRRR` group)

* translation from `widening_pairwise_dot_product_s` to `smull ; smull2 ; addv`

There is no testcase in this commit, because that is a separate repo.  The
implementation has been tested, nevertheless.
@julian-seward1 julian-seward1 merged commit 5a5fb11 into bytecodealliance:main Nov 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cranelift:area:aarch64 Issues related to AArch64 backend. cranelift:meta Everything related to the meta-language. cranelift:wasm cranelift Issues related to the Cranelift code generator fuzzing Issues related to our fuzzing infrastructure lightbeam Issues related to the Lightbeam compiler wasmtime:api Related to the API of the `wasmtime` crate itself
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants