Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Text format revisions #884

Closed
tlively opened this issue Oct 2, 2018 · 23 comments
Closed

Text format revisions #884

tlively opened this issue Oct 2, 2018 · 23 comments

Comments

@tlively
Copy link
Member

tlively commented Oct 2, 2018

At the Oct. 2 CG meeting, there was unanimous consent to consider making changes to the text format to improve the consistency of instruction names and remove special characters. Specifically, interest has been expressed in removing slashes from all type conversion (wrap, trunc, extend, convert, demote, promote, and reinterpret) instructions and unifying the formats of memory and global instructions.

To minimize ecosystem disruption, we will plan to make all changes we agree on at once and update as many tools as possible at roughly the same time. We would like to resolve this issue quickly to avoid holding up the standardization process.

What final changes do we want to make?

@tlively
Copy link
Member Author

tlively commented Oct 2, 2018

I propose replacing slashes with underscores. For example, i32.wrap/i64 would become i32.wrap_i64.

Related discussion for nontrapping float to int conversions here: WebAssembly/nontrapping-float-to-int-conversions#4.

@rossberg
Copy link
Member

rossberg commented Oct 2, 2018

I'm fine with underscore for conversions. If we do that, I would propose switching around the _u/_s suffixes, so that they consistently go last in all mnemonics. That is,

  • i32.trunc_f32_u, i32.trunc_f32_s and so on

This makes it easier to consistently recognise e.g. all i32_xxx_u operating specifically on unsigned i32's.

The other changes I would propose, for consistency with memory and upcoming table instructions (see bulk operations proposal):

  • {get,set}_global -> global.{get,set}
  • {get,set,tee}_local -> local.{get,set,tee}

(For the record, one could also consider call -> func.call, but I won't be proposing that.)

@binji
Copy link
Member

binji commented Oct 2, 2018

OK, so then the full list would be like this, right?

;; table declaration
anyfunc -> funcref

;; core instructions
get_local -> local.get
set_local -> local.set
tee_local -> local.tee
get_global -> global.get
set_global -> global.set
i32.wrap/i64 -> i32.wrap_i64
i32.trunc_s/f32 -> i32.trunc_f32_s
i32.trunc_u/f32 -> i32.trunc_f32_u
i32.trunc_s/f64 -> i32.trunc_f64_s
i32.trunc_u/f64 -> i32.trunc_f64_u
i64.extend_s/i32 -> i64.extend_i32_s
i64.extend_u/i32 -> i64.extend_i32_u
i64.trunc_s/f32 -> i64.trunc_f32_s
i64.trunc_u/f32 -> i64.trunc_f32_u
i64.trunc_s/f64 -> i64.trunc_f64_s
i64.trunc_u/f64 -> i64.trunc_f64_u
f32.convert_s/i32 -> f32.convert_i32_s
f32.convert_u/i32 -> f32.convert_i32_u
f32.convert_s/i64 -> f32.convert_i64_s
f32.convert_u/i64 -> f32.convert_i64_u
f32.demote/f64 -> f32.demote_f64
f64.convert_s/i32 -> f64.convert_i32_s
f64.convert_u/i32 -> f64.convert_i32_u
f64.convert_s/i64 -> f64.convert_i64_s
f64.convert_u/i64 -> f64.convert_i64_u
f64.promote/f32 -> f64.promote_f32
i32.reinterpret/f32 -> i32.reinterpret_f32
i64.reinterpret/f64 -> i64.reinterpret_f64
f32.reinterpret/i32 -> f32.reinterpret_i32
f64.reinterpret/i64 -> f64.reinterpret_i64

;; saturating-float-to-int instructions
i32.trunc_s:sat/f32 -> i32.trunc_sat_f32_s
i32.trunc_u:sat/f32 -> i32.trunc_sat_f32_u
i32.trunc_s:sat/f64 -> i32.trunc_sat_f64_s
i32.trunc_u:sat/f64 -> i32.trunc_sat_f64_u
i64.trunc_s:sat/f32 -> i64.trunc_sat_f32_s
i64.trunc_u:sat/f32 -> i64.trunc_sat_f32_u
i64.trunc_s:sat/f64 -> i64.trunc_sat_f64_s
i64.trunc_u:sat/f64 -> i64.trunc_sat_f64_u

;; simd instructions
f32x4.convert_s/i32x4 -> f32x4.convert_i32x4_s
f32x4.convert_u/i32x4 -> f32x4.convert_i32x4_u
f64x2.convert_s/i64x2 -> f64x2.convert_i64x2_s
f64x2.convert_u/i64x2 -> f64x2.convert_i64x2_u
i32x4.trunc_s/f32x4:sat -> i32x4.trunc_sat_f32x4_s
i32x4.trunc_u/f32x4:sat -> i32x4.trunc_sat_f32x4_u
i64x2.trunc_s/f64x2:sat -> i64x2.trunc_sat_f64x2_s
i64x2.trunc_u/f64x2:sat -> i64x2.trunc_sat_f64x2_u

;; atomic instructions
i32.atomic.rmw8_u.add -> i32.atomic.rmw8.add_u
i32.atomic.rmw16_u.add -> i32.atomic.rmw16.add_u
i64.atomic.rmw8_u.add -> i64.atomic.rmw8.add_u
i64.atomic.rmw16_u.add -> i64.atomic.rmw16.add_u
i64.atomic.rmw32_u.add -> i64.atomic.rmw32.add_u
i32.atomic.rmw8_u.sub -> i32.atomic.rmw8.sub_u
i32.atomic.rmw16_u.sub -> i32.atomic.rmw16.sub_u
i64.atomic.rmw8_u.sub -> i64.atomic.rmw8.sub_u
i64.atomic.rmw16_u.sub -> i64.atomic.rmw16.sub_u
i64.atomic.rmw32_u.sub -> i64.atomic.rmw32.sub_u
i32.atomic.rmw8_u.and -> i32.atomic.rmw8.and_u
i32.atomic.rmw16_u.and -> i32.atomic.rmw16.and_u
i64.atomic.rmw8_u.and -> i64.atomic.rmw8.and_u
i64.atomic.rmw16_u.and -> i64.atomic.rmw16.and_u
i64.atomic.rmw32_u.and -> i64.atomic.rmw32.and_u
i32.atomic.rmw8_u.or -> i32.atomic.rmw8.or_u
i32.atomic.rmw16_u.or -> i32.atomic.rmw16.or_u
i64.atomic.rmw8_u.or -> i64.atomic.rmw8.or_u
i64.atomic.rmw16_u.or -> i64.atomic.rmw16.or_u
i64.atomic.rmw32_u.or -> i64.atomic.rmw32.or_u
i32.atomic.rmw8_u.xor -> i32.atomic.rmw8.xor_u
i32.atomic.rmw16_u.xor -> i32.atomic.rmw16.xor_u
i64.atomic.rmw8_u.xor -> i64.atomic.rmw8.xor_u
i64.atomic.rmw16_u.xor -> i64.atomic.rmw16.xor_u
i64.atomic.rmw32_u.xor -> i64.atomic.rmw32.xor_u
i32.atomic.rmw8_u.xchg -> i32.atomic.rmw8.xchg_u
i32.atomic.rmw16_u.xchg -> i32.atomic.rmw16.xchg_u
i64.atomic.rmw8_u.xchg -> i64.atomic.rmw8.xchg_u
i64.atomic.rmw16_u.xchg -> i64.atomic.rmw16.xchg_u
i64.atomic.rmw32_u.xchg -> i64.atomic.rmw32.xchg_u
i32.atomic.rmw8_u.cmpxchg -> i32.atomic.rmw8.cmpxchg_u
i32.atomic.rmw16_u.cmpxchg -> i32.atomic.rmw16.cmpxchg_u
i64.atomic.rmw8_u.cmpxchg -> i64.atomic.rmw8.cmpxchg_u
i64.atomic.rmw16_u.cmpxchg -> i64.atomic.rmw16.cmpxchg_u
i64.atomic.rmw32_u.cmpxchg -> i64.atomic.rmw32.cmpxchg_u

Note: Edited to reflect @rossberg's comment below.

@rossberg
Copy link
Member

rossberg commented Oct 2, 2018

It would be more consistent to have the _u/_s go after the _sat, for the same reasons mentioned above.

In fact, I suggest establishing the following consistent naming scheme for numeric instruction mnemonics:

  • t.xxx for all numeric instructions agnostic to signedness
  • t.xxx_u/t.xxx_s for all numeric instructions sensitive to signedness

where xxx describes the specific operation (e.g., add). (That already holds for everything but conversions.)

Conversions are then instantiations of this scheme with xxx = yyy_t, yielding:

  • t.yyy_t for conversions agnostic to signedness
  • t.yyy_t_u/t.yyy_t_s for conversions sensitive to sign

where yyy describes the kind of conversion (e.g., trunc or trunc_sat).

@binji
Copy link
Member

binji commented Oct 3, 2018

OK, I've updated the comment above to match this.

@lars-t-hansen
Copy link

Since we're just chatting anyway: It irks me that f64.lt (say) does not fit the general pattern; it should really be i32.lt_f64.

@lars-t-hansen
Copy link

Oh, and atomics:

i32.atomic.load8_u fits the proposed pattern.
i32.atomic.rmw8_u.add more or less does not.

@rossberg
Copy link
Member

rossberg commented Oct 3, 2018

@lars-t-hansen, I suppose you cannot interpret the type before the dot strictly as the result type of an operator. It's the type that the operator "is defined at" in some looser sense.

@lars-t-hansen
Copy link

I'm not sure why I wouldn't want to interpret the type strictly as the result type of the operator, though, since that pattern is used very broadly, just not in some cases where it's suddenly about the arguments and the result type is implied (in all cases I can think of the implied result type is i32). Of course there's a succinctness to the instructions that fall under that exception that would be lost if we applied the general pattern, but I don't know why that is an argument either.

(I'm not just trolling, I actually think the current syntax for comparisons is confusing and counterintuitive. Even lt_f64 -- without any type-designating prefix -- would have been better than the current f64.lt.)

@binji
Copy link
Member

binji commented Oct 3, 2018

I dunno, i32.lt_f64 looks super weird to me. If we think of the type before the . as a result type, many instructions don't make sense, such as i32.store or even memory.size. Personally I don't think it's that valuable to encode the types in the instruction name anyway, we just need a way to differentiate them.

@tlively
Copy link
Member Author

tlively commented Oct 4, 2018

I think of the type before the dot as signifying what type the generic instruction is specialized on. Since all lts return i32, the type before the dot tells me what type the operands are instead, since those still need to be disambiguated.

@alexcrichton
Copy link
Contributor

FWIW some possible breakage I think that may want to be considered when doing I believe is:

  • Libraries which want to be idiomatic with the current wasm spec will need breaking changes to update their type names to reflect these name changes. For example a parser library in Rust, parity-wasm, would need a breaking change to rename Instruction::GetLocal to Instruction::LocalGet.
  • We've got tests in Rust which verify that the correct instruction is generated for a particular function (namely for intrinsics like SIMD and such). These instruction names come from disassembly the wasm binary itself with wasm2wat and the instructions to assert for are embedded in the source code. If wasm2wat changes its output, these tests will start to break.
  • A final one is one we've talked about in the Rust project historically is the "blog post" problem. This is where (for example) there's blog posts, tutorials, reference materials, etc on the internet today which may reference wasm instructions by the old names. If the standard changes a number of instructions (especially common ones like get_local) then all this documentation becomes out of date and needs and update.

To be clear I'm not opposed to changing the instructions to be more consistent, but I wanted to be sure to point out some perhaps more subtle locations that "breakage" can arise in the non-traditional sense. They're all pretty minor and we're certainly more than willing to do the updates on the Rust side of things, but points that may want to be considered!

Also @binji your comment with an exhaustive list of renames is super helpful!

@binji
Copy link
Member

binji commented Oct 4, 2018

Right, we've already experienced some of the same issues when changing grow_memory to memory.grow and current_memory -> memory.size, and these instructions are much more common.

@lars-t-hansen is right about atomics too, these are odd now:

i32.atomic.rmw.add 
i64.atomic.rmw.add 
i32.atomic.rmw8_u.add
i32.atomic.rmw16_u.add
i64.atomic.rmw8_u.add
i64.atomic.rmw16_u.add
i64.atomic.rmw32_u.add
...

@rossberg
Copy link
Member

rossberg commented Oct 9, 2018

@binji, good point. I propose moving the _u/_s to the end for those as well (which arguably fits existing instructions).

@binji
Copy link
Member

binji commented Oct 9, 2018

OK, updated my comment above to include these as well.

@rossberg
Copy link
Member

Oh, that reminds me of one more thing that I'd like to suggest. In the light of the reference type and GC proposals it would be more consistent if we renamed anyfunc to funcref, so that it lines up with the likes of anyref, eqref, optref, and possibly other subtypes of ref.

@binji
Copy link
Member

binji commented Oct 16, 2018

OK, added that to the list.

I think as a good first step, we should file bugs in the various trackers where the name change is relevant and link to this issue.

@aardappel
Copy link

Apologies for the bike-shedding, but I don't see a whole lot of consistency in the use of . and _.

So far the most frequent use of . has been to separate the type operated upon (input types, not return type) before the . from the rest of the instruction. Which to me means operations like get_local which are not type specific shouldn't have a . in them. And I'm not sure why there are multiple . in atomic ops but not in others.

I'd suggest that either . has a very intuitive meaning like above, or maybe we should do away with . as well, and make _ the only separator.

In fact, while I am ranting, why does i32.add have a type and get_local does not? Both participate in completely static dataflow determined by the stack and locals, and both will have completely static types that are verified when bytecode is loaded. If get_local loads the wrong type, that is an error, but that is not obvious from the instruction. If i32.add receives a wrong operand that is also an error, but here it is more obvious. If instead the instruction was add it wouldn't really change anything, beyond knowing which of the 2 operands is wrong.

@tlively
Copy link
Member Author

tlively commented Dec 15, 2018

@binji it looks like we have generally settled on the list above. Is there any action we need to take to officially finalize it?

@binji
Copy link
Member

binji commented Dec 15, 2018

I don't think so, we agreed to the change in the CG meeting, and the change to the spec has landed. I think at this point we should update the tools to use the new names.

tlively added a commit to tlively/binaryen that referenced this issue Jan 7, 2019
Automated renaming according to
WebAssembly/spec#884 (comment).
tlively added a commit to tlively/binaryen that referenced this issue Jan 7, 2019
Automated renaming according to
WebAssembly/spec#884 (comment).
tlively added a commit to WebAssembly/binaryen that referenced this issue Jan 7, 2019
llvm-git-migration pushed a commit to llvm-git-prototype/llvm that referenced this issue Jan 8, 2019
Summary:
An automated renaming of all the instructions listed at
WebAssembly/spec#884 (comment)
as well as some similarly-named identifiers.

Reviewers: aheejin, dschuff, aardappel

Subscribers: sbc100, jgravelle-google, eraman, sunfish, jfb, llvm-commits

Differential Revision: https://reviews.llvm.org/D56338

llvm-svn: 350609
aheejin added a commit to aheejin/threads that referenced this issue Feb 18, 2019
This renames instructions according to
WebAssembly/spec#884 (comment).
binji pushed a commit to WebAssembly/threads that referenced this issue Feb 18, 2019
aheejin added a commit to aheejin/nontrapping-float-to-int-conversions that referenced this issue Mar 30, 2019
Renames instructions as discussed in
WebAssembly/spec#884 (comment).
aheejin added a commit to aheejin/nontrapping-float-to-int-conversions that referenced this issue Mar 30, 2019
Renames instructions as discussed in
WebAssembly/spec#884 (comment).
Closes WebAssembly#4 and fixes WebAssembly#6.
sunfishcode pushed a commit to WebAssembly/nontrapping-float-to-int-conversions that referenced this issue Apr 2, 2019
Renames instructions as discussed in
WebAssembly/spec#884 (comment).
Closes #4 and fixes #6.
uuhan pushed a commit to uuhan/v8 that referenced this issue May 27, 2019
These instructions were renamed in the October 2, WebAssembly CG meeting. The
issue describing the change is here:

WebAssembly/spec#884

Change-Id: Ia9e8733156b5ed5db7fc9ab1681c1a51b874dd71
Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/1620681
Reviewed-by: Clemens Hammacher <[email protected]>
Commit-Queue: Ben Smith <[email protected]>
Cr-Commit-Position: refs/heads/master@{#61711}
sunfishcode pushed a commit to WebAssembly/nontrapping-float-to-int-conversions that referenced this issue Aug 27, 2019
Renames instructions as discussed in
WebAssembly/spec#884 (comment).
Closes #4 and fixes #6.
sunfishcode added a commit to WebAssembly/design that referenced this issue Aug 30, 2019
tlively added a commit to WebAssembly/simd that referenced this issue Sep 13, 2019
This renaming was decided on in
WebAssembly/spec#884.
dtig pushed a commit to WebAssembly/simd that referenced this issue Sep 13, 2019
* Update conversion op names

This renaming was decided on in
WebAssembly/spec#884.
Honry pushed a commit to Honry/simd that referenced this issue Oct 19, 2019
Honry pushed a commit to Honry/simd that referenced this issue Oct 19, 2019
* Update conversion op names

This renaming was decided on in
WebAssembly/spec#884.
sunfishcode added a commit to WebAssembly/design that referenced this issue Feb 27, 2020
* Define non-trapping float-to-int conversions.

This also introduces the concept of prefix bytes, and defines the "numeric" prefix
byte, used for encodin the new conversion instructions.

* Add feature markers.

* Add the feature marker in more places.

* Rename "numeric" to "misc".

See WebAssembly/nontrapping-float-to-int-conversions#5.

* Rename opcodes.

See WebAssembly/spec#884 (comment).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants