[spec] Ambiguous if opcodes after the 0xFC and 0xFD prefix have multiple valid encodings #1526

MarkDerosier · 2022-08-24T20:46:29Z

https://webassembly.github.io/spec/core/appendix/index-instructions.html
lists the binary opcode of "i32.trunc_sat_f64_u" as 0xFC 0x03.

However, https://webassembly.github.io/spec/core/binary/instructions.html#numeric-instructions
lists "i32.trunc_sat_f64_u" as being encoded as 0xFC 3:u32.

The u32 is a link that goes to the LEB128 page of the spec, https://webassembly.github.io/spec/core/binary/values.html#binary-int,
which I think notes that 'trailing zeros' are allowed in the encoding. It explicitly mentions that 0x03 and 0x83 0x00 are both well-formed encodings of the value 3.

Since unsigned integers have multiple encodings in LEB128, can't "i32.trunc_sat_f64_u" be encoded as either 0xFC 0x03 or 0xFC 0x83 0x00 (among other encodings with trailing zeros)?

When the spec uses an integer constant encoded as a LEB128 u32, does it intend the shortest encoding?
Similarly, select_t requires a vector of length 1, but there are multiple ways to encode 1.

In other places in the spec, such as the alignment field in the memarg of memory instructions (https://webassembly.github.io/spec/core/binary/instructions.html#memory-instructions), it would be useful to note that although the encoded integers are small, and there are only a few of them, each can still be longer than 1 byte due to trailing-zero encodings.
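
To make the "multiple ways to encode 1" point concrete, here is a minimal Python sketch that enumerates the padded unsigned LEB128 encodings of a small u32. The function names `encode_u32_padded` and `all_u32_encodings` are hypothetical, made up for this illustration; the only facts assumed are that a u32 occupies at most ceil(32/7) = 5 bytes and that every byte except the last carries the continuation bit.

```python
def encode_u32_padded(value, length):
    """Encode `value` as unsigned LEB128 using exactly `length` bytes.

    Hypothetical helper for illustration; not taken from any toolchain.
    """
    if not 0 <= value < 2**32:
        raise ValueError("value out of u32 range")
    if not 1 <= length <= 5:
        raise ValueError("a u32 occupies at most ceil(32/7) = 5 bytes")
    if value >> (7 * length):
        raise ValueError("value does not fit in the requested length")
    out = bytearray()
    for i in range(length):
        byte = (value >> (7 * i)) & 0x7F  # next 7 payload bits, low first
        if i < length - 1:
            byte |= 0x80  # continuation bit on every byte except the last
        out.append(byte)
    return bytes(out)


def all_u32_encodings(value):
    """List every encoding of `value` that fits in 1 to 5 bytes."""
    if not 0 <= value < 2**32:
        raise ValueError("value out of u32 range")
    return [
        encode_u32_padded(value, length)
        for length in range(1, 6)
        if value >> (7 * length) == 0
    ]


for encoding in all_u32_encodings(1):
    print(encoding.hex(" "))
```

Under these assumptions the value 1 (for example the vector length in select_t) has five encodings, from `01` up to `81 80 80 80 00`, and the same holds for any other value below 128, including small alignment values.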

Personally I would prefer if there was only one way to encode said instructions as it would simplify my parser.

tlively · 2022-08-24T23:21:50Z

Your reading of the opcodes after the prefix being LEB128 is correct, so yes, there are multiple ways to encode these operations. Hopefully this doesn't complicate your parser too much, since LEB128s appear in so many other locations in the binary format as well.
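
As a rough illustration of what this means for a parser, the sketch below reads a prefix byte and then decodes the following opcode as a LEB128 u32, accepting padded forms. It is a minimal sketch, not code from any existing implementation: `decode_u32` and `read_prefixed_opcode` are hypothetical names, and the 5-byte limit and the check on the unused high bits of the final byte are assumptions based on the u32 bounds described on the spec's values page.

```python
def decode_u32(data, pos=0):
    """Decode an unsigned LEB128 u32 from `data` starting at `pos`.

    Returns (value, next_pos). Hypothetical helper for illustration.
    """
    result = 0
    for i in range(5):  # a u32 occupies at most ceil(32/7) = 5 bytes
        if pos + i >= len(data):
            raise ValueError("unexpected end of input")
        byte = data[pos + i]
        # The fifth byte may only carry bits 28..31 and must end the number.
        if i == 4 and byte & 0xF0:
            raise ValueError("u32 too long or out of range")
        result |= (byte & 0x7F) << (7 * i)
        if byte & 0x80 == 0:  # no continuation bit: this was the last byte
            return result, pos + i + 1
    raise AssertionError("unreachable: the fifth byte always terminates")


def read_prefixed_opcode(data, pos=0):
    """Read a 0xFC/0xFD prefix followed by a LEB128-encoded opcode."""
    prefix = data[pos]
    if prefix not in (0xFC, 0xFD):
        raise ValueError("not a prefixed instruction")
    opcode, next_pos = decode_u32(data, pos + 1)
    return prefix, opcode, next_pos


# Both byte sequences name the same instruction, i32.trunc_sat_f64_u.
print(read_prefixed_opcode(bytes([0xFC, 0x03])))        # (252, 3, 2)
print(read_prefixed_opcode(bytes([0xFC, 0x83, 0x00])))  # (252, 3, 3)
```

Because the same `decode_u32` routine would already be needed for indices, vector lengths, and memarg fields, handling the padded opcode forms adds no separate code path in a parser structured this way.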

rossberg · 2022-08-25T16:30:18Z

PR #1528 adds a clarifying note to the index.

rossberg · 2022-08-25T18:54:03Z

Closing via #1528.

rossberg mentioned this issue Aug 25, 2022

[spec] Add note to instruction index #1528

Merged

rossberg closed this as completed Aug 25, 2022
