Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Track non-0x00 tableindex as ref types use #3695

Merged
merged 1 commit into from
Aug 16, 2024

Conversation

jkrems
Copy link
Contributor

@jkrems jkrems commented Aug 7, 2024

Any use of a table index that isn't exactly a null byte (0x00) means
that the module makes use of the reference types proposal. This is
important to track because aot_compiler.c will blindly assume that all
table indices are a single byte long otherwise.

This fixes a crash in WAMR for modules that contain multi-byte encodings
of table indices in call_indirect but make no other use of reference
types features.

@jkrems
Copy link
Contributor Author

jkrems commented Aug 7, 2024

Actually, this should really fix the logic in core/iwasm/interpreter/wasm_loader.c instead to detect usage of multi-byte table indices as module->is_ref_types_used. I'll update this PR later today. Updated the PR.

@jkrems jkrems changed the title Allow all LEB128 encodings of zero for table index Track non-0x00 tableindex as ref types use Aug 7, 2024
if (*p != 0x00) {
module->is_ref_types_used = true;
}
#endif
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reporting the issue, seems that it is invalid to read table_idx with read_uint8 even when REF_TYPES/GC isn't enabled, from what spec mentions and what other places handle (wasm_interp_classic.c CALL_INDIRECT also read table_idx with read_leb_uint32):
https://webassembly.github.io/spec/core/binary/modules.html#binary-tableidx,
we should always treat table_idx as uint32 and use read_leb_uint32 to read the data, and after that, check whether table_idx is 0: if not, then set module->is_ref_types_used = true:

                read_leb_uint32(p, p_end, table_idx);
#if WASM_ENABLE_REF_TYPES != 0 || WASM_ENABLE_GC != 0
#if WASM_ENABLE_WAMR_COMPILER != 0
                if (table_idx != 0)
                    module->is_ref_types_used = true;
#endif
#endif
                if (!check_table_index(module, table_idx, error_buf,
                                       error_buf_size)) {
                    goto fail;
                }

And should also change the handlings of CALL_INDIRECT and RETURN_CALL_INDIRECT in many other places: wasm_loader_find_block_addr in this file, wasm_mini_loader.c, aot_compiler.c and fast-jit/jit_frontend.c.

Could you help confirm it and fix the related code accordingly? Thanks.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reporting the issue, seems that it is invalid to read table_idx with read_uint8 even when REF_TYPES/GC isn't enabled, from what spec mentions and what other places handle (wasm_interp_classic.c CALL_INDIRECT also read table_idx with read_leb_uint32): https://webassembly.github.io/spec/core/binary/modules.html#binary-tableidx, we should always treat table_idx as uint32 and use read_leb_uint32 to read the data, and after that, check whether table_idx is 0: if not, then set module->is_ref_types_used = true:

our REF_TYPES means https://github.com/WebAssembly/reference-types, right?
the url you are referring to is inappropriate because it already includes reference-types.
looking at the diff, (WebAssembly/spec@7fa2f20#diff-4343dbfdb9bd9737d2f391b4624ca72216a55d3ea74eca470aab7fd525ec3958)
the call_indirect imm before reference-types was just 00, not a leb u32 0.

my understanding is that there is no way to distinguish an invalid module (call_indirect .. 0x80 00 w/o reference-types) and a valid redundant encoding. (call_indirect .. 0x80 00 w/ reference-types)
as wamrc has an option to disable reference-types, (--disable-ref-types) i guess the best thing we can do is to make wasm_loader.c honor the option when WASM_ENABLE_WAMR_COMPILER is defined.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thanks for the correction, it really easily makes confusing since the spec is evolving. You are right, in the earlier version of wasm spec without reference-types, the table index is 0 of one byte, this is the snapshot from wasm spec Release 1.1 pdf document:
image

Do you mean this PR's fixing of wasm_loader.c is good and we should also handle other places?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean this PR's fixing of wasm_loader.c is good and we should also handle other places?

i want to make wamrc bail out when enable_ref_types==false and is_ref_types_used==true at least.
but it doesn't need to be a part of this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so maybe we can merge this PR first and then submit another PR to fix the wamrc issue and issues in other places.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure.

core/iwasm/interpreter/wasm_loader.c Show resolved Hide resolved
if (*p != 0x00) {
module->is_ref_types_used = true;
}
#endif
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean this PR's fixing of wasm_loader.c is good and we should also handle other places?

i want to make wamrc bail out when enable_ref_types==false and is_ref_types_used==true at least.
but it doesn't need to be a part of this PR.

@yamt
Copy link
Collaborator

yamt commented Aug 8, 2024

@wenyongh can you dismiss my approval? (it seems i can't do that by myself.) i noticed i misread the diff. maybe i will take a look later.

Any use of a table index that isn't exactly a null byte (`0x00`) means
that the module makes use of the reference types proposal. This is
important to track because `aot_compiler.c` will blindly assume that all
table indices are a single byte long otherwise.

This fixes a crash in WAMR for modules that contain multi-byte encodings
of table indices in `call_indirect` but make no other use of reference
types features.
@wenyongh
Copy link
Contributor

wenyongh commented Aug 9, 2024

@wenyongh can you dismiss my approval? (it seems i can't do that by myself.) i noticed i misread the diff. maybe i will take a look later.

I can't do it also, but I found @jkrems has added the comment according to your suggestion, is it good for you now? Shall we merge the PR?

@jkrems
Copy link
Contributor Author

jkrems commented Aug 9, 2024

One thing I wanted to call out - I'm not super familiar with the bounds check convention in the surrounding code and this code dereferences p without adding checks.

@wenyongh
Copy link
Contributor

wenyongh commented Aug 9, 2024

One thing I wanted to call out - I'm not super familiar with the bounds check convention in the surrounding code and this code dereferences p without adding checks.

Yes, had better add bounds check, how about changing the condition to if (p + 1 < p_end && *p != 0x00) {? And could you change the coding style of comment from // to /* .. */?

// Any non-0x00 byte requires the ref types proposal.
// This is different from checking the table_idx value
// since `0x80 0x00` etc. are all valid encodings of zero.
module->is_ref_types_used = true;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when reference-types is enabled, the encoding difference (0x08 0x00 and 0x00) should not make the output different. i'm not sure if it's a good idea to flip the feature bit (WASM_FEATURE_REF_TYPES) depends on the leb encoding.

how about:

  • if reference-types is disabled by the wamrc option, make the validation (wasm_loader.c) fail on 0x80 0x00
  • if reference-types is enabled, you don't need to set is_ref_types_used here because the encoding difference should not affect the aot compilation.
  • given that the module is already validated, another parser (aot_compile_func) can always parse it as leb128.

Copy link
Contributor Author

@jkrems jkrems Aug 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given that the module is already validated, another parser (aot_compile_func) can always parse it as leb128.

For the last bullet point, do you mean that I should instead update aot_compiler.c to always parse this field as leb128 instead of checking comp_ctx->enable_ref_types? Afaict that should be sufficient. More concretely, remove the conditional in https://github.com/bytecodealliance/wasm-micro-runtime/blob/main/core/iwasm/compilation/aot_compiler.c#L1171-L1177.

if reference-types is disabled by the wamrc option, make the validation (wasm_loader.c) fail on 0x80 0x00

I believe the logic in wasm_loader.c currently only has access to WASMModule and WASMFunction but not to the options object. Would this require some refactoring to thread that option into wasm_loader.c? It looks like wamrc currently calls the loader pretty early on without passing any options (beside filename) but there appears to be wasm_runtime_load_ex with LoadArgs that could be expanded to forward that signal..?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, currently the aot compiler's options cannot be passed and handled in wasm loader, one workaround that I can find out is to make wamrc bail out as early as we can when creating the aot compiler context, like the handling of SIMD options:

diff --git a/core/iwasm/compilation/aot_llvm.c b/core/iwasm/compilation/aot_llvm.c
index fcf29191..7dac3e57 100644
--- a/core/iwasm/compilation/aot_llvm.c
+++ b/core/iwasm/compilation/aot_llvm.c
@@ -3108,6 +3108,16 @@ aot_create_comp_context(const AOTCompData *comp_data, aot_comp_option_t option)
         goto fail;
     }

+    /* Return error if ref-types and GC are disabled by command line but
+       ref-types instructions are used */
+    if (!option->enable_ref_types && !option->enable_gc
+        && wasm_module->is_ref_types_used) {
+        aot_set_last_error("ref-types instruction was found, "
+                           "try removing --disable-ref-types option "
+                           "or adding --enable-gc option.");
+        goto fail;
+    }
+
     /* Disable features when they are not actually used */
     if (!wasm_module->is_simd_used) {
         option->enable_simd = comp_ctx->enable_simd = false;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given that the module is already validated, another parser (aot_compile_func) can always parse it as leb128.

For the last bullet point, do you mean that I should instead update aot_compiler.c to always parse this field as leb128 instead of checking comp_ctx->enable_ref_types?

yes.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, currently the aot compiler's options cannot be passed and handled in wasm loader, one workaround that I can find out is to make wamrc bail out as early as we can when creating the aot compiler context, like the handling of SIMD options:

diff --git a/core/iwasm/compilation/aot_llvm.c b/core/iwasm/compilation/aot_llvm.c
index fcf29191..7dac3e57 100644
--- a/core/iwasm/compilation/aot_llvm.c
+++ b/core/iwasm/compilation/aot_llvm.c
@@ -3108,6 +3108,16 @@ aot_create_comp_context(const AOTCompData *comp_data, aot_comp_option_t option)
         goto fail;
     }

+    /* Return error if ref-types and GC are disabled by command line but
+       ref-types instructions are used */
+    if (!option->enable_ref_types && !option->enable_gc
+        && wasm_module->is_ref_types_used) {
+        aot_set_last_error("ref-types instruction was found, "
+                           "try removing --disable-ref-types option "
+                           "or adding --enable-gc option.");
+        goto fail;
+    }
+
     /* Disable features when they are not actually used */
     if (!wasm_module->is_simd_used) {
         option->enable_simd = comp_ctx->enable_simd = false;

i guess it works, yes.
for a longer term, it might be simpler to pass the enabled feature bits to the loader though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yamt I merged this PR and submitted PR #3726 to fix more issues, please help have a look when you are available. Thanks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fast feedback and sorry for my delays in responding to it! :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jkrems welcome, it doesn't matter😊

@wenyongh wenyongh merged commit 59f761b into bytecodealliance:main Aug 16, 2024
385 checks passed
@jkrems jkrems deleted the patch-1 branch August 16, 2024 13:20
@wenyongh
Copy link
Contributor

wenyongh commented Aug 19, 2024

@yamt I merged this PR and some other PRs for a better testing for releasing WAMR-2.1.2, please let us know if you have comments:
https://github.com/bytecodealliance/wasm-micro-runtime/commits/main/
image

And BTW, the PR #3711 was merged into branch dev/merged_aot_data_text, as it impacts many platforms and targets, I think we had better merge it into branch main after the new release is created.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants