Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wasmparser: optimize type section valiadation by specializing over Wasm feature subset #1906

Merged

Conversation

Robbepop
Copy link
Collaborator

@Robbepop Robbepop commented Nov 13, 2024

Closes #1701.

cc @alexcrichton : Are there any other Wasm proposals that might conflict?

This implements a lightweight type section validation for a subset of Wasm proposals.

Benchmarks

I compared main against PR and changed the used WasmFeatures set for both:

Validator::new_with_features(
    WasmFeatures::WASM2
        .union(WasmFeatures::CUSTOM_PAGE_SIZES)
        .union(WasmFeatures::EXTENDED_CONST)
        .union(WasmFeatures::MEMORY64)
        .union(WasmFeatures::MULTI_MEMORY)
        .union(WasmFeatures::RELAXED_SIMD)
        .union(WasmFeatures::TAIL_CALL)
        .union(WasmFeatures::THREADS)
        .union(WasmFeatures::WIDE_ARITHMETIC);
)

image

@Robbepop Robbepop changed the title Add specialized type section validation for Wasm feature subset wasmparser: Add specialized type section validation for Wasm feature subset Nov 13, 2024
@Robbepop Robbepop changed the title wasmparser: Add specialized type section validation for Wasm feature subset wasmparser: optimize type section valiadation by specializing over Wasm feature subset Nov 13, 2024
@Robbepop
Copy link
Collaborator Author

I also ran the wam-tools Wasm testsuite for validation and disabled the function-references and gc proposals to trigger the fast-validation path. As expected, the testsuite did not pass with the following failures:

failures:
    "tests/local/component-model/gc.wast"
    "tests/local/exnref/try-table.wast"
    "tests/local/folding/array_fill.wast"
    "tests/local/folding/array_init_data.wast"
    "tests/local/folding/array_copy.wast"
    "tests/local/folding/array_init_elem.wast"
    "tests/local/folding/array.wast"
    "tests/local/function-reference-structural-matching.wat"
    "tests/local/folding/struct.wast"
    "tests/local/folding/fold-stack-switching-gc.wast"
    "tests/local/folding/fold-stack-switching.wast"
    "tests/local/instance.wast"
    "tests/local/invalid-init-expr.wast"
    "tests/local/shared-everything-threads/polymorphism.wast"
    "tests/local/shared-everything-threads/gc-unreachable.wast"
    "tests/local/shared-everything-threads/builtins.wast"
    "tests/local/shared-everything-threads/funcs.wast"
    "tests/local/shared-everything-threads/rec-groups.wast"
    "tests/local/shared-everything-threads/i31.wast"
    "tests/local/shared-everything-threads/tables.wast"
    "tests/local/shared-everything-threads/ref-equivalence.wast"
    "tests/local/stack-switching/cont_bind.wast"
    "tests/local/stack-switching/actor-lwt.wast"
    "tests/local/shared-everything-threads/globals.wast"
    "tests/local/stack-switching/actor.wast"
    "tests/local/stack-switching/cont_canonicalisation.wast"
    "tests/local/stack-switching/async-await.wast"
    "tests/local/stack-switching/cont_new.wast"
    "tests/local/shared-everything-threads/structs.wast"
    "tests/local/stack-switching/cont_suspend.wast"
    "tests/local/stack-switching/cont_resume.wast"
    "tests/local/shared-everything-threads/types.wast"
    "tests/local/stack-switching/fun-state.wast"
    "tests/local/stack-switching/generators.wast"
    "tests/local/stack-switching/pipes.wast"
    "tests/local/stack-switching/fun-actor-lwt.wast"
    "tests/local/stack-switching/fun-lwt.wast"
    "tests/local/stack-switching/control-lwt.wast"
    "tests/local/stack-switching/lwt.wast"
    "tests/local/shared-everything-threads/arrays.wast"
    "tests/local/stack-switching/static-lwt.wast"
    "tests/local/stack-switching/cont.wast"
    "tests/local/stack-switching/validation_gc.wast"
    "tests/local/stack-switching/validation.wast"
    "tests/local/component-model/types.wast"

I have looked through a bunch of the reported errors and they seem to be expected from the turning-off of features. However, it would certainly be nicer if we had an automated way of testing this fast-validation path.

@Robbepop
Copy link
Collaborator Author

Robbepop commented Nov 14, 2024

I think is possible to simplify Wasm feature checking by making wasmparser aware of Wasm feature dependencies.
For example:

  • enable function-references if the gc proposal is enabled because gc depends on function-references
  • disable gc if function-references is disabled (same reason as above)
  • etc..

This way it was not possible to have Wasm features sets that do not make much sense.

Or are there problems that I have overlooked? 🤔

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Do you think it would be possible to optimize the processing of types in-place though rather than conditionally gating on features? From a high-level I don't think we have any other logic where we significantly differ on features for what path to take at runtime (except to return an error). I'm worried that this will become a relatively poorly tested path due to not being the default and being relatively nontrivial. I haven't dug in much though to try to understand what the slow parts are here though so it may not be possible.

Otherwise for feature dependencies I'm not sure how to best manage that. We already primarily use method accessors for features rather than .contains(WasmFeatures::FEATURE) so it might not be too hard to update those accessors to automatically test for a set of bits rather than just a single bit.

@Robbepop
Copy link
Collaborator Author

Robbepop commented Nov 14, 2024

Thanks! Do you think it would be possible to optimize the processing of types in-place though rather than conditionally gating on features? From a high-level I don't think we have any other logic where we significantly differ on features for what path to take at runtime (except to return an error). I'm worried that this will become a relatively poorly tested path due to not being the default and being relatively nontrivial. I haven't dug in much though to try to understand what the slow parts are here though so it may not be possible.

I agree that this solution is not ideal. To me this is just a temporary solution to make wasmparser usable by Wasmi. Unfortunately, I was not able to come up with a more elegant, inline solution.

An idea for an inline solution:

Perform the light-weight validation introduced in this PR until a type is encountered that strictly requires all the type canonicalization, sub-type checking etc. Then, perform all those steps for all the types seen so far and continue with this more elaborate approach.

@alexcrichton Do you think this is feasible?


Slow parts of type section validation are:

  • type canonicalization
  • sub-type checks (depends on canonicalization)
  • internment (not sure)

All of those are prevented in this PR for the selected set of features.


I would really love to use upstream wasmparser in Wasmi and keep it up to date and be able to contribute to its development. As long as Wasmi sits on its own branch this development is kinda hindered. Fast type section validation is the last strict requirement for this to happen.

@Robbepop
Copy link
Collaborator Author

Robbepop commented Nov 16, 2024

I have experimented with an implemented for the idea presented in this comment. However, problems arised quickly due to certain invariants of other structures in the validator's greater scheme. For example resetting the TypeAlloc. Furthermore, compared to the solution presented in this PR, there was some overhead, even for the fast case.

All in all, I think a proper, more elegant approach would take a lot of more work and consideration. This PR implements a very fast type section validation at nearly no runtime cost (one bit-flags check). The feature gated allow list should be fairly maintainble since development of new features won't casually break the type validator and old features won't change their semantics. Once we figure out a more elegant and inline approach for fast type section validation, this solution can easily be replaced since it is neatly isolated and contained in a single spot.

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for investigating! I think that's a reasonable approach myself and the benefits seem worth the theoretical risk here, so I'm happy to land this. It's also not like this has zero testing, the "feature frobbing" done in Wasmtime and wasm-tools fuzzing should still cover this case.

@alexcrichton alexcrichton added this pull request to the merge queue Nov 16, 2024
Merged via the queue into bytecodealliance:main with commit 9cebc6a Nov 16, 2024
30 checks passed
alexcrichton added a commit to alexcrichton/wasm-tools that referenced this pull request Dec 2, 2024
alexcrichton added a commit to alexcrichton/wasm-tools that referenced this pull request Dec 2, 2024
This commit fixes an issue from bytecodealliance#1906 which is preventing the upgrade of
wasm-tools in Wasmtime. That commit implemented a fast path by skipping
a function and reimplementing parts of it internally, but that then
caused Wasmtime to panic when other data structures weren't filled out.
The intention was that the user-facing interface doesn't change
depending on features, so this is an attempt at fixing the mistake in
that commit.

The fix here is to remove the fast path added and restructure it
differently. Instead now the fast and normal paths have much less
divergence which should prevent this issue from re-surfacing. This is
15% slower than `main` but it doesn't have the same bug as `main` so for
now that may be the best that can be done.
github-merge-queue bot pushed a commit that referenced this pull request Dec 2, 2024
* Add a benchmark for validating with older features

Benchmark the "fast path" in type interning.

* Reimplement fast-path validation of MVP types

This commit fixes an issue from #1906 which is preventing the upgrade of
wasm-tools in Wasmtime. That commit implemented a fast path by skipping
a function and reimplementing parts of it internally, but that then
caused Wasmtime to panic when other data structures weren't filled out.
The intention was that the user-facing interface doesn't change
depending on features, so this is an attempt at fixing the mistake in
that commit.

The fix here is to remove the fast path added and restructure it
differently. Instead now the fast and normal paths have much less
divergence which should prevent this issue from re-surfacing. This is
15% slower than `main` but it doesn't have the same bug as `main` so for
now that may be the best that can be done.

* Fix dead code warning
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

wasmparser: Performance regressions since v0.100.2
2 participants