Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace wasmparser-nostd fork with upstream wasmparser #1141

Merged
merged 31 commits into from
Nov 27, 2024

Conversation

Robbepop
Copy link
Member

@Robbepop Robbepop commented Jul 26, 2024

Thread discussing the performance regressions in wasmparser:
bytecodealliance/wasm-tools#1701


Unfortunately benchmarks conducted locally show significant performance regressions for Wasm validation:

We can see this clearly in the posted screenshot since translate/tiny_keccak/checked/eager regresses massively while translate/tiny_keccak/unchecked/eager only regresses by 3% which could be noise.

The only difference between these two cases is that translate/tiny_keccak/checked/eager validates the Wasm via wasmparser while translate/tiny_keccak/unchecked/eager skips Wasm validation and only translates the Wasm via Wasmi.

The huge 48% performance regression for the checked/lazy test case indicates that Wasm validation has a much higher constant cost bump for validation of non-function bodies, too. OR it indicates that the setup of the function validators became a lot more expensive.

Reproduce benchmarks locally:

  1. On main branch: cargo bench --bench benches -- --save-baseline main
  2. On PR branch: cargo bench --bench benches -- --baseline main

image

Copy link

codecov bot commented Jul 26, 2024

Codecov Report

Attention: Patch coverage is 59.86395% with 59 lines in your changes missing coverage. Please review.

Project coverage is 81.45%. Comparing base (e25e3a8) to head (0c4aca1).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
crates/wasmi/src/engine/translator/mod.rs 18.18% 27 Missing ⚠️
crates/wasmi/src/module/utils.rs 62.96% 10 Missing ⚠️
crates/wasmi/src/module/init_expr.rs 33.33% 8 Missing ⚠️
crates/wasmi/src/engine/mod.rs 12.50% 7 Missing ⚠️
crates/wasmi/src/module/element.rs 77.77% 4 Missing ⚠️
crates/wasmi/src/engine/config.rs 92.85% 2 Missing ⚠️
crates/wasmi/src/module/parser.rs 90.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1141      +/-   ##
==========================================
- Coverage   81.57%   81.45%   -0.13%     
==========================================
  Files         306      306              
  Lines       25271    25349      +78     
==========================================
+ Hits        20616    20647      +31     
- Misses       4655     4702      +47     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Robbepop added 2 commits July 26, 2024 17:04
The new WasmFeatures type requires a bit less stack space due to bitfield usage.
Ideally we use our own bitfield based type but this is more work.
@Robbepop
Copy link
Member Author

Robbepop commented Jul 26, 2024

I just found one reason why it regressed: I benchmarked without setting the no-hash-maps feature. Generally BTreeMap is faster than HashMap for Wasmi uses and the old wasmparser-nostd fork alsways uses BTreeMaps. With this feature set local benchmarks still indicate regressions but they are less severe, ranging between 10-25% instead of 20-35%. The unchecked (no validation) case now even has slightly improved performance which indicates very strongly to me that the regressions that we see come from the Wasm validation.

The biggest regression we see now is lazy-translation which further indicates that regressions are in the Wasm validation since lazy-translation eagerly validates the Wasm input and avoids translating it via Wasmi, thus more of the total time spent in the benchmark comes from wasmparser's parsing and validation.

image

@Robbepop
Copy link
Member Author

Running yet another benchmark because I forgot to use no-hash-maps on main. 🤦

This is the slowdown from main to this PR:
image

@Robbepop
Copy link
Member Author

Robbepop commented Oct 6, 2024

After improving translation benchmarks and updating wasmparser to v0.218.0 I reran benchmarks:

tl;dr Conclusions

  • Benchmarks with validation (checked) are affected a lot worse.
    • Wasm validation seems to have the largest regressions overall.
  • Benchmarks that perform validation only on non-code section parts are particularly affected. (lazy/checked)
    • Thus regressions are also outside of the code section validation.
  • Even benchmarks that perform no Wasm validation whatsoever are affected. (unchecked)
    • Thus parsing Wasm has also regressed.
    • Parsing outside of the code section has regressed more. (eager/unchecked vs lazy/unchecked)
  • Small benchmarks (tiny_keccak and erc20) are affected a lot worse than large ones (spidermonkey).
    • This implies that the base overhead for parsing and validation has been lifted/regressed.

Tiny-Keccak

image

Spidermonkey

image

ERC-20

image

@Robbepop Robbepop changed the title Replace wasmparser-nostd fork with upstream wasmparser as dependency Replace wasmparser-nostd fork with upstream wasmparser Oct 6, 2024
@Robbepop
Copy link
Member Author

With wasmparser v0.219.0 there is a new component-model crate feature which disables Wasm component-model support. Wasmi can disable this for some nice compile time improvements:

v0.218.0:

cargo build -p wasmi --profile bench  22.52s user 1.16s system 201% cpu 11.778 total

v0.219.0:

cargo build -p wasmi --profile bench  18.95s user 1.18s system 178% cpu 11.248 total

@Robbepop
Copy link
Member Author

Robbepop commented Oct 19, 2024

I just found out that wasmparser v0.219.0 has this API:
https://docs.rs/wasmparser/0.219.1/wasmparser/struct.FunctionBody.html#method.as_bytes

This will likely fix a workaround in Wasmi:

  • match payload {
    Payload::CodeSectionEntry(func_body) => {
    // Note: Unfortunately the `wasmparser` crate is missing an API
    // to return the byte slice for the respective code section
    // entry payload. Please remove this work around as soon as
    // such an API becomes available.
    let bytes = Self::consume_buffer(consumed, buffer);
    let remaining = func_body.get_binary_reader().bytes_remaining();
    let start = consumed - remaining;
    let bytes = &bytes[start..];
    self.process_code_entry(func_body, bytes, &header)?;
    }
  • pub fn new(
    offset: impl Into<Option<usize>>,
    bytes: &'parser [u8],
    translator: T,
    ) -> Result<Self, Error> {
    let offset = offset.into().unwrap_or(0);
    let func_body = FunctionBody::new(offset, bytes);
    Ok(Self {
    func_body,
    bytes,
    translator,
    })
    }

@danielstuart14
Copy link

danielstuart14 commented Nov 1, 2024

#1197 requires this change.

@Robbepop
Copy link
Member Author

Robbepop commented Nov 16, 2024

With bytecodealliance/wasm-tools#1906 merged we can also merge this PR once wasmparser v0.221 has been released which might happen next week. 🎉

cc @danielstuart14

@Robbepop Robbepop merged commit f732c6c into main Nov 27, 2024
17 of 19 checks passed
@Robbepop Robbepop deleted the rf-update-wasmparser branch November 27, 2024 18:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants