Create a structure-aware JavaScript fuzzer to find deep bugs #1902

Closed
wants to merge 40 commits into from

Contributor

@addisoncrump addisoncrump commented Mar 6, 2022

This PR adds two experimental fuzzers which generate valid JavaScript code from Arbitrary structs. These fuzzers (or variants thereof) found the bugs behind all of my previous PRs and issues. The generator does not produce identifiers which resolve to built-in types.

I will add documentation when possible, but I've been busy with work and wanted to offer this to y'all so I didn't have to go back and forth every time a new PR was merged; it is possibly useful for OSS-Fuzz/CI in the future. It finds bugs very, very quickly.

If you want to test for yourself, I recommend using cargo fuzz run -s none interp_fuzzer -- -timeout=5.
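
To illustrate the idea of generating syntactically valid JavaScript from structured input, here is a minimal, std-only sketch. It is not the code from this PR: the PR derives Arbitrary for its structs, whereas `Bytes`, `Expr`, `generate`, and `to_js` below are hypothetical names that only mimic the approach. Raw fuzzer bytes select AST variants, and the AST is then printed as source text, so every byte string maps to a parseable expression.

```rust
/// Hypothetical stand-in for a fuzzer byte source: hands out bytes one at a
/// time and yields 0 once the input is exhausted.
struct Bytes<'a> {
    data: &'a [u8],
    pos: usize,
}

impl<'a> Bytes<'a> {
    fn next_byte(&mut self) -> u8 {
        let byte = self.data.get(self.pos).copied().unwrap_or(0);
        self.pos += 1;
        byte
    }
}

/// Hypothetical miniature expression AST (far smaller than a real JS AST).
enum Expr {
    Num(u8),
    Ident(u8),
    Add(Box<Expr>, Box<Expr>),
}

impl Expr {
    /// Builds an expression from fuzzer bytes. At depth 0 only leaves are
    /// allowed, so generation always terminates.
    fn generate(input: &mut Bytes<'_>, depth: u32) -> Expr {
        let variants = if depth == 0 { 2 } else { 3 };
        match input.next_byte() % variants {
            0 => Expr::Num(input.next_byte()),
            // Identifiers come from a small fixed pool: v0, v1, v2, v3.
            1 => Expr::Ident(input.next_byte() % 4),
            _ => Expr::Add(
                Box::new(Expr::generate(input, depth - 1)),
                Box::new(Expr::generate(input, depth - 1)),
            ),
        }
    }

    /// Prints the AST as JavaScript source text.
    fn to_js(&self) -> String {
        match self {
            Expr::Num(n) => n.to_string(),
            Expr::Ident(i) => format!("v{}", i),
            Expr::Add(lhs, rhs) => format!("({} + {})", lhs.to_js(), rhs.to_js()),
        }
    }
}
```

Because identifiers are drawn from a fixed pool of generated names, this sketch also never emits a name that resolves to a built-in.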

codecov bot commented Mar 6, 2022

Codecov Report

Merging #1902 (dd85b72) into main (6498216) will decrease coverage by 0.82%.
The diff coverage is 0.85%.

@@            Coverage Diff             @@
##             main    #1902      +/-   ##
==========================================
- Coverage   45.87%   45.05%   -0.83%     
==========================================
  Files         206      208       +2     
  Lines       17102    17445     +343     
==========================================
+ Hits         7846     7860      +14     
- Misses       9256     9585     +329     
Impacted Files                                         Coverage Δ
boa_engine/src/context/mod.rs                          32.39% <0.00%> (-0.31%) ⬇️
boa_engine/src/lib.rs                                  79.31% <ø> (ø)
boa_engine/src/syntax/ast/constant.rs                  42.85% <ø> (ø)
boa_engine/src/syntax/ast/node/array/mod.rs            28.57% <0.00%> (-4.77%) ⬇️
boa_engine/src/syntax/ast/node/await_expr/mod.rs       28.57% <0.00%> (-11.43%) ⬇️
boa_engine/src/syntax/ast/node/block/mod.rs            41.17% <0.00%> (-2.58%) ⬇️
boa_engine/src/syntax/ast/node/call/mod.rs             52.94% <0.00%> (-16.29%) ⬇️
.../syntax/ast/node/conditional/conditional_op/mod.rs  45.00% <0.00%> (-19.29%) ⬇️
...ine/src/syntax/ast/node/conditional/if_node/mod.rs  52.00% <0.00%> (-13.00%) ⬇️
...ax/ast/node/declaration/arrow_function_decl/mod.rs  25.92% <0.00%> (-5.90%) ⬇️
... and 76 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 6498216...dd85b72.

@jedel1043 jedel1043 added enhancement New feature or request test Issues and PRs related to the tests. labels Mar 7, 2022
@jedel1043 jedel1043 modified the milestones: v0.14.0, v0.15.0 Mar 7, 2022
Member

@Razican Razican left a comment

Looks very good now, just some minor things. I would also still like to have a file in the /docs folder with information on how to execute the fuzzer, how it works and so on.

boa_inputgen/Cargo.toml (outdated, resolved)
boa_inputgen/src/ident_walk.rs (outdated, resolved)
boa_inputgen/src/ident_walk.rs (resolved)
evil.js (outdated, resolved)
Comment on lines 32 to 34
fn extendo<T>(node: &mut T) -> &'static mut T {
    unsafe { &mut *(node as *mut T) }
}
Member

Maybe it makes sense to mark the function itself as unsafe and then use an unsafe block wherever it's called, so that the invariants are checked at each call site.

Contributor Author

Okay, I've revamped how extendo (now extend_lifetime -- the original name was more of a personal joke) handles lifetime extension so that it restricts the lifetime to a consistent, non-static lifetime that's decoupled from the original.

Contributor Author

@addisoncrump addisoncrump Mar 17, 2022

This is an unbounded lifetime. It's safe in this context, since we don't return the relevant Vec and all the references point to AST members.
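
The shape discussed in this thread (an unsafe helper with an unbounded output lifetime, justified at each call site) can be sketched as follows. This is an illustration under assumptions, not the PR's actual code: the signature of `extend_lifetime` and the caller `uppercase_names` are hypothetical, and this particular caller could also be written in safe Rust; it only shows where the safety argument lives.

```rust
/// Hypothetical helper: the output lifetime `'b` is unbounded, i.e. not tied
/// to the input lifetime `'a`.
///
/// # Safety
/// The caller must ensure that the referent outlives every use of the
/// returned reference, and that no other reference to the referent is used
/// while the returned one is live.
unsafe fn extend_lifetime<'a, 'b, T>(node: &'a mut T) -> &'b mut T {
    // SAFETY: upheld by the caller, see the contract above.
    unsafe { &mut *(node as *mut T) }
}

/// Hypothetical caller: gathers references in a local worklist that is never
/// returned, then mutates through them.
fn uppercase_names(names: &mut [String]) -> usize {
    let mut pending: Vec<&mut String> = Vec::new();
    for name in names.iter_mut() {
        // SAFETY: `name` points into `names`, which outlives `pending`
        // (`pending` is dropped before this function returns), and each
        // element is extended exactly once, so the references never alias.
        pending.push(unsafe { extend_lifetime(name) });
    }
    let count = pending.len();
    for name in pending {
        name.make_ascii_uppercase();
    }
    count
}
```

Making the helper `unsafe fn` forces a `SAFETY` justification next to every call, which is where the "we don't return the Vec" argument belongs.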

Member

@jedel1043 jedel1043 left a comment

I still need to review the fuzz crate, but I'll submit my current suggestions now to give you time to fix or answer :)

boa_inputgen/src/data.rs (outdated, resolved)
}
}

fn replace_declpattern<'a>(
Member

I'm curious: wouldn't a recursive function remove all instances of extend_lifetime? I'm asking in case you already tried it and the borrow checker still complained.

Contributor Author

It was significantly slower. I originally implemented this as a recursive function which just entirely rebuilt the AST, then a recursive function which mutably descended the AST, and both were significantly slower -- to the point that it affected execution speed of the fuzzer. This walk is much faster.

Member

Huh, that's weird, since rustc uses a recursive visitor. Approximately what was the performance difference between the two implementations?

Contributor Author

A few ms per fuzzer test case, but enough to slow me down from about 500 inputs/s to 450 or less. I didn't measure it particularly thoroughly; I swapped it over to test the performance difference, saw a speed-up, and kept it. 😅

I'm gonna implement that visitor and it'll probably be quicker regardless. This is more slapped together, as you can tell from the use of extend_lifetime.
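
For comparison, a recursive mutable descent of the kind mentioned above needs no unsafe code, because each recursive call reborrows only one child at a time. The sketch below is illustrative only: `Node` and `replace_syms` are hypothetical, and `u32` stands in for the interned symbol type. It makes no claim about relative speed.

```rust
/// Hypothetical miniature AST; `u32` stands in for an interned symbol.
#[derive(Debug, PartialEq)]
enum Node {
    Ident(u32),
    Block(Vec<Node>),
    If(Box<Node>, Box<Node>),
}

/// Recursively visits every node and rewrites each symbol with `f`.
fn replace_syms<F: FnMut(u32) -> u32>(node: &mut Node, f: &mut F) {
    match node {
        Node::Ident(sym) => *sym = f(*sym),
        Node::Block(children) => {
            for child in children {
                replace_syms(child, f);
            }
        }
        Node::If(cond, body) => {
            replace_syms(cond, f);
            replace_syms(body, f);
        }
    }
}
```

Recursion depth follows the nesting depth of the tree, which is worth keeping in mind for fuzzer-generated inputs that can nest deeply.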

@@ -0,0 +1,576 @@
//! Identifier and symbol walker for ensuring that generated inputs do not fail with "string
Member

The function signatures in this file are extremely similar. I would recommend some alternatives:

  • Create a ReplaceSym trait with a replace method, then implement it only for the statements that transitively contain a Sym.
  • Put all the functions inside replace_inner so that you can easily see which procedure corresponds to which statement.
  • Create a fold and a map function for our AST. Personally, I would try to implement this first and fall back to a trait otherwise.
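
The first alternative could look roughly like the sketch below. Only the names `ReplaceSym` and `replace` come from the suggestion; the method signature, the `HashMap` argument, and the miniature `Sym` and `Expr` types are assumptions made for illustration and do not reflect the engine's real AST.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the interned symbol type.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct Sym(usize);

/// Implemented only by types that (transitively) contain a `Sym`.
trait ReplaceSym {
    fn replace(&mut self, map: &HashMap<Sym, Sym>);
}

impl ReplaceSym for Sym {
    fn replace(&mut self, map: &HashMap<Sym, Sym>) {
        if let Some(new) = map.get(&*self) {
            *self = *new;
        }
    }
}

// Container impls let composite nodes simply delegate to their fields.
impl<T: ReplaceSym> ReplaceSym for Vec<T> {
    fn replace(&mut self, map: &HashMap<Sym, Sym>) {
        for item in self.iter_mut() {
            item.replace(map);
        }
    }
}

impl<T: ReplaceSym> ReplaceSym for Box<T> {
    fn replace(&mut self, map: &HashMap<Sym, Sym>) {
        (**self).replace(map);
    }
}

/// Hypothetical miniature expression type.
#[derive(Debug, PartialEq)]
enum Expr {
    Ident(Sym),
    Call(Box<Expr>, Vec<Expr>),
    Number(f64),
}

impl ReplaceSym for Expr {
    fn replace(&mut self, map: &HashMap<Sym, Sym>) {
        match self {
            Expr::Ident(sym) => sym.replace(map),
            Expr::Call(callee, args) => {
                callee.replace(map);
                args.replace(map);
            }
            // No symbol inside, so nothing to do.
            Expr::Number(_) => {}
        }
    }
}
```

With this shape, each node type owns one short impl instead of a free function with a near-identical signature.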

Contributor Author

Whoof, implementing a type visitor would certainly be very helpful but I'm not sure if that should also be in this commit. ReplaceSym trait seems like the better option.

Member

> Whoof, implementing a type visitor would certainly be very helpful but I'm not sure if that should also be in this commit. ReplaceSym trait seems like the better option.

Yeah, pretty much. You can, however, open another PR with the change 😉
Jokes aside, we really appreciate code cleanups in our codebase, so if you have any ideas on how to improve our internal APIs, please open an issue or a PR. We can guide you through our codebase if you need it 😊

Comment on lines +301 to +309
#[cfg(not(feature = "fuzzer"))]
const fn as_raw(self) -> NonZeroUsize {
    self.value
}

#[cfg(feature = "fuzzer")]
pub const fn as_raw(self) -> NonZeroUsize {
    self.value
}
Member

@jedel1043 jedel1043 Mar 17, 2022

If users just need to enable a feature to call a private function, I'd just expose it as public in the first place. It doesn't really matter in this case, since getting a raw NonZeroUsize is not useful for interacting with the interner.

Contributor Author

This change comes from a place of "change as little as possible of the underlying implementation". :) I'll just replace it.
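
A sketch of the simplified form, with a single public accessor and no feature gate, might look like this. The `Sym` struct here is a stand-in with an assumed layout, and `from_raw` is a hypothetical constructor added only so the example can be exercised.

```rust
use std::num::NonZeroUsize;

/// Stand-in for the interner's symbol type (layout assumed for illustration).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Sym {
    value: NonZeroUsize,
}

impl Sym {
    /// Hypothetical constructor, present only to make the sketch testable.
    pub const fn from_raw(value: NonZeroUsize) -> Self {
        Self { value }
    }

    /// One public definition replaces the two `cfg`-gated copies.
    pub const fn as_raw(self) -> NonZeroUsize {
        self.value
    }
}
```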

Member

jedel1043 commented Mar 17, 2022

I'm noticing you're merging instead of rebasing when conflicts occur. That's not ideal, since a deletion diff would reappear with a merge, and that has happened to this PR several times already. Unfortunately you cannot switch to rebasing in this PR, or the commit tree will implode and it will be a pain to fix (I speak from personal experience 😅), but I would advise you to rebase instead of merging on your next contributions 😁

@addisoncrump
Contributor Author

> I'm noticing you're merging instead of rebasing when conflicts occur. That's not ideal, since a deletion diff would reappear with a merge, and that has happened to this PR several times already. Unfortunately you cannot switch to rebasing in this PR, or the commit tree would implode and it would be a pain to fix (I speak from personal experience 😅), but I would advise you to rebase instead of merging on your next contributions 😁

Good point. My git-fu needs training...

@addisoncrump
Contributor Author

Switching over to AST-walking based Sym replacement in a separate PR.
