-
-
Notifications
You must be signed in to change notification settings - Fork 411
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create a structure-aware JavaScript fuzzer to find deep bugs #1902
Conversation
Codecov Report
@@ Coverage Diff @@
## main #1902 +/- ##
==========================================
- Coverage 45.87% 45.05% -0.83%
==========================================
Files 206 208 +2
Lines 17102 17445 +343
==========================================
+ Hits 7846 7860 +14
- Misses 9256 9585 +329
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks very good now, just some minor things. I would also still like to have a file in the /docs folder with information on how to execute the fuzzer, how it works and so on.
boa_inputgen/src/ident_walk.rs
Outdated
fn extendo<T>(node: &mut T) -> &'static mut T { | ||
unsafe { &mut *(node as *mut T) } | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it makes sense to make the function unsafe
, and then use the unsafe
block where it's used, to make sure that each call is safe (so, invariants are checked on each call)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I've revamped how extendo (now extend_lifetime -- the original name was more of a personal joke) handles lifetime extension so that it restricts the lifetime to a consistent, non-static lifetime that's decoupled from the original.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an unbounded lifetime. It's safe in this context, since we don't return the relevant Vec and all the references point to AST members.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still missing a review for the fuzz crate, I'll submit my current suggestions to give you time to fix or answer :)
} | ||
} | ||
|
||
fn replace_declpattern<'a>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious, wouldn't a recursive function remove all instances of extend_lifetime
? I'm asking in case you did try it and the borrow checker still complained
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was significantly slower. I originally implemented this as a recursive function which just entirely rebuilt the AST, then a recursive function which mutably descended the AST, and both were significantly slower -- to the point that it affected execution speed of the fuzzer. This walk is much faster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh, that's weird, since rustc uses a recursive visitor. Approximately what was the performance difference between the two implementations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few ms per fuzzer test case, but enough to slow me down from about 500 inputs/s to 450 or less. I didn't measure it particularly thoroughly, swapped it over to test the performance difference and had a speed-up so I kept it. 😅
I'm gonna implement that visitor and it'll probably be quicker regardless. This is more slapped together, as you can tell from the use of extend_lifetime
.
@@ -0,0 +1,576 @@ | |||
//! Identifier and symbol walker for ensuring that generated inputs do not fail with "string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function signatures on this file are extremely similar. I would recommend some alternatives:
- Create a
ReplaceSym
trait with areplace
method, then implement that only to the statements that transitively contain aSym
. - Put all the functions inside
replace_inner
so that you can easily see which procedure corresponds to which statement. - Create a
fold
and amap
function for our AST. Personally, I would try to implement this first and fallback to a trait otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoof, implementing a type visitor would certainly be very helpful but I'm not sure if that should also be in this commit. ReplaceSym trait seems like the better option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoof, implementing a type visitor would certainly be very helpful but I'm not sure if that should also be in this commit. ReplaceSym trait seems like the better option.
Yeah, pretty much. You can however open another PR with the change 😉
Jokes aside, we really appreciate code cleanups in our codebase, so if you have any ideas on how to improve our internal APIs, please open up an issue or a PR, we could guide you through our codebase if you need to 😊
#[cfg(not(feature = "fuzzer"))] | ||
const fn as_raw(self) -> NonZeroUsize { | ||
self.value | ||
} | ||
|
||
#[cfg(feature = "fuzzer")] | ||
pub const fn as_raw(self) -> NonZeroUsize { | ||
self.value | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If users just need to use a feature to call a private function, I'd just expose it as public in the first place. It doesn't really matter in this case since getting a raw NonZeroUsize
is not useful to interact with the interner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change comes from a place of "change as little as possible of the underlying implementation". :) I'll just replace it.
I'm noticing you're merging instead of rebasing when conflicts occur. That's not ideal, since a deletion diff would reappear with a merge, and that has happened to this PR several times already. Unfortunately you cannot switch to rebasing in this PR, or the commit tree will implode and it will be a pain to fix (I speak from personal experience 😅), but I would advise you to rebase instead of merging on your next contributions 😁 |
Good point. My git-fu needs training... |
Co-authored-by: jedel1043 <[email protected]>
Switching over to AST-walking based Sym replacement in a separate PR. |
This PR adds two experimental fuzzers which generate valid JavaScript code from Arbitrary structs. These fuzzers (or variants thereof) were used to identify all of my previous PRs and issues. It does not generate identifiers which resolve to built-in types.
I will add documentation when possible, but I've been busy with work and wanted to offer this to y'all so I didn't have to back and forth every time a new PR was merged; possibly useful for OSS-Fuzz/CI in the future. It finds bugs very, very quickly.
If you want to test for yourself, I recommend using
cargo fuzz run -s none interp_fuzzer -- -timeout=5
.