[Merged by Bors] - Parser Idempotency Fuzzer #2400

Closed
wants to merge 15 commits into from

Conversation

addisoncrump
Contributor

This Pull Request offers a fuzzer which is capable of detecting faults in the parser and interner. It does so by ensuring that the parsed AST remains the same between a parsed source and the result of parsing the `to_interned_string` result of the first parsed source.

It changes the following:

  • Adds a fuzzer for the parser and interner.

Any issues I raise in association with this fuzzer will link back to this fuzzer.

You may run the fuzzer using the following commands:

$ cd boa_engine
$ cargo +nightly fuzz run -s none parser-idempotency
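
To illustrate the round-trip property described above, here is a minimal, self-contained sketch. It does not use Boa's parser or AST: `Expr`, `print`, `parse`, and `round_trips` are hypothetical stand-ins for a toy grammar, with `print` playing the role of `to_interned_string`.

```rust
/// Toy AST; a stand-in for illustration, not one of Boa's types.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Ident(char),
    Add(Box<Expr>, Box<Expr>),
}

/// Serialises an expression with explicit parentheses
/// (toy analogue of `to_interned_string`).
fn print(expr: &Expr) -> String {
    match expr {
        Expr::Ident(c) => c.to_string(),
        Expr::Add(l, r) => format!("({}+{})", print(l), print(r)),
    }
}

/// Parses the grammar `expr := [a-z] | '(' expr '+' expr ')'`.
fn parse(src: &str) -> Result<Expr, String> {
    let chars: Vec<char> = src.chars().collect();
    let mut pos = 0;
    let expr = parse_expr(&chars, &mut pos)?;
    if pos != chars.len() {
        return Err(format!("trailing input at {}", pos));
    }
    Ok(expr)
}

fn parse_expr(chars: &[char], pos: &mut usize) -> Result<Expr, String> {
    match chars.get(*pos).copied() {
        Some(c) if c.is_ascii_lowercase() => {
            *pos += 1;
            Ok(Expr::Ident(c))
        }
        Some('(') => {
            *pos += 1;
            let lhs = parse_expr(chars, pos)?;
            expect(chars, pos, '+')?;
            let rhs = parse_expr(chars, pos)?;
            expect(chars, pos, ')')?;
            Ok(Expr::Add(Box::new(lhs), Box::new(rhs)))
        }
        other => Err(format!("unexpected {:?} at {}", other, *pos)),
    }
}

/// Consumes `want` at the current position or reports an error.
fn expect(chars: &[char], pos: &mut usize, want: char) -> Result<(), String> {
    if chars.get(*pos) == Some(&want) {
        *pos += 1;
        Ok(())
    } else {
        Err(format!("expected {:?} at {}", want, *pos))
    }
}

/// Prints the AST, reparses the output, and reports whether the
/// reparsed AST equals the original. An `Err` means the printed
/// form could not be parsed at all.
fn round_trips(ast: &Expr) -> Result<bool, String> {
    let printed = print(ast);
    let reparsed = parse(&printed)?;
    Ok(&reparsed == ast)
}
```

In this sketch a fuzzer would generate `Expr` values and flag any input for which `round_trips` returns `Ok(false)` or `Err(..)`. For example, `Expr::Ident('+')` prints as `+`, which the toy parser rejects, so it would be reported as a fault.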

@codecov

codecov bot commented Nov 2, 2022

Codecov Report

Merging #2400 (0f4223c) into main (b88736a) will decrease coverage by 0.04%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##             main    #2400      +/-   ##
==========================================
- Coverage   38.74%   38.70%   -0.05%     
==========================================
  Files         313      314       +1     
  Lines       23856    23883      +27     
==========================================
  Hits         9244     9244              
- Misses      14612    14639      +27     
| Impacted Files | Coverage Δ |
|---|---|
| boa_ast/src/declaration/mod.rs | 44.00% <ø> (ø) |
| boa_ast/src/declaration/variable.rs | 46.47% <ø> (ø) |
| boa_ast/src/expression/access.rs | 26.92% <ø> (ø) |
| boa_ast/src/expression/await.rs | 36.36% <ø> (ø) |
| boa_ast/src/expression/call.rs | 33.33% <ø> (ø) |
| boa_ast/src/expression/identifier.rs | 11.76% <ø> (ø) |
| boa_ast/src/expression/literal/array.rs | 19.44% <ø> (ø) |
| boa_ast/src/expression/literal/mod.rs | 22.58% <ø> (ø) |
| boa_ast/src/expression/literal/object.rs | 17.47% <ø> (ø) |
| boa_ast/src/expression/literal/template.rs | 9.67% <ø> (ø) |

... and 42 more

boa_engine/fuzz/Cargo.toml (outdated, resolved)
Comment on lines 10 to 15
pub struct FuzzData {
    pub context: Context,
    pub ast: StatementList,
}
Member

Even if the usage of the structure could be considered straightforward, adding some documentation could be useful.

Comment on lines 18 to 23
let mut syms_available = Vec::with_capacity(8);
for c in 'a'..='h' {
    syms_available.push(context.interner_mut().get_or_intern(&*String::from(c)));
}
Member

Any reason to only have these symbols? Also, I see a TODO below requesting arbitrary string literals.

Contributor Author

Yes -- this creates a fixed pool of symbols to use in the AST. The AST, when generated, just throws random symbol indices in; we need to fix them to well-known (non-keyword) items in the symbol interner. Additionally, we don't want to generate them dynamically from the fuzz data because this can introduce "noise" into our sample (basically, bytes used previously to make symbols now form part of the AST or vice versa, unaligning the AST-generating bytes). This is also the reason for the TODO -- string literals from the arbitrary data could introduce quite a bit of undesirable noise, but not doing so means we don't test string parsing/interning. I decided not to for this PR because it could introduce so much noise that it would render this unusable.
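
A rough sketch of the fixed-pool idea, for readers unfamiliar with it. This is not the code in this PR: `ToyInterner`, `build_pool`, and `pin_symbol` are hypothetical names, and reducing the raw index modulo the pool size is an assumed way of pinning symbols that may differ from what the fuzzer actually does.

```rust
use std::collections::HashMap;

/// Minimal string interner; a stand-in, not Boa's `Interner`.
#[derive(Default)]
struct ToyInterner {
    ids: HashMap<String, usize>,
    names: Vec<String>,
}

impl ToyInterner {
    /// Returns the id for `s`, interning it first if it is new.
    fn get_or_intern(&mut self, s: &str) -> usize {
        if let Some(&id) = self.ids.get(s) {
            return id;
        }
        let id = self.names.len();
        self.names.push(s.to_owned());
        self.ids.insert(s.to_owned(), id);
        id
    }

    /// Looks up the string behind an id, if the id is known.
    fn resolve(&self, id: usize) -> Option<&str> {
        self.names.get(id).map(String::as_str)
    }
}

/// Interns the non-keyword identifiers `a` through `h` and returns their ids.
fn build_pool(interner: &mut ToyInterner) -> Vec<usize> {
    ('a'..='h')
        .map(|c| interner.get_or_intern(&c.to_string()))
        .collect()
}

/// Maps an arbitrary fuzzer-chosen index onto the fixed pool, so every
/// symbol in the generated AST resolves to a well-known identifier.
/// Assumes `pool` is non-empty (an empty pool would panic here).
fn pin_symbol(raw: usize, pool: &[usize]) -> usize {
    pool[raw % pool.len()]
}
```

Because the pool is built once, up front, no bytes of the fuzz input are spent on creating symbols, which is the "noise" concern described above.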

use std::error::Error;
use std::io::Cursor;

fn do_fuzz(mut data: FuzzData) -> Result<(), Box<dyn Error>> {
Member

Even if the usage of the function could be considered straightforward, adding some documentation could be useful. A link to the libfuzzer documentation or to the cargo-fuzz documentation where this usage is explained would be nice.

@Razican added this to the v0.17.0 milestone Nov 4, 2022
@Razican added the enhancement, parser, and test labels Nov 4, 2022
@Razican
Member

Razican commented Nov 4, 2022

The merge conflict is just because the ast module was moved to its own crate, but the files themselves should be pretty similar.

Member

@jedel1043 jedel1043 left a comment

I think it should be better to lift the fuzz directory into the project root, then have a subdirectory for every type of fuzzer.

@addisoncrump
Contributor Author

> I think it should be better to lift the fuzz directory into the project root, then have a subdirectory for every type of fuzzer.

I'm not sure if this is possible for cargo-fuzz, but I'll see. I seem to remember it having some trouble executing from a virtual workspace root.

@jedel1043
Member

You should be able to easily. For example, on gfx-extras they do precisely that :)

@addisoncrump
Contributor Author

Yup -- the issue is that you cannot init the fuzzer directory in the virtual workspace root, but you can move it later. Interesting.

Member

@jedel1043 jedel1043 left a comment

Very nice work!

boa_ast/Cargo.toml (outdated, resolved)
boa_interner/src/sym.rs (outdated, resolved)
fuzz/Cargo.toml (outdated, resolved)
fuzz/Cargo.toml (outdated, resolved)
@addisoncrump
Contributor Author

Perhaps in the future we could put this into CI? It catches problems very fast.

@jedel1043
Member

jedel1043 commented Nov 6, 2022

> Perhaps in the future we could put this into CI? It catches problems very fast.

Yeah! Though, we'll need a proper CI platform for it. We can't run it on every PR because it could throw random errors at any time. A CI action that runs it every hour or so would be nice.

Member

@jedel1043 jedel1043 left a comment

You should rebase this. I recently pushed a change that extracted the parser from the engine into a crate, and the API of the Parser type changed slightly; it now requires only a &mut Interner reference to work.

This change should also speed up the fuzzer, because it no longer has to initialize the builtins in order to parse.

@addisoncrump
Contributor Author

>> Perhaps in the future we could put this into CI? It catches problems very fast.
>
> Yeah! Though, we'll need a proper CI platform for it. We can't run it on every PR because it could throw random errors at any time. A CI action that runs it every hour or so would be nice.

I can ask if this would be appropriate for OSS-Fuzz?

@jedel1043
Member

Maybe! We would need to apply for it though.
cc @jasonwilliams to hear his opinion on this

Contributor

@RageKnify RageKnify left a comment

Ty for the awesome contribution and your perseverance having had to redo it. ❤️

fuzz/fuzz_targets/common.rs (outdated, resolved)
Member

@jedel1043 jedel1043 left a comment

No more nitpicking 😆 Great job!

@jedel1043
Member

bors r+

@bors

This comment was marked as outdated.

@jedel1043
Member

bors r+

bors bot pushed a commit that referenced this pull request Nov 6, 2022
Co-authored-by: Addison Crump <[email protected]>
@bors

bors bot commented Nov 6, 2022

Pull request successfully merged into main.

Build succeeded:

@bors bors bot changed the title Parser Idempotency Fuzzer [Merged by Bors] - Parser Idempotency Fuzzer Nov 6, 2022
@bors bors bot closed this Nov 6, 2022
@jedel1043 jedel1043 linked an issue Nov 8, 2022 that may be closed by this pull request
bors bot pushed a commit that referenced this pull request Nov 15, 2022
This Pull Request offers a basic VM fuzzer which relies on implied oracles (namely, "does it crash or timeout?").

It changes the following:

- Adds an insns_remaining field to Context, denoting the number of instructions remaining to execute (only available when fuzzing)
- Adds a JsNativeError variant, denoting when the number of instructions has been exceeded (only available when fuzzing)
- Adds a VM fuzzer which looks for cases where Boa may crash on an input

This offers no guarantees about correctness; it only detects crashes, assertion violations, and timeouts. Depends on #2400.
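
The instruction-budget idea can be sketched with a toy interpreter. Nothing below is Boa's VM: `Insn`, `VmError`, and `ToyVm` are hypothetical, and only the field name `insns_remaining` is borrowed from the description above. The point is that a program which would otherwise loop forever ends with a distinguishable error instead of a fuzzer timeout.

```rust
/// Toy instruction set; not Boa's bytecode.
#[derive(Debug, Clone, Copy)]
enum Insn {
    Push(i64),
    Add,
    /// Unconditional jump to an absolute instruction index.
    Jump(usize),
}

#[derive(Debug, PartialEq)]
enum VmError {
    /// Analogue of a fuzzing-only "instruction limit exceeded" error.
    BudgetExhausted,
    StackUnderflow,
}

struct ToyVm {
    stack: Vec<i64>,
    /// Number of instructions this VM may still execute.
    insns_remaining: usize,
}

impl ToyVm {
    fn new(budget: usize) -> Self {
        ToyVm { stack: Vec::new(), insns_remaining: budget }
    }

    /// Runs `program` until it falls off the end or the budget runs out.
    /// Returns the top of the stack, if any, on normal completion.
    fn run(&mut self, program: &[Insn]) -> Result<Option<i64>, VmError> {
        let mut pc = 0;
        while let Some(&insn) = program.get(pc) {
            // Charge one unit of budget per executed instruction.
            if self.insns_remaining == 0 {
                return Err(VmError::BudgetExhausted);
            }
            self.insns_remaining -= 1;
            pc += 1;
            match insn {
                Insn::Push(v) => self.stack.push(v),
                Insn::Add => {
                    let b = self.stack.pop().ok_or(VmError::StackUnderflow)?;
                    let a = self.stack.pop().ok_or(VmError::StackUnderflow)?;
                    self.stack.push(a.wrapping_add(b));
                }
                Insn::Jump(target) => pc = target,
            }
        }
        Ok(self.stack.last().copied())
    }
}
```

A fuzz harness built on this sketch would treat `BudgetExhausted` as an expected outcome and report only panics, which matches the "does it crash or timeout?" oracle.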

Any issues I raise in association with this fuzzer will link back to this fuzzer.

You may run the fuzzer using the following commands:
```bash
$ cd boa_engine
$ cargo +nightly fuzz run -s none vm-implied
```

Co-authored-by: Addison Crump <[email protected]>
Labels
enhancement New feature or request parser Issues surrounding the parser test Issues and PRs related to the tests.
Development

Successfully merging this pull request may close these issues.

Catching lexer/parser bugs with fuzzed input
4 participants