Add benchmarking using arbitrary fuzzing #465

Merged
merged 24 commits into from
Aug 20, 2023
Commits
fe9b493
Early prototyping with a typed arbitrary fuzzer (ser only so far) and…
juntyr Jul 16, 2023
a0d1672
Also fuzz the ser::PrettyConfig (indentation-excluded)
juntyr Jul 16, 2023
634f6f8
Start implementing the arbitrary typed data deserialising fuzzing
juntyr Jul 16, 2023
53e032d
Fix None inside stack of implicit Some-s
juntyr Jul 16, 2023
5715e8f
Detect problematic Some inside deserialize_any with unwrap_variant_ne…
juntyr Jul 16, 2023
513eec4
Alternative to #413: Some is explicitly not a newtype variant
juntyr Jul 16, 2023
bca27b9
Fix clippy::useless_conversion lint
juntyr Jul 17, 2023
50ecb5f
Another alternative: allow newtype variant unwrapping in deserialize_…
juntyr Jul 17, 2023
0933897
Fix PartialOrd impls for Map and Float
juntyr Jul 18, 2023
4438244
Implement arbitrary tuple struct (static field names slice FIXME) fuz…
juntyr Jul 18, 2023
dc9f90a
Fully fix Float comparison with total_ord
juntyr Jul 18, 2023
6352107
Fix clippy lints
juntyr Jul 18, 2023
6208450
Finished arbitrary struct and enum deserialisation fuzzing
juntyr Jul 18, 2023
def15a5
Create CI workflow for benchmarking
juntyr Aug 16, 2023
2bb6bdf
Fix corpus download
juntyr Aug 16, 2023
9e2448e
Fix corpus unzip
juntyr Aug 16, 2023
028cc7d
Fix corpus unzip to existing extraction directory
juntyr Aug 16, 2023
b1db368
Give benchmark the comparison branch name
juntyr Aug 16, 2023
6ff0664
Restrict the benchmark to unique cases (ty, value, ron)
juntyr Aug 16, 2023
58c624e
Add test for the Serialize identifier validation
juntyr Aug 16, 2023
5598752
Add tests for further fuzzer-found bugs
juntyr Aug 17, 2023
dbf5e6c
Add the extensive CHANGELOG entry
juntyr Aug 17, 2023
25a52c4
Add the test and changelog entry from the subsumed #413
juntyr Aug 17, 2023
93d06a7
Add an early return + more tests for the expensive newtype or tuple c…
juntyr Aug 19, 2023
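
Several of the commits above deal with comparing floats so that round-tripped values can be checked for equality. As a minimal, hedged sketch of the idea (the `TotalF32` wrapper is hypothetical and is not the type used in this PR), `f32::total_cmp` from the standard library gives a total order in which a NaN compares equal to itself and `-0.0` sorts before `0.0`:

```rust
use std::cmp::Ordering;

/// Hypothetical wrapper for illustration only, not the type used in this PR.
#[derive(Debug, Clone, Copy)]
struct TotalF32(f32);

impl PartialEq for TotalF32 {
    fn eq(&self, other: &Self) -> bool {
        // Under the total order, two floats are equal only if their bits match,
        // so a NaN equals itself while -0.0 and 0.0 stay distinct.
        self.0.total_cmp(&other.0) == Ordering::Equal
    }
}

impl Eq for TotalF32 {}

impl PartialOrd for TotalF32 {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        // Delegate to `Ord` so both impls always agree.
        Some(self.cmp(other))
    }
}

impl Ord for TotalF32 {
    fn cmp(&self, other: &Self) -> Ordering {
        self.0.total_cmp(&other.0)
    }
}
```

Deriving `PartialOrd` on a float wrapper would instead return `None` for NaN, which is why a hand-written impl is sketched here.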
28 changes: 28 additions & 0 deletions .github/workflows/cibench.yml
@@ -0,0 +1,28 @@
name: CIBench
on: [pull_request]
jobs:
  bench:
    name: Benchmark
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v3
      - uses: ./.github/actions/setup
        with:
          key: bench
      - uses: actions-rs/toolchain@v1
        with:
          toolchain: stable
          profile: minimal
          override: true
      - name: Download the corpus
        run: |
          wget https://storage.googleapis.com/ron-backup.clusterfuzz-external.appspot.com/corpus/libFuzzer/ron_arbitrary/public.zip
          mkdir -p fuzz/corpus/arbitrary
          unzip public.zip -d fuzz/corpus/arbitrary
          rm public.zip
      - uses: boa-dev/criterion-compare-action@v3
        with:
          cwd: fuzz
          benchName: bench
          branchName: ${{ github.base_ref }}
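
The benchmark step presumably reads the downloaded corpus from `fuzz/corpus/arbitrary`. The benchmark source is not shown on this page, so the following is only a sketch under that assumption: a hypothetical `load_unique_corpus` helper that reads every file in a directory and keeps the distinct byte contents. The PR restricts the benchmark to unique `(ty, value, ron)` cases, whereas this sketch deduplicates on raw bytes, which is simpler but not equivalent.

```rust
use std::collections::BTreeSet;
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: reads every regular file directly inside `dir`
/// and returns the set of distinct file contents.
fn load_unique_corpus(dir: &Path) -> io::Result<BTreeSet<Vec<u8>>> {
    let mut cases = BTreeSet::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        // Skip subdirectories and other non-file entries.
        if path.is_file() {
            // Inserting into a set drops byte-identical duplicates.
            cases.insert(fs::read(&path)?);
        }
    }
    Ok(cases)
}
```

A benchmark could call `load_unique_corpus(Path::new("corpus/arbitrary"))` once up front and then iterate over the returned set inside the timed loop. The relative path is an assumption based on `cwd: fuzz` in the workflow.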
7 changes: 7 additions & 0 deletions CHANGELOG.md
@@ -12,6 +12,13 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Add `compact_maps` and `compact_structs` options to `PrettyConfig` to allow serialising maps and structs on a single line ([#448](https://github.com/ron-rs/ron/pull/448))
- Add minimal support for `#[serde(flatten)]` with roundtripping through RON maps ([#455](https://github.com/ron-rs/ron/pull/455))
- Add minimal roundtripping support for `#[serde(tag = "tag")]`, `#[serde(tag = "tag", content = "content")]`, and `#[serde(untagged)]` enums ([#451](https://github.com/ron-rs/ron/pull/451))
- Fix parsing `r` as a self-describing struct or variant name (and not the start of a raw string) ([#465](https://github.com/ron-rs/ron/pull/465))
- Fix serialising raw strings containing a literal backslash ([#465](https://github.com/ron-rs/ron/pull/465))
- Fix serialising `None` inside a stack of nested `Option`s with `#![enable(implicit_some)]` enabled ([#465](https://github.com/ron-rs/ron/pull/465))
- [Non-API] Breaking: Treat `Some` like a newtype variant with `unwrap_variant_newtypes` ([#465](https://github.com/ron-rs/ron/pull/465))
- Fix deserialising `A('/')` into a `ron::Value` ([#465](https://github.com/ron-rs/ron/pull/465))
- Update the arbitrary fuzzer to check arbitrary serde data types, values, and `ron::ser::PrettyConfig`s ([#465](https://github.com/ron-rs/ron/pull/465))
- Add a benchmark for PRs that runs over the latest fuzzer corpus ([#465](https://github.com/ron-rs/ron/pull/465))
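
To illustrate the first fix in the list above, here is a minimal sketch of how a parser could tell a raw string (`r"..."`, `r#"..."#`) apart from an identifier that merely starts with `r`. The function `starts_raw_string` is hypothetical and is not ron's actual parser code.

```rust
/// Hypothetical check: returns true only if `input` begins a raw string
/// literal, i.e. `r`, then zero or more `#`, then an opening double quote.
fn starts_raw_string(input: &str) -> bool {
    match input.strip_prefix('r') {
        // After the `r`, skip any `#` markers and require a quote.
        Some(rest) => rest.trim_start_matches('#').starts_with('"'),
        // Anything not starting with `r` cannot be a raw string.
        None => false,
    }
}
```

With this check, a bare `r`, a struct-like `r(1)`, and a raw identifier such as `r#fn` are all rejected as raw strings and can be parsed as names instead.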

## [0.8.1] - 2023-08-17

2 changes: 1 addition & 1 deletion examples/decode_file.rs
@@ -23,7 +23,7 @@ struct Nested {

 fn main() {
     let input_path = format!("{}/examples/example.ron", env!("CARGO_MANIFEST_DIR"));
-    let f = File::open(&input_path).expect("Failed opening file");
+    let f = File::open(input_path).expect("Failed opening file");
     let config: Config = match from_reader(f) {
         Ok(x) => x,
         Err(e) => {
7 changes: 7 additions & 0 deletions fuzz/Cargo.toml
@@ -13,6 +13,8 @@ arbitrary = { version = "1.0", features = ["derive"] }
libfuzzer-sys = "0.4"
ron = { path = "..", features = ["integer128"] }
serde = { version = "1.0", features = ["derive"] }
anyhow = { version = "1.0" }
criterion = { version = "0.5" }

# Prevent this from interfering with workspaces
[workspace]
@@ -29,3 +31,8 @@ name = "arbitrary"
path = "fuzz_targets/arbitrary.rs"
test = false
doc = false

[[bench]]
name = "bench"
path = "fuzz_targets/bench/main.rs"
harness = false
222 changes: 4 additions & 218 deletions fuzz/fuzz_targets/arbitrary.rs
@@ -1,224 +1,10 @@
 #![no_main]

-use std::borrow::Cow;
-use std::fmt;
-use std::hash::{Hash, Hasher};
-use std::marker::PhantomData;
-use std::sync::atomic::{AtomicUsize, Ordering};
-
-use arbitrary::{Arbitrary, Unstructured};
 use libfuzzer_sys::fuzz_target;
-use serde::{
-    de::{MapAccess, Visitor},
-    ser::SerializeMap,
-    Deserialize, Deserializer, Serialize, Serializer,
-};

+#[path = "bench/lib.rs"]
+mod typed_data;
+
 fuzz_target!(|data: &[u8]| {
-    if let Ok(value) = SerdeData::arbitrary(&mut Unstructured::new(data)) {
-        let ron = match ron::to_string(&value) {
-            Ok(ron) => ron,
-            Err(ron::error::Error::ExceededRecursionLimit) => return,
-            Err(err) => panic!("{:?} -! {:?}", value, err),
-        };
-        let de = match ron::from_str::<SerdeData>(&ron) {
-            Ok(de) => de,
-            Err(err) if err.code == ron::error::Error::ExceededRecursionLimit => return,
-            Err(err) => panic!("{:?} -> {:?} -! {:?}", value, ron, err),
-        };
-        assert_eq!(value, de, "{:?} -> {:?} -> {:?}", value, ron, de);
-    }
+    typed_data::roundtrip_arbitrary_typed_ron_or_panic(data);
 });
-
-#[derive(Debug, PartialEq, Eq, Hash, Serialize, Deserialize, Arbitrary)]
-enum SerdeData<'a> {
-    Bool(bool),
-    I8(i8),
-    I16(i16),
-    I32(i32),
-    I64(i64),
-    I128(i128),
-    ISize(isize),
-    U8(u8),
-    U16(u16),
-    U32(u32),
-    U128(u128),
-    USize(usize),
-    F32(F32),
-    F64(F64),
-    Char(char),
-    #[serde(borrow)]
-    Str(Cow<'a, str>),
-    String(String),
-    #[serde(borrow)]
-    Bytes(Cow<'a, [u8]>),
-    ByteBuf(Vec<u8>),
-    Option(#[arbitrary(with = arbitrary_recursion_guard)] Option<Box<Self>>),
-    Unit(()),
-    #[serde(borrow)]
-    Map(#[arbitrary(with = arbitrary_recursion_guard)] SerdeMap<'a>),
-    Seq(#[arbitrary(with = arbitrary_recursion_guard)] Vec<Self>),
-    #[serde(borrow)]
-    Enum(#[arbitrary(with = arbitrary_recursion_guard)] SerdeEnum<'a>),
-    #[serde(borrow)]
-    Struct(#[arbitrary(with = arbitrary_recursion_guard)] SerdeStruct<'a>),
-}
-
-fn arbitrary_recursion_guard<'a, T: Arbitrary<'a> + Default>(
-    u: &mut Unstructured<'a>,
-) -> arbitrary::Result<T> {
-    static RECURSION_DEPTH: AtomicUsize = AtomicUsize::new(0);
-
-    let max_depth = ron::Options::default()
-        .recursion_limit
-        .map_or(256, |limit| limit * 2);
-
-    let result = if RECURSION_DEPTH.fetch_add(1, Ordering::Relaxed) < max_depth {
-        T::arbitrary(u)
-    } else {
-        Ok(T::default())
-    };
-
-    RECURSION_DEPTH.fetch_sub(1, Ordering::Relaxed);
-
-    result
-}
-
-#[allow(clippy::enum_variant_names)]
-#[derive(Debug, Default, PartialEq, Eq, Hash, Serialize, Deserialize, Arbitrary)]
-enum SerdeEnum<'a> {
-    #[default]
-    UnitVariant,
-    #[serde(borrow)]
-    NewtypeVariant(Box<SerdeData<'a>>),
-    TupleVariant(Box<SerdeData<'a>>, Box<SerdeData<'a>>, Box<SerdeData<'a>>),
-    StructVariant {
-        a: Box<SerdeData<'a>>,
-        r#fn: Box<SerdeData<'a>>,
-        c: Box<SerdeData<'a>>,
-    },
-}
-
-#[derive(Debug, PartialEq, Eq, Hash, Serialize, Deserialize, Arbitrary)]
-enum SerdeStruct<'a> {
-    Unit(SerdeUnitStruct),
-    #[serde(borrow)]
-    Newtype(SerdeNewtypeStruct<'a>),
-    #[serde(borrow)]
-    Tuple(SerdeTupleStruct<'a>),
-    #[serde(borrow)]
-    Struct(SerdeStructStruct<'a>),
-}
-
-impl<'a> Default for SerdeStruct<'a> {
-    fn default() -> Self {
-        Self::Unit(SerdeUnitStruct)
-    }
-}
-
-#[derive(Debug, PartialEq, Eq, Hash, Serialize, Deserialize, Arbitrary)]
-struct SerdeUnitStruct;
-
-#[derive(Debug, PartialEq, Eq, Hash, Serialize, Deserialize, Arbitrary)]
-#[repr(transparent)]
-struct SerdeNewtypeStruct<'a>(#[serde(borrow)] Box<SerdeData<'a>>);
-
-#[derive(Debug, PartialEq, Eq, Hash, Serialize, Deserialize, Arbitrary)]
-struct SerdeTupleStruct<'a>(
-    #[serde(borrow)] Box<SerdeData<'a>>,
-    Box<SerdeData<'a>>,
-    Box<SerdeData<'a>>,
-);
-
-#[derive(Debug, PartialEq, Eq, Hash, Serialize, Deserialize, Arbitrary)]
-struct SerdeStructStruct<'a> {
-    #[serde(borrow)]
-    a: Box<SerdeData<'a>>,
-    #[serde(borrow)]
-    r#fn: Box<SerdeData<'a>>,
-    #[serde(borrow)]
-    c: Box<SerdeData<'a>>,
-}
-
-#[derive(Debug, Serialize, Deserialize, Arbitrary)]
-#[repr(transparent)]
-struct F32(f32);
-
-impl PartialEq for F32 {
-    fn eq(&self, other: &Self) -> bool {
-        if self.0.is_nan() && other.0.is_nan() {
-            return true;
-        }
-        self.0.to_bits() == other.0.to_bits()
-    }
-}
-
-impl Eq for F32 {}
-
-impl Hash for F32 {
-    fn hash<H: Hasher>(&self, state: &mut H) {
-        state.write_u32(self.0.to_bits())
-    }
-}
-
-#[derive(Debug, Serialize, Deserialize, Arbitrary)]
-#[repr(transparent)]
-struct F64(f64);
-
-impl PartialEq for F64 {
-    fn eq(&self, other: &Self) -> bool {
-        if self.0.is_nan() && other.0.is_nan() {
-            return true;
-        }
-        self.0.to_bits() == other.0.to_bits()
-    }
-}
-
-impl Eq for F64 {}
-
-impl Hash for F64 {
-    fn hash<H: Hasher>(&self, state: &mut H) {
-        state.write_u64(self.0.to_bits())
-    }
-}
-
-#[derive(Debug, Default, PartialEq, Eq, Hash, Arbitrary)]
-struct SerdeMap<'a>(Vec<(SerdeData<'a>, SerdeData<'a>)>);
-
-impl<'a> Serialize for SerdeMap<'a> {
-    fn serialize<S: Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {
-        let mut map = serializer.serialize_map(Some(self.0.len()))?;
-
-        for (key, value) in &self.0 {
-            map.serialize_entry(key, value)?;
-        }
-
-        map.end()
-    }
-}
-
-impl<'a, 'de: 'a> Deserialize<'de> for SerdeMap<'a> {
-    fn deserialize<D: Deserializer<'de>>(deserializer: D) -> Result<Self, D::Error> {
-        struct SerdeMapVisitor<'a>(PhantomData<&'a ()>);
-
-        impl<'a, 'de: 'a> Visitor<'de> for SerdeMapVisitor<'a> {
-            type Value = SerdeMap<'a>;
-
-            fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
-                write!(formatter, "a map")
-            }
-
-            fn visit_map<A: MapAccess<'de>>(self, mut map: A) -> Result<Self::Value, A::Error> {
-                let mut values = Vec::with_capacity(map.size_hint().unwrap_or(0));
-
-                while let Some(entry) = map.next_entry()? {
-                    values.push(entry);
-                }
-
-                Ok(SerdeMap(values))
-            }
-        }
-
-        deserializer.deserialize_map(SerdeMapVisitor(PhantomData))
-    }
-}
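
The removed `arbitrary_recursion_guard` above bounds nesting depth with a static atomic counter. The same pattern can be sketched with the standard library alone. The names `with_depth_limit` and `nest` are hypothetical and exist only for illustration, and this sketch does not restore the counter if `build` panics.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Number of `with_depth_limit` calls currently on the stack.
static DEPTH: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical guard: runs `build` unless `max_depth` nested calls are
/// already active, in which case it falls back to `T::default()`.
fn with_depth_limit<T: Default>(max_depth: usize, build: impl FnOnce() -> T) -> T {
    // `fetch_add` returns the depth before this call was counted.
    let result = if DEPTH.fetch_add(1, Ordering::Relaxed) < max_depth {
        build()
    } else {
        T::default()
    };
    // Undo the increment on the way out so sibling calls see the right depth.
    DEPTH.fetch_sub(1, Ordering::Relaxed);
    result
}

/// Hypothetical recursive builder: returns how many levels were actually built.
fn nest(max_depth: usize) -> usize {
    with_depth_limit(max_depth, || 1 + nest(max_depth))
}
```

Falling back to a default value rather than returning an error keeps generation total: once the limit is reached, the generator simply stops producing deeper values.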