OnDiskCorpus files be configurable to contain a human readable representation of the input #2538
Comments
I don't fully understand: The OnDiskCorpus will contain the "content of the testcase/input that triggered the inputs" - that's what it's for, right? That being said, currently the correct(tm) way to add metadata to a Testcase is via custom Feedbacks that do nothing else, like here: LibAFL/libafl/src/feedbacks/stdio.rs Line 107 in e370e2f
|
Yes, the corpus will contain everything, of course. But it isn't written to disk, so when I kill the fuzzer, I lose everything but the metadata (found in the |
Why is the _OnDisk_Corpus not written to disk? |
Ah, I see, seems like I missed something. If I understand correctly, the input content is serialised and written to disk in this method on
/// Write this input to the file
fn to_file<P>(&self, path: P) -> Result<(), Error>
where
    P: AsRef<Path>,
{
    write_file_atomic(path, &postcard::to_allocvec(self)?)
}
When initialising the corpus, a format can be passed, and while this leaves the metadata nicely formatted, the input itself is still serialised (in postcard's binary format) and thus not human readable.
OnDiskCorpus::with_meta_format(
    PathBuf::from("./crashes"),
    OnDiskMetadataFormat::JsonPretty,
)
.unwrap(),
So I guess I'm asking for an option for human-readable serialisation of the input when written to disk. |
I guess I could also just implement this for my input, so a global option may not be strictly necessary, but it would still be nice, just for consistency. |
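Implementing this for a single input type could look roughly like the sketch below. It uses only the standard library, and everything in it is an illustrative assumption rather than LibAFL's API: `StructuredInput` is a made-up input type, `to_pretty_file` is a hypothetical stand-in for an overridden `to_file`, and the temp-file-then-rename step only approximates what `write_file_atomic` presumably does.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical structured input; stands in for a fuzzer's own input type.
#[derive(Debug, Clone, PartialEq)]
struct StructuredInput {
    opcode: u8,
    payload: Vec<u8>,
}

impl StructuredInput {
    /// Write a human-readable rendering of the input to `path`.
    /// The text goes to a sibling temp file first and is then renamed into place,
    /// so a killed fuzzer does not leave a half-written file under the final name.
    fn to_pretty_file<P: AsRef<Path>>(&self, path: P) -> io::Result<()> {
        let path = path.as_ref();
        let tmp = path.with_extension("tmp");
        let mut file = fs::File::create(&tmp)?;
        // `{:#?}` is the pretty-printed Debug form: one field per line.
        writeln!(file, "{:#?}", self)?;
        file.sync_all()?;
        fs::rename(&tmp, path)
    }
}
```

With a serde-based input, the same shape would apply with a text serializer in place of the Debug formatting; the trade-off is larger files and slower writes than the binary format.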
Related question: All input types in the repo (at least as far as I can see) generate their testcase names ( Should there not just be a blanket implementation that does this for any input that implements |
For a human-readable serialization there is the |
Yes, this kind of does what I would want it to do, but
Depending on how large your corpus gets and the change-rate within it, the first point may range from annoying to a considerable downside. The second is not critical, just a bit of extra code, would just be easier without it :) Plus I would expect this kind of functionality in the corpus, especially |
Feel free to fix the first point :) Open for other suggestions of course. |
you can use |
Most fuzzers will likely use some form of OnDiskCorpus (incl. InMemoryOnDiskCorpus, CachedOnDiskCorpus, etc.) for their solutions. To then figure out what the problem actually was, one would need to know the content of the testcase/input that triggered the feedbacks. Currently, corpora storing them on disk store a bunch of generic information in the file associated with the testcase/input (such as runtime), but no representation of the input.
The only way to add this without resorting to writing dummy feedbacks that do nothing but add a new metadata entry with the input content is to implement the filename-generating function on the input so that it extracts the testcase from the corpus and somehow stringifies it:
However, file names have a length restriction, so this isn't usable for inputs that can get somewhat long. Plus, for structured inputs, it would be much easier to have the entire structure nicely formatted in the file.
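One way around the length restriction is to keep the file name short and fixed-length and leave the readable form for the file body. The following is a minimal sketch of the naming half only; `short_name` is a hypothetical helper, not a LibAFL function, and it assumes the input type implements `Hash`. Note that the algorithm behind `DefaultHasher` is not guaranteed to stay the same across Rust releases, so names produced this way should not be treated as stable identifiers.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical helper: derive a fixed-length file name from any hashable input.
fn short_name<I: Hash>(input: &I) -> String {
    let mut hasher = DefaultHasher::new();
    input.hash(&mut hasher);
    // The 64-bit hash is rendered as exactly 16 hex digits, regardless of input size.
    format!("{:016x}", hasher.finish())
}
```

A 10,000-byte input and a 2-byte input both map to a 16-character name, so the name length no longer depends on the input.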