Add fuzzing testing #1215

Open
baentsch opened this issue May 25, 2022 · 19 comments
Labels
enhancement New feature or request

Comments

@baentsch
Member

Follow the approach taken by OpenSSL, or another one; suggestions welcome below.

@nathaniel-brough
Contributor

I'd like to suggest and champion an effort integrating liboqs with google/oss-fuzz. If you aren't familiar with it, Google offers a free (for open source) continuous fuzzing service called OSS-fuzz.

I've opened a draft pull request, #1905, that adds a super basic fuzz-testing harness. It needs a little bit of tidying before it's ready to go, but I thought I'd gauge interest before polishing it up.

The general process would look something like this.

  • Merge #1905 (Add a basic fuzz testing harness for Dilithium2).
  • Apply for integration into oss-fuzz. This is a pretty simple PR into the oss-fuzz repo that I can do on your behalf; all I'd need is a comment with approval from someone with write access to this repo.
  • Integrate the project. This includes a Dockerfile and a shell script to build liboqs in the oss-fuzz environment.
  • At that point the OSS-fuzz integration would be complete, and the fuzzer would run every night for a few hours on a distributed cluster.
  • Integrate ClusterFuzzLite. This would run a stripped-down version of oss-fuzz in GitHub Actions for X minutes (usually 10 minutes, but this is configurable) on every PR. The motivation differs slightly in that it's intended to catch shallow bugs before they are merged. In comparison, OSS-fuzz will run for significantly longer each night to attempt to find the harder-to-reach bugs (if they exist).

Let me know what your thoughts are on this :)

@baentsch
Member Author

I'd like to suggest and champion an effort integrating liboqs with google/oss-fuzz.

This would be very welcome, @silvergasp ! Thanks a million for the suggestion and apparent commitment!! As you seem to be an Independent Contributor like me (trying to establish that notion towards the corporate/LF folks :) I shall provide any possibly needed assistance with this, e.g., helping #1905 move to merge-ability, so please be sure to tag me when needed.

Let me know what your thoughts are on this :)

In a nutshell: LGTM :) Details to follow once this moves forward, I guess.

@nathaniel-brough
Contributor

This would be very welcome, @silvergasp ! Thanks a million for the suggestion and apparent commitment!! As you seem to be an Independent Contributor like me (trying to establish that notion towards the corporate/LF folks :) I shall provide any possibly needed assistance with this, e.g., helping #1905 move to merge-ability, so please be sure to tag me when needed.

Cheers mate. Yeah, I'm an independent contributor. I've gone ahead and opened a draft pull request over at oss-fuzz, google/oss-fuzz#12408, that will function as both the integration and the "application" process. Everything seems to be working well locally and the CI is passing. A few things I'll need to move that PR forward (in the order that they need to happen):

  • #1905 (Add a basic fuzz testing harness for Dilithium2) will need to be merged before the oss-fuzz PR goes through, as I've currently got the oss-fuzz configuration fetching my personal fork, which contains the fuzz harness. I'll need to swap that back to the main repository before the oss-fuzz PR is merged.
  • I need an email address for the primary_contact field. This is the address that will receive updates whenever a security vulnerability or other bug is found by oss-fuzz. It is also what you'll use to log in to the bug tracker and the dashboard for the various tools that analyse fuzzing bugs. Preferably this is a gmail/google account, otherwise you won't be able to log in to the bug tracker. I can also add other emails that will get CC'd on these bug reports, if you would like multiple people to be notified. Please note that these emails will be stored in the oss-fuzz repo in plain text.
  • I need you to comment on the oss-fuzz PR (once it's complete) saying that you approve of the integration.

I've gone ahead and polished up #1905 and I think it's ready for review. It's just a super-basic fuzzer that's mostly adapted from one of the examples. But the goal was just to get all the infrastructure in place so that more complex/useful fuzzers are possible and worth the effort.
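
For readers who have not seen one, the general shape of such a sign-then-verify harness might look roughly like the sketch below. It is not the code from #1905: `toy_sign` and `toy_verify` are hypothetical stand-ins invented for illustration so that the example compiles without liboqs. Only the entry point name `LLVMFuzzerTestOneInput` is the real libFuzzer convention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for a signature scheme; these are NOT liboqs functions. */
#define TOY_SIG_LEN 8

static void toy_sign(uint8_t sig[TOY_SIG_LEN], const uint8_t *msg, size_t len, uint8_t key) {
    memset(sig, key, TOY_SIG_LEN);
    for (size_t i = 0; i < len; i++) {
        sig[i % TOY_SIG_LEN] ^= (uint8_t)(msg[i] + (uint8_t)i);
    }
}

static int toy_verify(const uint8_t sig[TOY_SIG_LEN], const uint8_t *msg, size_t len, uint8_t key) {
    uint8_t expected[TOY_SIG_LEN];
    toy_sign(expected, msg, len, key);
    return memcmp(sig, expected, TOY_SIG_LEN) == 0;
}

/* libFuzzer entry point: called once per generated input. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    uint8_t sig[TOY_SIG_LEN];
    const uint8_t key = 0x5a; /* fixed toy key, for illustration only */

    toy_sign(sig, data, size, key);
    /* A signature made over `data` must verify against the same `data`.
       A failing assert aborts the process, which the fuzzer records as a crash. */
    assert(toy_verify(sig, data, size, key));
    return 0; /* libFuzzer expects 0 from a normal run */
}
```

In a real harness the stand-ins would be replaced by the library's own sign and verify calls, and the fuzzer would treat any memory error or failed round trip as a finding.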

@baentsch
Member Author

I need an email address for the primary_contact field.

Ideally you'd use [email protected] (listed at https://openquantumsafe.org/liboqs/security.html#reporting-security-bugs) that different people read. Please let us know if it must be a gmail account (not really ideal, though).

@dstebila
Member

I need an email address for the primary_contact field.

Ideally you'd use [email protected] (listed at https://openquantumsafe.org/liboqs/security.html#reporting-security-bugs) that different people read. Please let us know if it must be a gmail account (not really ideal, though).

We can also set up a dedicated fuzz-related alias if it's helpful.

@nathaniel-brough
Contributor

I need an email address for the primary_contact field.

Ideally you'd use [email protected] (listed at https://openquantumsafe.org/liboqs/security.html#reporting-security-bugs) that different people read. Please let us know if it must be a gmail account (not really ideal, though).

It doesn't need to be a gmail account; you just won't have access to the dashboard and will only receive email updates. The dashboard has a bunch of useful features for analysing fuzzing performance and for automatically bisecting bugs to see when they were introduced. So I'd recommend adding at least one gmail account that someone on the core team has access to. I can add as many email accounts as you like, but only the gmail ones will have full access. I know some projects set up a separate gmail account specifically for this purpose so that it's kept apart from their personal accounts, e.g. rhai.

We can also set up a dedicated fuzz-related alias if it's helpful.

This might be worthwhile, as oss-fuzz will send an email every time a fuzzer crashes; this includes both security and non-security related bugs. In some projects this can be a lot of emails, and it's often hard to triage them all without the filtering tools on the dashboard. If you use a gmail account you can configure your notifications to only receive security updates rather than every crash report.

I'll leave that up to you to decide :)

@dstebila
Member

@ryjones Do you have a preferred way to handle setting up a Gmail account for project use? As you can see in the comment above there is apparently a benefit to using a Gmail account for this fuzzing dashboard rather than just a generic email address.

@ryjones
Contributor

ryjones commented Sep 17, 2024

@ryjones Do you have a preferred way to handle setting up a Gmail account for project use? As you can see in the comment above there is apparently a benefit to using a Gmail account for this fuzzing dashboard rather than just a generic email address.

Let me ask around. I don't think I can set up a gmail/google apps account for any of the domains PQCA controls.

@nathaniel-brough
Contributor

To keep things moving along it sounds like it would be easiest to just use [email protected] for now. I can always add a list of gmail accounts in addition to the [email protected] later if/when that becomes worthwhile.

Anyone have any objections to me adding the following into the oss-fuzz integration?

  • [email protected] as the primary contact point
  • myself as a CC so that I can fine-tune/maintain the fuzzers that I'm writing

@nathaniel-brough
Contributor

This diff is approximately what this will look like. All bug reports will be sent to [email protected] AND myself. I will have limited access to the oss-fuzz dashboard, including bug and fuzzing performance information, but I will not have access to the project-admin-level clusterfuzz features, which are only available to the primary_contact (if it's a google/gmail account or linked to a github account).

@ryjones
Contributor

ryjones commented Oct 16, 2024

@dstebila to complete this on LF's side, I need a transfer code to move the domain openquantumsafe.org to LF. I can then set up a mail reflector. @hartm knows more.

@dstebila
Member

@nathaniel-brough: the [email protected] alias already exists, and currently forwards to me, @baentsch, and @christianpaquin. So it should be fine to start using it.

@ryjones: doing the domain transfer is on my to-do list, but has been low priority so I haven't gotten to it. There's some kind of LF form for this, right?

@ryjones
Contributor

ryjones commented Oct 17, 2024

@dstebila I'm not sure about that. Right now, I just need a transfer code and a zone file, plus any reflectors you want me to port.

@nathaniel-brough
Contributor

Now that #1905 is merged I've updated the integration configuration over at oss-fuzz google/oss-fuzz#12408, which is now ready to merge. I will need one of the maintainers of liboqs with merge access to review the changes in google/oss-fuzz#12408 and make a comment on the PR approving of the integration before the oss-fuzz team will proceed with the integration.

@nathaniel-brough
Contributor

Now that oss-fuzz is set up, here is my plan for fuzz-testing going forward:

  1. Fine tune the existing dilithium fuzz harness, i.e. use a deterministic random number generator so that fuzzing is reproducible.
  2. Adapt the dilithium harness so that it can be used with code-gen, so that each signature algorithm gets its own fuzz harness.
  3. Expand the surface area of fuzzing so that the public/private keys are also generated from fuzzed data, as discussed in #1905 (comment).

Beyond this I don't really have much of a plan, except to just peruse the code-base and slowly chip away until I get to 80-100% code-coverage for fuzzing. If you have any specific requests or thoughts on how I should prioritise fuzzing of specific algorithms (or groups of algorithms) let me know and I'll keep that in mind moving forward.

@SWilson4
Member

Beyond this I don't really have much of a plan, except to just peruse the code-base and slowly chip away until I get to 80-100% code-coverage for fuzzing. If you have any specific requests or thoughts on how I should prioritise fuzzing of specific algorithms (or groups of algorithms) let me know and I'll keep that in mind moving forward.

I think the ML-KEM implementation is probably highest priority. That said, extrapolating from your ongoing work in #1955, I imagine you'll be able to hit all the KEMs in one go. The most important thing (in my opinion) is that the test harness easily accommodates new and/or updated algorithms, ideally without manual maintenance.
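
One way a single harness could cover every algorithm without per-algorithm maintenance is to let the first input byte pick the algorithm at run time. The sketch below illustrates that idea only; the `kAlgs` table and `pick_algorithm` are hypothetical names, and in liboqs the list would presumably come from the library's own enumeration functions (I believe `OQS_KEM_alg_count()` and `OQS_KEM_alg_identifier()`, but check the headers) rather than a hard-coded array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical registry; a real harness would query the library instead. */
static const char *const kAlgs[] = {"alg-a", "alg-b", "alg-c"};
#define ALG_COUNT (sizeof kAlgs / sizeof kAlgs[0])

/* Use the first input byte to choose an algorithm and expose the remaining
   bytes through `rest`/`rest_len`. Returns NULL when the input is empty. */
static const char *pick_algorithm(const uint8_t *data, size_t size,
                                  const uint8_t **rest, size_t *rest_len) {
    assert(rest != NULL && rest_len != NULL);
    if (size == 0) {
        return NULL;
    }
    *rest = data + 1;
    *rest_len = size - 1;
    /* The modulo keeps every byte value valid, so newly added algorithms
       are reachable without touching the harness. */
    return kAlgs[data[0] % ALG_COUNT];
}
```

The trade-off against generating one harness per algorithm is that coverage and crash reports are shared across algorithms, which can make triage a little harder.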

@dstebila
Member

Thanks Nathaniel!

1. Fine tune the existing dilithium fuzz harness, i.e. use a deterministic random number generator so that fuzzing is reproducible.

In case you don't already know, we do have an API for swapping out the normal RNG with another function. This is used in the kat_kem and kat_sig tests for the known answer tests. You should be able to swap in your own deterministic RNG if you need to.
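
As a rough illustration, a deterministic replacement could be as small as the sketch below. The generator (SplitMix64) and the names `fuzz_rng_seed` and `fuzz_randombytes` are assumptions made for this example, not existing liboqs code. The callback has the fill-a-buffer shape `void (uint8_t *, size_t)`; I believe the registration function is `OQS_randombytes_custom_algorithm` in `oqs/rand.h`, but that call is left out so the sketch compiles without the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* State of a small deterministic generator (SplitMix64). It is not
   cryptographically secure; it only makes fuzz runs repeatable. */
static uint64_t fuzz_rng_state = 0;

static void fuzz_rng_seed(uint64_t seed) {
    fuzz_rng_state = seed;
}

static uint64_t splitmix64_next(void) {
    uint64_t z = (fuzz_rng_state += 0x9e3779b97f4a7c15ULL);
    z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
    z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
    return z ^ (z >> 31);
}

/* Fill `out` with `n` deterministic bytes. One generator step is spent per
   byte, so the byte stream does not depend on how callers chunk their reads. */
static void fuzz_randombytes(uint8_t *out, size_t n) {
    assert(out != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        out[i] = (uint8_t)(splitmix64_next() >> 56);
    }
}
```

Reseeding with the same value at the start of every fuzz iteration would make a crashing input replay identically.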

@nathaniel-brough
Contributor

In case you don't already know, we do have an API for swapping out the normal RNG with another function. This is used in the kat_kem and kat_sig tests for the known answer tests. You should be able to swap in your own deterministic RNG if you need to.

Cool, I'm working on a fuzzing optimised random(ish) number generator. I'm trying to estimate the amount of "random data" that will be used per algorithm so that I can partition out a section of the fuzzed data to drive the random(ish) number generator. As a naive guess (with minimal understanding of cryptography) I'm just using public_key_size+private_key_size. Is this a reasonable estimation or am I way off?
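
The partitioning idea can be sketched as below: the leading bytes of the fuzz input are reserved for the RNG and the remainder is treated as payload. The `fuzz_split` struct, the function name, and the `rng_budget` parameter are all hypothetical; how large the budget should be is exactly the open question in this comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two views into one fuzz input; no bytes are copied. */
typedef struct {
    const uint8_t *rng_bytes;
    size_t rng_len;
    const uint8_t *payload;
    size_t payload_len;
} fuzz_split;

/* Reserve up to `rng_budget` leading bytes for the RNG; the rest is payload.
   `rng_budget` is a tuning knob chosen by the harness author (an assumption). */
static fuzz_split split_fuzz_input(const uint8_t *data, size_t size, size_t rng_budget) {
    fuzz_split s;
    assert(data != NULL || size == 0);
    s.rng_len = size < rng_budget ? size : rng_budget;
    s.rng_bytes = data;
    /* Avoid pointer arithmetic on a possibly NULL pointer for empty input. */
    s.payload = (size > 0) ? data + s.rng_len : data;
    s.payload_len = size - s.rng_len;
    return s;
}
```

If the input is shorter than the budget, everything goes to the RNG and the payload is empty, so the harness needs a policy for what the RNG returns once its bytes run out.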

@dstebila
Member

In case you don't already know, we do have an API for swapping out the normal RNG with another function. This is used in the kat_kem and kat_sig tests for the known answer tests. You should be able to swap in your own deterministic RNG if you need to.

Cool, I'm working on a fuzzing optimised random(ish) number generator. I'm trying to estimate the amount of "random data" that will be used per algorithm so that I can partition out a section of the fuzzed data to drive the random(ish) number generator. As a naive guess (with minimal understanding of cryptography) I'm just using public_key_size+private_key_size. Is this a reasonable estimation or am I way off?

In general I don't think there will be any correlation between public key + private key size and the amount of randomness consumed by an implementation. Moreover, some PQC implementations that need lots of randomness have decided to go the route of reading 32 bytes of randomness and expanding that seed internally, while others keep reading from the RNG. One could go through the implementations we have in the library now and figure out how much each of them is currently reading, but there's no guarantee that wouldn't change. If you want to partition the output of the RNG, then maybe one way of doing it would be to use a PRF with relevant context labels/counters to generate the randomness needed for different calls.
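
A minimal sketch of the label/counter idea follows. Each RNG request is answered from a stream that depends only on a seed, a label for the operation, and a call counter, so it no longer matters how many bytes any earlier call consumed. The mixing function here is the SplitMix64 finalizer, used as a non-cryptographic stand-in for a real PRF (which for fuzzing purposes may be sufficient); `derive_bytes` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* SplitMix64 finalizer used as a mixing function. NOT a cryptographic PRF. */
static uint64_t mix64(uint64_t z) {
    z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
    z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
    return z ^ (z >> 31);
}

/* Fill `out` with `n` bytes that depend only on (seed, label, call_counter).
   `label` names the operation (e.g. keygen vs. sign) and `call_counter`
   distinguishes repeated RNG calls within that operation. */
static void derive_bytes(uint64_t seed, uint32_t label, uint32_t call_counter,
                         uint8_t *out, size_t n) {
    assert(out != NULL || n == 0);
    const uint64_t domain = ((uint64_t)label << 32) | call_counter;
    const uint64_t key = mix64(seed ^ mix64(domain));
    for (size_t i = 0; i < n; i++) {
        out[i] = (uint8_t)(mix64(key + 0x9e3779b97f4a7c15ULL * (uint64_t)(i + 1)) >> 56);
    }
}
```

With this layout the fuzzer only has to supply the seed (for example, 8 bytes of the input), and a change in how much randomness one algorithm reads does not shift the bytes seen by later calls.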
