-
Notifications
You must be signed in to change notification settings - Fork 712
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ci: run fuzz tests in parallel and generate coverage report #4960
ci: run fuzz tests in parallel and generate coverage report #4960
Conversation
run all fuzz tests fix syntax error run tests in parallel modify coverage summary modify coverage output update comment simplify output test shorter runtime remove comments
25b9682
to
224a017
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor callout; CodeBuild job names and batch status, to date, have not been update-able; (the job will need to be deleted and recreated)
@@ -77,12 +33,20 @@ phases: | |||
cmake . -Bbuild \ | |||
-DCMAKE_PREFIX_PATH=/usr/local/$S2N_LIBCRYPTO \ | |||
-DS2N_FUZZ_TEST=on \ | |||
-DFUZZ_TIMEOUT_SEC=27000 | |||
-DFUZZ_TIMEOUT_SEC=27000 \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making this an environment variable makes it easy to over-ride later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I had to make changes to this line to test for shorter duration. Will change this to an env var
# --timeout: override ctest's default timeout of 1500 | ||
- cmake --build build/ --target test -- ARGS="-L fuzz -R ${FUZZ_TESTS} --output-on-failure --timeout 28800" | ||
- cmake --build build/ --target test -- ARGS="-L fuzz --output-on-failure -j $(nproc) --timeout 28800" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above; if timeout is an env. var we can over-ride it later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I suspect we won't be running fuzz tests for more than 28800 secs though. I'm also not exactly sure how I make this into env, probably something like this?
cmake --build build/ --target test -- ARGS="-L fuzz --output-on-failure -j $(nproc) --timeout $TIMEOUT_VALUE"
artifacts: | ||
# upload all files in the fuzz_coverage_report directory | ||
files: | ||
- '**/*' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The broad file mask should keep this from failing, but have you tested the use of artifact uploads? (they have been flaky).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This wildcard match looks a little bit suspicious to me. Won't this upload everything? I'd expect the pattern to be something more like fuzz_coverage_report/*
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
**/*
means all files recursively, from the base-directory
which is currently set to fuzz_coverage_report
.
The generated report contains many nested folders so we want to make sure to include those.
documentation: https://docs.aws.amazon.com/codebuild/latest/userguide/build-spec-ref.html#:~:text=artifacts/base%2Ddirectory
Alternatively I could probably just specify fuzz_coverage_report/**/*
without the base-directory
option, but wanted to keep consistency with how mainline coverage specifies it:
s2n-tls/codebuild/spec/buildspec_unit_coverage.yml
Lines 41 to 45 in aaae641
artifacts: | |
# upload all files in the coverage_report directory | |
files: | |
- '**/*' | |
base-directory: coverage_report |
Either way, I will test this to confirm the behavior, and paste the result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, I just missed the "base-directory" field 🤦
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested uploading coverage files to S3 bucket and is working. I uploaded a folder test-fuzz-coverage
to s2n-build-artifacts
bucket and you should be able to see the report there
tests/fuzz/runFuzzTest.sh
Outdated
@@ -31,6 +31,9 @@ FUZZ_TIMEOUT_SEC=$2 | |||
BUILD_DIR_PATH=$3 | |||
CORPUS_UPLOAD_LOC=$4 | |||
ARTIFACT_UPLOAD_LOC=$5 | |||
FUZZ_COVERAGE=$6 | |||
S2N_ROOT=$7 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you using this variable anywhere? It's sort of a historical holddover from the Makefiles... and between CodeBuild and CMake, there should be existing equivalents defined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ya, 7 CLI arguments is a lot, and I worry about the maintainability of this.
How can we make things simpler?
1 - Remove Unnecessary Options
My understanding is that right now BUILDER_DIR_PATH
, CORPUS_UPLOAD_LOC
, and ARTIFACT_UPLOAD_LOC
are never actually set to anything else. And I'd imagine that if they were set to something else, a whole host of things would break. That's a good sign that these aren't actually configurable, and are probably better off being represented as constant values.
2 - Separation of Concerns
Generally it's best to "separate concerns". Don't unnecessarily couple things that don't need to be coupled. This applies to shell scripts that same way that it applies to actual code
// tricky and complex, probably not the best option
fn run_fuzz_tests_and_generate_coverage_report(arg1, arg2, arg3, arg4, arg5, arg6, arg6)
// a bit more manageable
fn run_fuzz_tests(arg1, arg2, arg3, arg4)
fn generate_coverage_report(arg5, arg6, arg7)
I'd imagine an ideal "coverage pipeline" to look something like this
- build the fuzz test binaries with instrumentation enabled
- download the corpus files
- run the fuzz tests (which is mostly agnostic as to whether or not coverage is enabled)
- upload the improved corpus files
- merge the coverage information/generate the report
- upload the report
It seems odd that all of the coverage reporting is living inside a script called runFuzzTests
. Perhaps we could refactor coverage_report.sh to be slightly more generic, and then everything could use the same script?
3 - Scrutinize Coverage Assertions
I like the idea of asserting on some sort of coverage information. "feature coverage" from the fuzzer seems very nice and easy to assert on.
Asserting on coverage from the actual report seems to requires a big blob of sed
/grep
/regex
all squished together, and that is not sparking joy.
What benefit does this complicated stuff provide that feature coverage doesn't? And are our thresholds/assertions well-researched enough to actually be useful? I know we found that session ticket coverage issue when manually reviewing the code, so I question the value of these more complex assertions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe I should be able to remove BUILD_DIR_PATH
and just give a hardcoded path to the build directory.
But CORPUS_UPLOAD_LOC
, and ARTIFACT_UPLOAD_LOC
, are s3 bucket locations (For example, "s3:://bucketName"). There is probably no security concerns to make it public, as it requires correct IAM roles to be able to access the bucket. Is it fine making such resource info into hard-coded const?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did some refactoring so it only takes 3 CLI arguments instead of 7.
Since runFuzzTest.sh
is very large and hard to read, I have broken it into smaller components which are:
runFuzzTest.sh: responsible for running the actual fuzz test
fuzz_corpus_download.sh: responsible for downloading corpus files from S3
fuzz_corpus_upload.sh: responsible for uploading improved corpus files to S3
fuzz_coverage_report.sh: responsible for generating coverage information
On top of that, I removed couple of print statements that were hidden by CTest. I left the outputs that are useful for debugging in case fuzzing fails.
tests/fuzz/calcTotalCov.sh
Outdated
LINE_COV=$(echo $S2N_COV | cut -d' ' -f1) | ||
FUNC_COV=$(echo $S2N_COV | cut -d' ' -f2) | ||
BRANCH_COV=$(echo $S2N_COV | cut -d' ' -f3) | ||
printf "Coverage Report:\nLine coverage: %s\nFunction coverage: %s\nBranch coverage: %s\n" "$LINE_COV" "$FUNC_COV" "$BRANCH_COV" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These numbers are fuzz coverage for the entire repo(I assume). However, we don't fuzz test the entire repo. I would look for a more meaningful metric, like coverage of a single fuzz tested function. Otherwise it's probably better to leave this out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is the coverage stats for the entire repo.
Hmm yeah the numbers themselves are probably not very valuable without being able to see the details. I can remove this from the output. We will have access to these numbers from the report anyways.
For getting coverage of a single fuzz tested function, I'm not too sure how it will look like, but my guess is that each fuzz test has its own targeted functions (for example, s2n_deserialize_resumption_state_test
is targeting s2n_deserialize_resumption_state
) and we want to see how much of the targeted functions are covered by that test.
I'm afraid adding that might complicate this PR, so I will have a follow-up PR after this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created an issue for this: #4966
tests/fuzz/runFuzzTest.sh
Outdated
@@ -46,7 +49,7 @@ ASAN_OPTIONS+="symbolize=1" | |||
LSAN_OPTIONS+="log_threads=1" | |||
UBSAN_OPTIONS+="print_stacktrace=1" | |||
NUM_CPU_THREADS=$(nproc) | |||
LIBFUZZER_ARGS+="-timeout=5 -max_len=4096 -print_final_stats=1 -jobs=${NUM_CPU_THREADS} -workers=${NUM_CPU_THREADS} -max_total_time=${FUZZ_TIMEOUT_SEC}" | |||
LIBFUZZER_ARGS+="-timeout=5 -max_len=4096 -print_final_stats=1 -workers=${NUM_CPU_THREADS} -max_total_time=${FUZZ_TIMEOUT_SEC}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LIBFUZZER_ARGS+="-timeout=5 -max_len=4096 -print_final_stats=1 -workers=${NUM_CPU_THREADS} -max_total_time=${FUZZ_TIMEOUT_SEC}" | |
LIBFUZZER_ARGS+="-timeout=5 -max_len=4096 -print_final_stats=1 -max_total_time=${FUZZ_TIMEOUT_SEC}" |
The workers
arg should not longer be needed (because everything is single threaded/run on a single core).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed this in the latest commit
CMakeLists.txt
Outdated
@@ -696,6 +696,12 @@ if (BUILD_TESTING) | |||
set(ARTIFACT_UPLOAD_LOC "none") | |||
endif() | |||
|
|||
if(COVERAGE) | |||
set(FUZZ_COVERAGE "true") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason that we can't just use COVERAGE
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, I can just use COVERAGE
😓. Thank you for pointing it out
artifacts: | ||
# upload all files in the fuzz_coverage_report directory | ||
files: | ||
- '**/*' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This wildcard match looks a little bit suspicious to me. Won't this upload everything? I'd expect the pattern to be something more like fuzz_coverage_report/*
.
tests/fuzz/runFuzzTest.sh
Outdated
@@ -31,6 +31,9 @@ FUZZ_TIMEOUT_SEC=$2 | |||
BUILD_DIR_PATH=$3 | |||
CORPUS_UPLOAD_LOC=$4 | |||
ARTIFACT_UPLOAD_LOC=$5 | |||
FUZZ_COVERAGE=$6 | |||
S2N_ROOT=$7 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ya, 7 CLI arguments is a lot, and I worry about the maintainability of this.
How can we make things simpler?
1 - Remove Unnecessary Options
My understanding is that right now BUILDER_DIR_PATH
, CORPUS_UPLOAD_LOC
, and ARTIFACT_UPLOAD_LOC
are never actually set to anything else. And I'd imagine that if they were set to something else, a whole host of things would break. That's a good sign that these aren't actually configurable, and are probably better off being represented as constant values.
2 - Separation of Concerns
Generally it's best to "separate concerns". Don't unnecessarily couple things that don't need to be coupled. This applies to shell scripts that same way that it applies to actual code
// tricky and complex, probably not the best option
fn run_fuzz_tests_and_generate_coverage_report(arg1, arg2, arg3, arg4, arg5, arg6, arg6)
// a bit more manageable
fn run_fuzz_tests(arg1, arg2, arg3, arg4)
fn generate_coverage_report(arg5, arg6, arg7)
I'd imagine an ideal "coverage pipeline" to look something like this
- build the fuzz test binaries with instrumentation enabled
- download the corpus files
- run the fuzz tests (which is mostly agnostic as to whether or not coverage is enabled)
- upload the improved corpus files
- merge the coverage information/generate the report
- upload the report
It seems odd that all of the coverage reporting is living inside a script called runFuzzTests
. Perhaps we could refactor coverage_report.sh to be slightly more generic, and then everything could use the same script?
3 - Scrutinize Coverage Assertions
I like the idea of asserting on some sort of coverage information. "feature coverage" from the fuzzer seems very nice and easy to assert on.
Asserting on coverage from the actual report seems to requires a big blob of sed
/grep
/regex
all squished together, and that is not sparking joy.
What benefit does this complicated stuff provide that feature coverage doesn't? And are our thresholds/assertions well-researched enough to actually be useful? I know we found that session ticket coverage issue when manually reviewing the code, so I question the value of these more complex assertions.
- split report generation logic from fuzz runner - remove unused output - use less arguments modify script name use correct file name refactor into smaller components add execute permissions suppress noisy output fix path to profraw fix profraw path fix profraw gen
2fc5827
to
ce73ffa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few comments, but overall a big improvement from the mega script 🥳
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With @jmayclin's changes...
- remove unset LD_PRELOADS - fix script name in the usage - use cmake to configure compiler - remove unused options
Release Summary:
Resolved issues:
N/A
Description of changes:
There are two main changes in this PR:
Use LLVM's coverage tool to generate a coverage report for the scheduled fuzz test. The report is in HTML format, allowing viewers to see line coverage, function coverage, and branch coverage for the s2n-tls library. This is enabled only for the scheduled fuzz job, which uses improved corpus files. Fuzz tests that run against each PR do not use the improved corpus files, so they do not accurately represent the coverage we achieve.
Currently we are running each fuzz test with multi-threads. We have data to indicate it does not effectively increase code coverage. Therefore we should run them in parallel with single core, which achieves as much code coverage as running them with multi-threads.
This is what a sample coverage looks like (generated locally):
runFuzzTest.sh
into smaller componentsCurrently
runFuzzTest.sh
script has too many logics living inside of it. This makes it very hard to understand what the script is actually doing, and is not very maintainable. I have broken it into smaller components which are:runFuzzTest.sh
: responsible for running the actual fuzz testfuzz_corpus_download.sh
: responsible for downloading corpus files from S3fuzz_corpus_upload.sh
: responsible for uploading improved corpus files to S3fuzz_coverage_report.sh
: responsible for generating coverage informationCall-outs:
We need to make few changes to
s2nFuzzBatchScheduled
job on CodeBuild after this PR merge.s2nFuzzBatchScheduled
tos2nFuzzScheduled
or similarbuildspec_fuzz_scheduled.yml
). We need to set the upload location on build project configuration.The file diff is a bit large, but most of it is moving existing logics into different places so hopefully it's not too difficult to read.
Testing:
Running the new scheduled fuzz test with shorter runtime using this PR: Link to CodeBuild job
Overriding regular "per PR" fuzz test using this PR: Link to CodeBuild job
Finished setting up a new scheduled job for uploading coverage report. Coverage report is now available at: https://dx1inn44oyl7n.cloudfront.net/fuzz-coverage-report/index.html
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.