merge from upstream #6

eric846 · 2020-06-28T02:25:08Z

No description provided.

When buildifier or buildozer can't be found, these instructions are printed: ``` ERROR: Command /usr/local/bin/buildifier not found. If you have buildifier installed, but the binary name is different or it's not available in $GOPATH/bin, please use BUILDIFIER_BIN environment variable to specify the path. Example: export BUILDIFIER_BIN=`which buildifier` If you don't have buildifier installed, you can install it by: go get -u github.com/bazelbuild/buildtools/buildifier ERROR: Command /usr/local/bin/buildozer not found. If you have buildozer installed, but the binary name is different or it's not available in $GOPATH/bin, please use BUILDOZER_BIN environment variable to specify the path. Example: export BUILDOZER_BIN=`which buildozer` If you don't have buildozer installed, you can install it by: go get -u github.com/bazelbuild/buildtools/buildozer ``` After doing the `go get` and `export`, we still get the same message. It turns out do_ci.sh unconditionally overwrites BUILDIFIER_BIN and BUILDOZER_BIN. We have been using this workaround: ``` ln -s $HOME/go/bin/buildifier /usr/local/bin/buildifier ln -s $HOME/go/bin/buildozer /usr/local/bin/buildozer ```

Signed-off-by: Otto van der Schaaf <[email protected]>

Signed-off-by: Yan Avlasov <[email protected]>

…338) If `echo_request_headers` option is set to `true`, then request headers dump is appended to the response body.

Signed-off-by: Otto van der Schaaf <[email protected]>

Avoid excepting in scenarios with many workers: - Log an error instead of throwing - Tweaks kMinimalWorkerDelay so that the problem may in practice resolve: allow more headroom as configured concurrency increases. Intends to fix #339 Signed-off-by: Otto van der Schaaf <[email protected]>

- Cap the number of decimals when rendering the percentiles column data. - Add a more detailed test for fixating output of the human readable output formatter, so we'll catch unintended changes better next time. Fixes #352 Signed-off-by: Otto van der Schaaf <[email protected]>

Signed-off-by: Otto van der Schaaf <[email protected]>

This PR has changes to deflake integration tests for sanitizer runs. - Allow servers more time to start up - Slow down some tests to make them less sensitive to timing issues - It is possible for some counter values to slightly exceed a configured termination threshold for them in edge cases. Update expectations to match that. Fixes #332

In preparation of sharing functionality for signal handling, extract what we have right now into a `SignalHandler` class. Preparation for #280 Signed-off-by: Otto van der Schaaf <[email protected]>

Adds `--no-duration`, which will be treated as a request to infinitely continue load generation. Other default predicates will still exist, so\ execution may still stop when connection failures or error response codes are observed. This leaves some small tech debt: - We need test coverage, but that's much easier to add once #280 or #344 goes in Fixes #333 Signed-off-by: Otto van der Schaaf <[email protected]>

This makes the gRPC service log more then one ExecutionStream requests. The log level used will be controlled by the options associated to ExecutionStream requests (possible candidate for improvement later on). This change also incorporates a change to avoid creating multiple instances of Envoy::ProcessWide in the Nighthawk gRPC service flow. This doesn't seem to address any know issues, but it seems to be more along the way Envoy::ProcessWide was intended to be used. Fixes #289 Signed-off-by: Otto van der Schaaf <[email protected]>

Fixes #364 Signed-off-by: Otto van der Schaaf <[email protected]>

Following some digging into stability of our coverage testing, creating this PR for testing CI with some fixes / enhancements, as well as covering a few more lines in our CLI tooling exception paths. There seems to be something racy going on where sometimes the coverage originating from our python integration tests is not counted. Eventually I learned that things might stabilise when we switch the toolchain to clang 10. We follow Envoy's lead on that, so for now stopping further efforts on this. ### Description of the changes in this PR: - Increase timeouts of ASAN/TSAN runs - Make `envoy_build_sha.sh`, which apparently broke in one of the Envoy updates - Cover some of the outer exception handling paths for Nighthawk's CLIs - Integration test for nighthawk_output_transform - Add coverage for code lines where it seemed trivial to do so (e.g. NullStatistic) - Fix `run_nighthawk_bazel_coverage.sh` to work on my machine - Stricter bash options in some scripts + changes to make that work ### Prerequisites - [x] Needs #357 to go in first. --------- Leaving some notes/learnings here: - there seems to be something racy going on where sometimes the coverage originating from our python integration tests is not counted. This is relatively infrequent in CI. - On my dev machine this structurally happens. This might be a toolchain/version issue. - On my dev machine, using the docker flow, I have difficulties: the docker container exits when running the full test suite, without leaving a trace why. bash traps in the container don't fire, no useful logging from docker. Eventually I was able to get past bazel fetching its deps, and then build just the python integration tests. As long as a single test is involved, things seem to work / the container gets to finish. It's remarkable that CI doesn't have this problem. Coverage structurally looks as expected in this scenario. - The link step is super memory hungry. Separating that step might be a ci perf opportunity. Coverage is the slowest CI job we have.

) Fixes #372 Signed-off-by: Otto van der Schaaf <[email protected]>

Fix benchmark_http_client_test.cc which has a bug in function testBasicFunctionality(). For example, if we have a test fixture with testBasicFunctionality(max_pending=0,connection_limit=15,amount_of_request=10), then EXPECT_FALSE(client_->tryStartRequest(f)) and EXPECT_EQ(max_pending == 0 ? 1 : max_pending, inflight_response_count) will fail. For test fixture with testBasicFunctionality(max_pending=2,connection_limit=3,amount_of_request=10), EXPECT_EQ(max_pending == 0 ? 1 : max_pending, inflight_response_count) will fail. Added 2 test cases to verify the expected behavior fixed by this change. Tested: Unit test.

Notable changes in this Update: - Changes related to header construction - CI moves from clang-9 to clang-10 - Changes related to satisfy the new clang-tidy version: - Fix superfluous initializers in the rate limiter code. - Suppress modernize-avoid-bind to get sequencer_test.cc to pass checks. (clang-tidy's auto generated suggested fix fails to build) - Move to use typed configs in `AddFilter()` in the server filter tests. This avoids asserts being triggered by the integration test framework as deprecated fields for which support was dropped get generated otherwise. - Drop the coverage threshold by 0.1% Hopefully addresses #362 by producing more reliable and accurate coverage reports. Signed-off-by: Otto van der Schaaf <[email protected]>

Teach the CLI to handle SIGTERM/SIGINT, and handle those as a request to gracefully cancel execution. Partially resolves #280

…384) In #366 we had to lower the coverage threshold as that PR didn't come with the means to easily cover testing with infinite excecution durations. Now that we have signal handling in place, we can easily pull this off. Fixes #370 Signed-off-by: Otto van der Schaaf <[email protected]>

This PR has changes that make it easy to consume our integration testing as a benchmarking tool framework for external repositories, allowing those to write custom tests for running in CI. This probably needs to be split up into multiple PR's, but this allows reviewers to have an early peek at what such PR's would lead up to. The `README.md` in here has more context, but the primary goals are te be able to: - run the tests against arbitrary Envoy revisions - persist profile dumps, flamegraphs, and latency numbers per test - run the nighthawk tools via docker - offer stock tests, but also allow scaffolding consumer-specific tests Example of a fully dockerized flow (more example in `README.md`): ``` # This script runs the dockerized benchmarking framework, which in # turn will pull Nighthawk and Envoy in via docker. set -eo pipefail set +x set -u # The benchmark logs and artifacts will be dropped here OUTDIR="/home/oschaaf/code/envoy-perf-vscode/nighthawk/benchmarks/tmp/" # Used to map the test that we want to see executed into the docker container TEST_DIR="/home/oschaaf/code/envoy-perf-vscode/nighthawk/benchmarks/test/" # Rebuild the docker in case something changed. ./docker_build.sh && docker run -it --rm \ -v "/var/run/docker.sock:/var/run/docker.sock:rw" \ -v "${OUTDIR}:${OUTDIR}:rw" \ -v "${TEST_DIR}:/usr/local/bin/benchmarks/benchmarks.runfiles/nighthawk/benchmarks/external_tests/" \ --network=host \ --env NH_DOCKER_IMAGE="envoyproxy/nighthawk-dev:latest" \ --env ENVOY_DOCKER_IMAGE_TO_TEST="envoyproxy/envoy-dev:f61b096f6a2dd3a9c74b9a9369a6ea398dbe1f0f" \ --env TMPDIR="${OUTDIR}" \ oschaaf/benchmark-dev:latest ./benchmarks --log-cli-level=info -vvvv ``` Part of #245

eric846 and others added 20 commits June 3, 2020 12:49

Update Envoy to ff40a02a838d123f4bf018858c61c983e1c5cce9 (#343)

2a6583e

Signed-off-by: Otto van der Schaaf <[email protected]>

Add explict dependecy on absl strings (#350)

1ad7768

Signed-off-by: Yan Avlasov <[email protected]>

Add echo_request_headers to the test server filter ResponseOptions. (#…

adfc004

…338) If `echo_request_headers` option is set to `true`, then request headers dump is appended to the response body.

Update Envoy to 8fd1a52e608f7223b4864ca913cc3ed82340991d (#353)

de5f1fd

Signed-off-by: Otto van der Schaaf <[email protected]>

Update Envoy to b612236bcd8ef3cd14d904c21348888a6755d5d6 (#359)

2da37e8

Signed-off-by: Otto van der Schaaf <[email protected]>

Extract SignalHandler class for re-use (#282)

a921fc9

In preparation of sharing functionality for signal handling, extract what we have right now into a `SignalHandler` class. Preparation for #280 Signed-off-by: Otto van der Schaaf <[email protected]>

Avoid log warning for tls_context if the user didn't set it (#365)

1f1634c

Fixes #364 Signed-off-by: Otto van der Schaaf <[email protected]>

TerminationPredicateTest: explicitly order std::move/get reference (#373

a599690

) Fixes #372 Signed-off-by: Otto van der Schaaf <[email protected]>

CLI - cancel execution upon signal reception (#367)

3fe461c

Teach the CLI to handle SIGTERM/SIGINT, and handle those as a request to gracefully cancel execution. Partially resolves #280

eric846 merged commit 5ac755a into eric846:master Jun 28, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

merge from upstream #6

merge from upstream #6

eric846 commented Jun 28, 2020

merge from upstream #6

merge from upstream #6

Conversation

eric846 commented Jun 28, 2020