
test: add scaled timer integration test #14290

Merged: 2 commits into envoyproxy:master on Dec 8, 2020

Conversation

@akonradi (Contributor) commented Dec 4, 2020:

Commit Message: Add an integration test for the HTTP connection idle scaled timeout
Additional Description:
Add an integration test that verifies that when the resource pressure
attached to the reduce_timeouts overload action changes, it affects
streams that were created previously.

This also lays the groundwork for integration tests covering other scaled
timeouts.

Risk Level: low
Testing: ran new test
Docs Changes: n/a
Release Notes: n/a
Platform Specific Features: n/a
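
To make the verified behavior concrete, here is a rough sketch of the test flow in the style of Envoy's HTTP integration tests. Helper names such as updateResource(), the scale_percent gauge name, the simulated-time call, and the timeout values are assumptions for illustration, not necessarily the exact code merged in this PR.

```cpp
// Rough sketch, not the merged test. It illustrates the property under test:
// raising overload pressure scales down idle timeouts for streams that were
// created *before* the pressure changed.
TEST_P(OverloadScaledTimerIntegrationTest, IdleTimeoutScalesForExistingStreams) {
  initialize();

  // Open a downstream connection and start a request while there is no
  // overload pressure, so the stream's idle timer starts at full duration.
  codec_client_ = makeHttpConnection(lookupPort("http"));
  auto response = codec_client_->makeHeaderOnlyRequest(default_request_headers_);

  // Raise the pressure on the injected resource attached to the
  // envoy.overload_actions.reduce_timeouts action (helper name assumed).
  updateResource(0.9);
  test_server_->waitForGaugeGe(
      "overload.envoy.overload_actions.reduce_timeouts.scale_percent", 1);

  // The pre-existing stream should now use the scaled-down timeout: advancing
  // simulated time past the reduced value (but well short of the configured
  // full timeout) is enough for Envoy to time the connection out.
  timeSystem().advanceTimeWait(std::chrono::seconds(2));
  codec_client_->waitForDisconnect();
}
```

The key point matching the description above is that the connection and stream are created before the pressure is updated, so the test exercises rescheduling of already-armed timers rather than timers created under overload.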

akonradi added 2 commits:

Set up only the actions relevant to each test. This will help ensure
that each test is only verifying the expected behavior.

Signed-off-by: Alex Konradi <[email protected]>

Add an integration test that verifies that when the resource pressure
attached to the reduce_timeouts overload action changes, it affects
streams that were created previously.

This also sets up for creating integration tests for other scaled
timeouts.

Signed-off-by: Alex Konradi <[email protected]>
A review thread on the new test code:

EXPECT_TRUE(fake_upstream_connection_->waitForNewStream(*dispatcher_, upstream_request_));
ASSERT_TRUE(upstream_request_->waitForHeadersComplete());
ASSERT_TRUE(upstream_request_->waitForData(*dispatcher_, 10));
upstream_request_->encodeHeaders(Http::TestResponseHeaderMapImpl{{":status", "202"}}, true);
Contributor:
Seems like the meaning of this test changed, the request is now getting to the upstream.

Contributor Author (@akonradi):
That was intentional. This test was checking two overload behaviors at once: first that connections would not be accepted, then that when they were, new requests would be rejected. This seemed like a good way to disentangle those two.

A second review thread, on this part of the test:

// For HTTP1, Envoy will start draining but will wait to close the
// connection. If a new stream comes in, it will set the connection header
// to "close" on the response and close the connection after.
auto response = codec_client_->makeRequestWithBody(request_headers, 10);
Contributor:
This seems strange. downstream_cx_idle_timeout was incremented but the downstream connection is still open and accepting requests?

How long does Envoy keep the downstream connection in the draining state?

Contributor Author (@akonradi):
There's a configurable drain timeout, applied here.

Contributor:
Seems like drain timeout defaults to 5 seconds. I'm not sure if it is possible to disable. Should we consider scaling the drain timeout?

Contributor Author (@akonradi):
It's so small that I'm not sure it's necessary.

Contributor:
It's relative. If we allow the downstream connection timeout to scale to 0 secs, 5 second drain timeout may seem like an eternity.

Contributor Author (@akonradi):
Fair enough. It would be easy enough to add. Do you have a strong preference?

Contributor:
Let's add a comment about it to the issue for scaled timeouts. We can decide later what to do about it.
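
For reference on the drain-timeout point above: the 5-second default comes from the HTTP connection manager's drain_timeout field. A hedged sketch of how an integration test could shrink it with the usual config_helper_ modifier follows; the lambda signature matches the common HttpIntegrationTest pattern and the chosen value is an assumption, not something this PR does.

```cpp
// Sketch: shrink the HTTP connection manager's drain_timeout (default 5s) so
// that a drained HTTP/1 downstream connection is closed quickly in a test.
config_helper_.addConfigModifier(
    [](envoy::extensions::filters::network::http_connection_manager::v3::
           HttpConnectionManager& hcm) {
      hcm.mutable_drain_timeout()->set_seconds(1);
    });
```

Whether the drain timeout itself should be scaled by the overload manager was left to the scaled-timeouts tracking issue, as agreed above.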

@ggreenway merged commit 7fe3a89 into envoyproxy:master on Dec 8, 2020.
@antoniovicente (Contributor):

TSAN failure that looks related to this PR, found while running tests for #14282:

https://dev.azure.com/cncf/4684fb3d-0389-4e0b-8251-221942316e06/_apis/build/builds/60163/logs/145

WARNING: ThreadSanitizer: data race (pid=15)
Write of size 1 at 0x7b5c00027290 by main thread:
__tsan_mutex_destroy /home/brian/src/final/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interface_ann.cpp:472:3 (overload_integration_test+0x2d6100a)
absl::Mutex::~Mutex() /proc/self/cwd/external/com_google_absl/absl/synchronization/mutex.cc:727:3 (overload_integration_test+0x853dd82)
Envoy::FakeStream::~FakeStream() /proc/self/cwd/./test/integration/fake_upstream.h:51:7 (overload_integration_test+0x3072cc2)
Envoy::FakeStream::~FakeStream() /proc/self/cwd/./test/integration/fake_upstream.h:51:7 (overload_integration_test+0x3074946)
Envoy::FakeStream::~FakeStream() /proc/self/cwd/./test/integration/fake_upstream.h:51:7 (overload_integration_test+0x307498f)
std::__1::default_delete<Envoy::FakeStream>::operator()(Envoy::FakeStream*) const /opt/llvm/bin/../include/c++/v1/memory:2363:5 (overload_integration_test+0x2f05036)
std::__1::unique_ptr<Envoy::FakeStream, std::__1::default_delete<Envoy::FakeStream> >::reset(Envoy::FakeStream*) /opt/llvm/bin/../include/c++/v1/memory:2618:7 (overload_integration_test+0x2db4930)
Envoy::OverloadScaledTimerIntegrationTest_CloseIdleHttpConnections_Test::TestBody() /proc/self/cwd/test/integration/overload_integration_test.cc:317:15 (overload_integration_test+0x2daf2c0)
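
For readers unfamiliar with this failure mode: the race is between the test's main thread destroying the FakeStream (and the absl::Mutex it owns) and the fake upstream's dispatcher thread still touching that stream. A hedged illustration of the usual teardown ordering Envoy integration tests use to avoid this is sketched below; it is not necessarily what the follow-up fix does.

```cpp
// Sketch of the usual teardown ordering in Envoy HTTP integration tests:
// fully close the fake upstream connection (so its dispatcher thread can no
// longer reference upstream_request_) before destroying the FakeStream that
// the test owns.
if (fake_upstream_connection_ != nullptr) {
  ASSERT_TRUE(fake_upstream_connection_->close());
  ASSERT_TRUE(fake_upstream_connection_->waitForDisconnect());
}
upstream_request_.reset();  // now safe: no other thread references the stream
```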

@akonradi (Contributor Author) commented Dec 8, 2020:

Sorry, that should be fixed by #14327

mpuncel added a commit to mpuncel/envoy that referenced this pull request on Dec 8, 2020, with the message:
* master: (41 commits)
  event: Remove a source of non-determinism by always running deferred deletion before post callbacks (envoyproxy#14293)
  Fix TSAN bug in integration test (envoyproxy#14327)
  tracing: Add hostname to Zipkin config.  (envoyproxy#14186) (envoyproxy#14187)
  [conn_pool] fix use after free in H/1 connection pool (envoyproxy#14220)
  lua: update deprecated lua_open to luaL_newstate (envoyproxy#14297)
  extension: use bool_flag to control extension link (envoyproxy#14240)
  stats: Factor out creation of cluster-stats StatNames from creation of the stats, to save CPU during xDS updates (envoyproxy#14028)
  test: add scaled timer integration test (envoyproxy#14290)
  [Win32 Signals] Add term and ctrl-c signal handlers (envoyproxy#13954)
  config: v2 transport API fatal-by-default. (envoyproxy#14223)
  matcher: fix UB bug caused by dereferencing a bad optional (envoyproxy#14271)
  test: putting fake upstream config in a struct (envoyproxy#14266)
  wasm: use Bazel rules from Proxy-Wasm Rust SDK. (envoyproxy#14292)
  docs: fix typo (envoyproxy#14237)
  dependencies: allowlist CVE-2018-21270 to prevent false positives. (envoyproxy#14294)
  typo in redis doc (envoyproxy#14248)
  access_loggers: removed redundant dep (envoyproxy#14274)
  fix http2 flaky test (envoyproxy#14261)
  test: disable flaky xds_integration_test. (envoyproxy#14287)
  http: add functionality to configure kill header in KillRequest proto (envoyproxy#14288)
  ...

Signed-off-by: Michael Puncel <[email protected]>