-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
honour routes timeout if max stream duration is set with out stream duration #15585
Conversation
Signed-off-by: Rama Chavali <[email protected]>
@mattklein123 @markdroth PTAL |
Signed-off-by: Rama Chavali <[email protected]>
@adamsjob FYI |
Assigning over to @alyssawilk who I think reviewed the other timeout PRs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching and fixing this!
source/common/router/router.cc
Outdated
@@ -156,6 +156,13 @@ FilterUtility::finalTimeout(const RouteEntry& route, Http::RequestHeaderMap& req | |||
} else { | |||
timeout.global_timeout_ = route.timeout(); | |||
} | |||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we switched timeouts, we moved the instantiation of the timeout to
ConnectionManagerImpl::ActiveStream::refreshDurationTimeout()
Mind moving this logic there?
Signed-off-by: Rama Chavali <[email protected]>
Signed-off-by: Rama Chavali <[email protected]>
Signed-off-by: Rama Chavali <[email protected]>
// Test timeout when route timeout needs to be used if max_stream_duration is not specified. | ||
// Regression test for https://github.com/envoyproxy/envoy/issues/15530. | ||
// TODO(ramaraochavali): move this test to DurationTimeout test. | ||
TEST_F(HttpConnectionManagerImplTest, DurationTimeoutFromRouteTimeout) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to create a separate test for this because clusterInfo
is repeatedly called and it keeps triggering the clearRouteCache so the timer gets fired multiple times. The same problem exists for https://github.com/envoyproxy/envoy/blob/main/test/common/http/conn_manager_impl_test.cc#L2173 where timer is repeatedly enabled. Having two such tests in the same test is not working. So I made it as separate test and TODO to look at the issue later.
@alyssawilk moved to |
Signed-off-by: Rama Chavali <[email protected]>
Signed-off-by: Rama Chavali <[email protected]>
Alyssa knows this code a lot better than I do, so I'll let her review. |
Signed-off-by: Rama Chavali <[email protected]>
@alyssawilk can you PTAL when you get chance? Also the failure seems not related to this change |
// If we are using new timeouts and MaxStreamDuration is not set, use the route's timeout | ||
// value. This is the case when route's MaxStreamDuration is specified without inner | ||
// MaxStreamDuration i.e. only grpc_timeout_header_max is set. | ||
if (!route->maxStreamDuration()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so if grpc_timeout_header_max is set when do we use it? usingNewTimeouts would be true and we'd ignore all the code below, no?
Can you please fix and regression test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, Yeah.. Missed that. Will add a test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alyssawilk Can you PTAL at the fix to see if this is correct? One thing I found was this needs changes to most of the test cases because now we will be initiating request timer most of the times. So I just want to ensure this is the right direction before I make changes the tests(some of which I have already done - please take a look at the changes)
Signed-off-by: Rama Chavali <[email protected]>
std::chrono::milliseconds timeout; | ||
bool disable_timer = false; | ||
auto grpc_timeout = Grpc::Common::getGrpcTimeout(*request_headers_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Offhand I recall debate around the original PR about how the new fields should be applied and it was the consensus from the gRPC folks that the difference wasn't "is this a gRPC request or not" but "is the gRPC timeout header present or not" so I think switching to this is against the intent of the folks who added this to the API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. This is good to know. I will change it accordingly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome. It's super complicated and confusing, so thanks for continuing to plug away at it!
Signed-off-by: Rama Chavali <[email protected]>
@alyssawilk Sorry for the delay. Fixed. PTAL. I have added a regression test for this and left a TODO to merge to DurationTimeout test later as I am having some issues with mock timer (even for existing tests also this problem exists). |
Signed-off-by: Rama Chavali <[email protected]>
Signed-off-by: Rama Chavali <[email protected]>
Signed-off-by: Rama Chavali <[email protected]>
Signed-off-by: Rama Chavali <[email protected]>
Signed-off-by: Rama Chavali <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix to timeout config handling.
@alyssawilk Can you PTAL when you get chance? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah sorry, I thought you meant to address the test TODO in this PR. I'm fine with it being a follow-up!
@alyssawilk @antoniovicente one thing I realized is when timeout happens with new timeouts we call https://github.com/envoyproxy/envoy/blob/main/source/common/http/conn_manager_impl.cc#L762 and that sets http status to 408 (RequestTimeout) instead of gateway timeout (504) that router sets. Is this expected ? or do we need to change it to return 504 for timeouts? if it is expected - should we document it some where? |
Adds a runtime flag for disabling the behavior introduced in envoyproxy#15585 Signed-off-by: Snow Pettersen <[email protected]>
…stream duration (envoyproxy#15585)" This reverts commit ffa1680. Signed-off-by: Snow Pettersen <[email protected]>
…stream duration (#15585)" (#16133) This reverts commit ffa1680. Signed-off-by: Snow Pettersen <[email protected]>
FYI this broke web sockets for us (by forcing them to terminate earlier), wondering if we can contribute more integration tests here to catch these sort of regressions. |
…stream duration (#15585)" (#16289) This reverts commit ffa1680. Signed-off-by: Dmitri Dolguikh <[email protected]>
…stream duration (envoyproxy#15585)" (envoyproxy#16133) This reverts commit ffa1680. Signed-off-by: Snow Pettersen <[email protected]> Signed-off-by: Gokul Nair <[email protected]>
Commit Message: honour routes timeout if maxstream duration is set with out specifying max stream duration inside it.
Additional Description:
Risk Level: Low
Testing: Added
Docs Changes: N/A
Release Notes: N/A (There is already a bug fix mentioned for MaxStreamDuration in current.rst)
Platform Specific Features: N/A
Fixes #15530