feat: propogate local_request_timeout_ms to downstream #14639

huikang · 2022-09-15T14:33:38Z

Description

Situation: "If an application is configured with a local_request_timeout_ms value, would it be a good idea to auto-propagate this timeout to consumers/downstreams of that service. That way we don't have to configure the timeout in 2 different places. Maybe downstreams would want to timeout sooner. In that case they could specify the timeout and this would just be a way to modify the default timeout value."

The implementation of this PR is to loop over the upstream proxy services to find the smallest local_request_timeout_ms. If the value is smaller than the default, will propagate to the downstream proxy service.
- call makeInboundListener after looping all the upstream
doc is updated
unit test is added: to make the upstream service to have its own local_request_timeout_ms, a new parameter is used to configure the proxy.Config of the upstream proxy service.

Testing & Reproduction steps

Links

Address #12553

PR Checklist

updated test coverage
external facing docs updated
not a security concern

- call makeInboundListener after looping all the upstream to get the min local_request_timeout_ms - update test

mkeeler

I think it would be good for someone else to take a second look here so I have tagged the consul service mesh team.

My hesitation is that this seems like it could introduce issues with requests to upstreams that have higher timeouts than others.

Take the following scenario:

flowchart TD
   %% Basic States
   otherServices(Other Services)
   subgraph foo
      fooService(foo)
      subgraph fooSidecar [foo sidecar proxy]
         direction TB
         fooInbound(Inbound Listener)
         fooBarUpstream(Bar Upstream)
         fooBazUpstream(Baz Upstream)
      end
   end
   
   subgraph bar
      direction LR
      barService(bar)
      subgraph barSidecar [bar sidecar proxy]
         barInbound(Inbound Listener)
      end
   end
   
   subgraph baz
      direction LR
      bazService(baz)
      subgraph bazSidecar [baz sidecar proxy]
         direction LR         
         bazInbound(Inbound Listener)
      end
   end
   %% classDef default fill:#f9decd,stroke:#e06e2d,stroke-width:4px,color:black;

 %% State Transitions
   otherServices-- Timeout: 5s  -->fooInbound;
   fooInbound-->fooService;
   fooService-->fooBarUpstream;
   fooService-->fooBazUpstream;
   fooBarUpstream-->barInbound;
   fooBazUpstream-->bazInbound;
   barInbound-- Timeout: 60s -->barService
   bazInbound-- Timeout: 5s -->bazService

Here there are 2 upstreams and the min request timeout for them to their local app is 5s. I think this means that for the flow where some service makes a request to foo which needs to make a request to bar, the timeout may be too short and kill the connections too early.

Ideally we would love the diagram to look more like the following:

flowchart TD
   %% Basic States
   otherServices(Other Services)
   subgraph foo
      fooService(foo)
      subgraph fooSidecar [foo sidecar proxy]
         direction TB
         fooInbound(Inbound Listener)
         fooBarUpstream(Bar Upstream)
         fooBazUpstream(Baz Upstream)
      end
   end
   
   subgraph bar
      direction LR
      barService(bar)
      subgraph barSidecar [bar sidecar proxy]
         barInbound(Inbound Listener)
      end
   end
   
   subgraph baz
      direction LR
      bazService(baz)
      subgraph bazSidecar [baz sidecar proxy]
         direction LR         
         bazInbound(Inbound Listener)
      end
   end

 %% State Transitions
   otherServices-- Timeout: none -->fooInbound;
   fooInbound-->fooService;
   fooService-- Timeout: 60s -->fooBarUpstream;
   fooService-- Timeout: 5s -->fooBazUpstream;
   fooBarUpstream-- Timeout: 60s -->barInbound;
   fooBazUpstream-- Timeout: 5s -->bazInbound;
   barInbound-- Timeout: 60s -->barService
   bazInbound-- Timeout: 5s -->bazService

Here the timeouts are configured per-upstream so requests from the foo app of the bar service will use the bar services local app timeout.

github-actions · 2022-11-22T01:35:54Z

This pull request has been automatically flagged for inactivity because it has not been acted upon in the last 60 days. It will be closed if no new activity occurs in the next 30 days. Please feel free to re-open to resurrect the change if you feel this has happened by mistake. Thank you for your contributions.

feat: propogate local_request_timeout_ms to downstream

b0e05ee

- call makeInboundListener after looping all the upstream to get the min local_request_timeout_ms - update test

huikang requested a review from a team as a code owner September 15, 2022 14:33

github-actions bot added theme/envoy/xds Related to Envoy support type/docs Documentation needs to be created/updated/clarified labels Sep 15, 2022

huikang requested a review from mkeeler September 15, 2022 14:37

changelog

8a3a94a

huikang force-pushed the propogate-request-time-downstream branch from d08b96a to 8a3a94a Compare September 15, 2022 14:43

huikang added backport/1.13 labels Sep 15, 2022

mkeeler requested a review from a team September 22, 2022 17:53

mkeeler reviewed Sep 22, 2022

View reviewed changes

github-actions bot added the meta/stale Automatically flagged for inactivity by stalebot label Nov 22, 2022

DanStough removed the backport/1.11 label Feb 14, 2023

zalimeni closed this Jun 10, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: propogate local_request_timeout_ms to downstream #14639

feat: propogate local_request_timeout_ms to downstream #14639

huikang commented Sep 15, 2022 •

edited

Loading

mkeeler left a comment

github-actions bot commented Nov 22, 2022

feat: propogate local_request_timeout_ms to downstream #14639

feat: propogate local_request_timeout_ms to downstream #14639

Conversation

huikang commented Sep 15, 2022 • edited Loading

Description

Testing & Reproduction steps

Links

PR Checklist

mkeeler left a comment

Choose a reason for hiding this comment

github-actions bot commented Nov 22, 2022

huikang commented Sep 15, 2022 •

edited

Loading