listener: performance degradation when exact balance used with original dst #15146
Comments
IMHO the rebalance should be applied to the second listener. In most cases, the first listener doesn't own any connections.
I also expect rebalance to be applied to the second listener. But (correct me if I am wrong) it seems rebalance can only happen at the first listener, see comments here. If …
@caitong93 It won't work under the current code. It is a reasonable scenario to apply the rebalance at the second listener. @mattklein123 I can change it if you agree.
Yeah, I agree we probably need to specially handle this case where, for forwarded connections, we do the rebalance at that point and not initially.
Plan to fix this along with #15126
Will the fix mean that users have to configure exact_balance on the first catch-all listener 0.0.0.0:15001, or do users have to configure it on the next listener 0.0.0.0:9080 (in case the upstream service/cluster is at 9080)? Related to istio/istio#18152, where @hobbytp tried to apply exact_balance on the second listener. From an end-user perspective, anybody digging into this setting is doing performance tuning for high-throughput, low-latency environments, and I would assume they expect to tune this setting only once, instead of once for every target cluster handled by a separate second-in-line 0.0.0.0:<svc_port> listener. Or do you foresee that users should have the ability to configure this per second-in-line-listener/upstream-service pair?
Sorry for the late reply. Yeah, you can use exact_balance on the 9080 listener and not use a balancer on the 15001 listener.
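For illustration, here is a minimal sketch of that placement in a static Envoy listener config. Only `connection_balance_config` with `exact_balance` is the setting under discussion; the listener names, cluster names, and the catch-all listener's original-destination handoff details are assumptions made for the example and will differ in a real sidecar deployment:

```yaml
static_resources:
  listeners:
  # Catch-all listener receiving the iptables-redirected traffic; connections
  # are handed off to the listener matching the original destination.
  # Note: no connection_balance_config here.
  - name: virtual_outbound            # hypothetical name
    address:
      socket_address: { address: 0.0.0.0, port_value: 15001 }
    listener_filters:
    - name: envoy.filters.listener.original_dst
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.listener.original_dst.v3.OriginalDst
    # ... remaining handoff/passthrough config as in the existing sidecar setup
  # Per-service listener: exact balancing is configured only on this one.
  - name: service_9080                # hypothetical name
    address:
      socket_address: { address: 0.0.0.0, port_value: 9080 }
    connection_balance_config:
      exact_balance: {}
    filter_chains:
    - filters:
      - name: envoy.filters.network.tcp_proxy
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
          stat_prefix: service_9080
          cluster: service_9080_cluster   # hypothetical cluster
```

With this layout the balancing decision is made at the listener that will actually own the connection, rather than at the catch-all listener that hands it off immediately.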
…15842) If listener1 redirects the connection to listener2, the balancer field in listener2 decides whether to rebalance. Previously we relied on rebalancing at listener1; however, that rebalance is weak because listener1 is likely to not own any connections, making the rebalance a no-op. Risk Level: MID. Rebalance may introduce latency. Users need to clear the balancer field of listener2 to recover the original behavior. Fix #15146 #16113 Signed-off-by: Yuchen Dai <[email protected]>
Title: listener: performance degradation when exact balance used with original dst
Description:
We use Envoy as a sidecar: all outbound traffic is first redirected to 127.0.0.1:15001 (Envoy) by iptables, then forwarded to different listeners by original dst. When `exact balance` is enabled (on all listeners), we found the connection balance became worse. Tested with 32 downstream connections and 2 workers, the connection distribution between the two handlers is always 1:31 (sometimes 2:30), as observed via `downstream_cx_active`.

When a new connection is received, it is handled by `ExactConnectionBalancer` first, which increases `numConnections()` of the selected handler by one. The connection is then forwarded to a new listener in `ConnectionHandlerImpl::ActiveTcpSocket::newConnection()`, and that gauge is decreased immediately. If connections arrive quickly, there is a high chance that the first handler is always selected, since `numConnections()` of all handlers is zero.
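To make the reported setup concrete, here is a sketch of the ineffective placement described above, with the balancer on the catch-all listener even though it hands every connection off almost immediately. The listener name and the elided handoff details are assumptions for illustration:

```yaml
# Listener fragment: exact_balance on the catch-all listener (the setup
# reported above). The per-handler connection count this balancer consults
# drops back to zero as soon as the connection is handed off to the matching
# listener, so the same handler keeps winning under fast connection arrival.
- name: virtual_outbound              # hypothetical name
  address:
    socket_address: { address: 0.0.0.0, port_value: 15001 }
  connection_balance_config:
    exact_balance: {}
  # ... original-destination handoff and filter chains as in the sidecar setup
```

This is why the thread above recommends configuring `exact_balance` on the target (second) listener instead, which is also the behavior the fix in #15842 makes authoritative.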