We have sufficiently decoupled the agent from Envoy; the only thing it still does is restart Envoy on certificate changes, which will also go away when SDS kicks in. Closing this issue for the moment.
@andraxylia commented on Fri Jul 28 2017
Right now, proxy and agent are bundled in the same container. Their lifecycle cannot be controlled natively by k8s; the agent controls the proxy process, and it's flaky.
The proxy should use the same container whether it runs as ingress, sidecar, or middle proxy.
This will also simplify the build.
@kyessenov commented on Fri Jul 28 2017
k8s cannot coordinate envoy restarts (passing TCP sockets between restarts)
Ingress controller is separated from proxy. Discovery service implements k8s ingress controller.
Done already.
@andraxylia commented on Fri Jul 28 2017
Does the pilot agent still trigger Envoy restarts even after LDS? In what scenarios?
@andraxylia commented on Fri Jul 28 2017
Ingress controller is not separate from the proxy, it runs in the same container. Does ingress controller need to restart the envoy process also? If yes, in what scenarios?
@kyessenov commented on Fri Jul 28 2017
I feel that there is some confusion here.
Yes, pilot agent is triggering restarts since LDS does not cover TLS contexts (yet). We can't do TLS cert rotation without it at the moment. Lyft uses a similar set-up running a little agent next to the proxy.
Ingress controller is separate from the proxy. Discovery service computes the ingress routes and then sends them to the proxy. Again, TLS changes require restarts.
If we never need to restart Envoy then there is simply no need for the agent. If we do need to restart Envoy then it has to be in the same container to coordinate socket exchange.
Not sure what the point of this issue is.
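The restart-on-cert-change flow described in the comment above can be sketched roughly as follows. This is only an illustration, not the actual pilot agent code: `fingerprint` and `RestartPlanner` are hypothetical names, and the paths are made up. The only Envoy specifics assumed are the `-c` and `--restart-epoch` command-line options.

```python
import hashlib
from pathlib import Path


def fingerprint(paths):
    """Return a digest over the contents of the given files, in sorted path order."""
    digest = hashlib.sha256()
    for path in sorted(str(p) for p in paths):
        digest.update(path.encode())
        digest.update(b"\0")
        digest.update(Path(path).read_bytes())
    return digest.hexdigest()


class RestartPlanner:
    """Hypothetical helper: decides when a hot restart is needed and with which epoch."""

    def __init__(self, config_path, cert_paths):
        self.config_path = config_path
        self.cert_paths = list(cert_paths)
        self.epoch = 0
        self.seen = fingerprint(self.cert_paths)

    def command(self):
        # Envoy reads its config via -c and the hot-restart generation via --restart-epoch.
        return ["envoy", "-c", self.config_path, "--restart-epoch", str(self.epoch)]

    def poll(self):
        """Return the command for the next epoch if the certs changed, else None."""
        current = fingerprint(self.cert_paths)
        if current == self.seen:
            return None
        self.seen = current
        self.epoch += 1
        return self.command()
```

An agent built along these lines would call `poll()` periodically and spawn the returned command next to the running proxy. The new process has to take over the listening sockets from the old one, which is why the comment above says the agent and Envoy need to share a container.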
@andraxylia commented on Fri Jul 28 2017
I thought we no longer needed to restart Envoy. If that becomes true in the future, let's keep this issue open and revisit it for the 0.3 release.
Envoy restart for TLS changes has been brought up here:
envoyproxy/envoy#95
@PiotrSikora @myidpt @wattli is there any chance Envoy will support TLS certificate rotation without requiring a restart?
@andraxylia commented on Fri Jul 28 2017
Also here:
envoyproxy/envoy#1194
@andraxylia commented on Fri Jul 28 2017
So the ingress controller is bundled with the pilot discovery?
@rshriram commented on Sun Jul 30 2017
The point is there will always be parts of Envoy config that cannot be loaded through any DS service. We have decided to do this explicitly in Envoy for global settings, for the sake of stability and simplicity of code base.
And to be able to change these without disruption, we need hot restarts and thus an in-container agent.
@kyessenov keep me honest here..
I don't think our agent has been flaky. Numerous people have run the demos where tasks involved reloads (fault filters for example). With the introduction of LDS we have eliminated 95% of frequent restarts which might have caused some race conditions in Envoy in the past. These race conditions occurred at listener level iirc, and LDS circumvents these issues completely.
What remains to be seen is whether we encounter instability for the global configs that need infrequent hot restarts.
--
~shriram
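The split described in the comment above, between config that a discovery service can push and global config that still needs a hot restart, could be sketched like this. The section names and the `plan_update` helper are hypothetical and do not reflect Envoy's actual config schema. The TLS context is treated as restart-only because the thread says LDS does not cover it yet.

```python
# Assumption: illustrative section names, not Envoy's real schema.
DYNAMIC_SECTIONS = frozenset({"listeners", "routes", "clusters"})


def plan_update(old, new, dynamic_sections=DYNAMIC_SECTIONS):
    """Split changed top-level config sections by how they would be applied."""
    changed = {key for key in old.keys() | new.keys() if old.get(key) != new.get(key)}
    return {
        # Sections a discovery service can deliver to the running proxy.
        "push_via_discovery": sorted(changed & dynamic_sections),
        # Everything else changed: would need an agent-driven hot restart.
        "hot_restart": sorted(changed - dynamic_sections),
    }
```

With this shape, an update that only touches listeners yields an empty `hot_restart` list, while a TLS context change still lands in it.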
@andraxylia commented on Sun Jul 30 2017
I do not believe in the "always" in the above sentence. Let's just wait for input on that from the people working on Envoy.
And the restart has been flaky, pilot e2e routing tests fail randomly when run independently. It's good LDS will solve 95% of the problem, but even with LDS I see random failures. I am looking for an architecture where we solve the problem completely. If it is possible, let's do it.
In early February I asked Kuat about these frequent restarts, why the communication between agent and Envoy is done via config, and, in general, why this is so complicated. The answer I got was "this is what it is, all proxies work this way, there is nothing we can do". We were able to do xDS a short time after. So I am not buying the "always" unless there is a technical impossibility or a solid reason for not being able to do it.