Decouple proxy from pilot agent #1393

Closed
kyessenov opened this issue Nov 3, 2017 · 1 comment
@kyessenov
Contributor

@andraxylia commented on Fri Jul 28 2017

Right now, the proxy and the agent are bundled in the same container. Their lifecycle cannot be controlled natively by k8s; instead, the agent controls the proxy process, and this is flaky.

The proxy should be the same container whether it runs as ingress, sidecar, or middle proxy.

This will also simplify the build.


@kyessenov commented on Fri Jul 28 2017

  1. k8s cannot coordinate envoy restarts (passing TCP sockets between restarts)

  2. Ingress controller is separated from proxy. Discovery service implements k8s ingress controller.

  3. Done already.


@andraxylia commented on Fri Jul 28 2017

Does the pilot agent still trigger Envoy restarts even after LDS? In what scenarios?


@andraxylia commented on Fri Jul 28 2017

The ingress controller is not separate from the proxy; it runs in the same container. Does the ingress controller also need to restart the Envoy process? If yes, in what scenarios?


@kyessenov commented on Fri Jul 28 2017

I feel that there is some confusion here.

Yes, pilot agent is triggering restarts since LDS does not cover TLS contexts (yet). We can't do TLS cert rotation without it at the moment. Lyft uses a similar set-up running a little agent next to the proxy.

Ingress controller is separate from the proxy. Discovery service computes the ingress routes and then sends them to the proxy. Again, TLS changes require restarts.

If we never need to restart Envoy then there is simply no need for the agent. If we do need to restart Envoy then it has to be in the same container to coordinate socket exchange.

Not sure what the point of this issue is.
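
The restart-on-certificate-change behaviour described in this comment can be sketched roughly as follows. This is a minimal illustration, not the pilot agent's actual code: `RestartPlanner`, `fingerprint`, and the paths used are hypothetical names. The only Envoy-specific details assumed are the `-c` and `--restart-epoch` command-line flags, where each hot restart is launched with the previous epoch plus one. Watching the files and actually spawning the process are left out.

```python
import hashlib
from pathlib import Path
from typing import Iterable, List, Optional


def fingerprint(paths: Iterable[Path]) -> str:
    """Hash the names and contents of all cert files into one digest."""
    digest = hashlib.sha256()
    for path in sorted(paths):
        digest.update(str(path).encode())
        digest.update(path.read_bytes() if path.exists() else b"<missing>")
    return digest.hexdigest()


class RestartPlanner:
    """Hypothetical helper: tracks the cert fingerprint and the restart epoch."""

    def __init__(self, envoy_binary: str, config_path: str) -> None:
        self.envoy_binary = envoy_binary
        self.config_path = config_path
        self.epoch = -1  # no Envoy process has been started yet
        self.last_fingerprint: Optional[str] = None

    def next_command(self, cert_paths: Iterable[Path]) -> Optional[List[str]]:
        """Return the command line for a new epoch, or None if certs are unchanged."""
        current = fingerprint(cert_paths)
        if current == self.last_fingerprint:
            return None  # nothing changed, so no restart is needed
        self.last_fingerprint = current
        self.epoch += 1  # each hot restart uses the next epoch number
        return [
            self.envoy_binary,
            "-c", self.config_path,
            "--restart-epoch", str(self.epoch),
        ]
```

An agent built along these lines would call `next_command` whenever the cert directory changes and start the returned command next to the running process. That is why it needs to live in the same container as the proxy.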


@andraxylia commented on Fri Jul 28 2017

I thought we no longer needed to restart Envoy. If that becomes true in the future, let's keep this issue open and revisit it for the 0.3 release.

Envoy restart for TLS changes has been brought up here:
envoyproxy/envoy#95

@PiotrSikora @myidpt @wattli is there any chance Envoy will support TLS certificate rotation without requiring a restart?


@andraxylia commented on Fri Jul 28 2017

Also here:
envoyproxy/envoy#1194


@andraxylia commented on Fri Jul 28 2017

So the ingress controller is bundled with the pilot discovery?


@rshriram commented on Sun Jul 30 2017

The point is there will always be parts of Envoy config that cannot be loaded through any DS service. We have decided to do this explicitly in Envoy for global settings, for the sake of stability and simplicity of code base.

And to be able to change these without disruption, we need hot restarts and thus an in-container agent.

@kyessenov keep me honest here..

I don't think our agent has been flaky. Numerous people have run the demos where tasks involved reloads (fault filters for example). With the introduction of LDS we have eliminated 95% of frequent restarts which might have caused some race conditions in Envoy in the past. These race conditions occurred at listener level iirc, and LDS circumvents these issues completely.

What remains to be seen is whether we encounter instability for the global configs that need infrequent hot restarts.
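
The distinction drawn in this comment, between configuration delivered by a discovery service and global settings that require a hot restart, can be sketched as a simple decision function. The section names below are hypothetical placeholders and not Envoy's real configuration schema. Which sections are actually dynamic depends on the Envoy version and on which discovery services are in use.

```python
from typing import Any, Dict, Set

# Hypothetical split: sections assumed to be pushed dynamically by a
# discovery service. Anything else is treated as a global setting.
DYNAMIC_SECTIONS: Set[str] = {"listeners", "routes", "clusters", "endpoints"}


def changed_sections(old: Dict[str, Any], new: Dict[str, Any]) -> Set[str]:
    """Return the top-level keys whose values differ between two configs."""
    return {key for key in old.keys() | new.keys() if old.get(key) != new.get(key)}


def needs_hot_restart(old: Dict[str, Any], new: Dict[str, Any]) -> bool:
    """True if any changed section falls outside the dynamically served set."""
    return bool(changed_sections(old, new) - DYNAMIC_SECTIONS)
```

Under this sketch, a listener-only change never triggers a restart, while a change to any global section does. That matches the expectation that LDS removes most restarts but leaves the infrequent ones for global configuration.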


@andraxylia commented on Sun Jul 30 2017

I do not believe in the "always" in the above sentence. Let's just wait for input on that from the people working on Envoy.
And the restart has been flaky: pilot e2e routing tests fail randomly when run independently. It's good that LDS will solve 95% of the problem, but even with LDS I see random failures. I am looking for an architecture where we solve the problem completely. If it is possible, let's do it.

In early February I asked Kuat about these frequent restarts, why the communication between agent and Envoy is done via config, and, in general, why this is so complicated. The answer I got was "this is what it is, all proxies work this way, there is nothing we can do". We were able to do xDS a short time later. So I am not buying the "always" unless there is a technical impossibility or a solid reason for not being able to do it.

@rshriram
Member

We have sufficiently decoupled the agent from Envoy, and the only thing it does is restart on cert changes, which will also go away when SDS kicks in. Closing this issue for the moment.

0x01001011 pushed a commit to thedemodrive/istio that referenced this issue Jul 16, 2020