Configuration updates stop on Azure Kubernetes Service (AKS) #1039
Comments
+1 I was just about to write a bug report myself. I have the exact same issue in the exact same configuration (v0.40.2 and AKS with Kubernetes 1.11.5). I even built the Ambassador Docker image myself and added a few extra log messages in kubewatch.py. It appears that the events from the Kubernetes API server don't reach kubewatch. It can't be a permission issue, because it works for a few minutes after redeploying Ambassador.
We've seen this on v1.11.13, v1.11.14, and 1.9.11, in both RBAC and non-RBAC mode. This appears to be an issue with clusters deployed more recently, i.e., clusters deployed in September do not have this issue. We're pinging AKS engineering on this. If others on this thread can open up AKS support tickets on this issue that would be helpful. This issue is easily reproducible on AKS, and does not seem to exist on other hosted Kubernetes providers.
This Slack bot that watches the kube-apiserver does not appear to have any issue receiving events: https://github.com/bitnami-labs/kubewatch As someone above reported, Ambassador's kubewatch appears to stop receiving events on Azure after a couple of minutes, with no errors.
@richarddli Did you check if all of those clusters are running on the Moby engine? Azure/acs-engine#3896 The following will output the Docker engine version: Per here, Moby went GA on all new node deployments on December 5th. I opened a support case with Microsoft on the issue.
@HoveringHalibut Interesting find. I just checked the other environments I've run Ambassador on to see the Docker version they're running.
Steps to reproduce:
I've run into this issue too. Deleting all Ambassador pods refreshes Ambassador's routing table when they are recreated. It worked, then stopped updating after a few minutes.
Just a quick update. It's not Moby, but working with the Azure engineering team we believe we are zeroing in on the root cause. We hope to provide a more detailed update soon.
The underlying cause of this issue is that Ambassador talks to the kube-apiserver through a series of proxies. It seems that at some point, one of these proxies drops the connection with the python-client that Ambassador uses. The fix we are evaluating takes advantage of the mutating webhook admissions controller feature AKS recently implemented to bypass this series of proxies with the go-client. We will provide more details as progress is made.
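For anyone who wants to see the general shape of a client-side workaround for a watch connection that an intermediate proxy drops, here is a minimal sketch. It is not the fix being evaluated above and not Ambassador's actual kubewatch code. `watch_with_reconnect`, `open_stream`, and `handle_event` are hypothetical names. The assumption is that each call to `open_stream` starts a fresh, time-bounded watch, so a dead connection shows up as a stream that ends or raises rather than one that hangs forever.

```python
import time
from typing import Callable, Iterable, Optional


def watch_with_reconnect(
    open_stream: Callable[[], Iterable[dict]],
    handle_event: Callable[[dict], None],
    max_reconnects: Optional[int] = None,
    backoff_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Consume watch events, reopening the stream whenever it ends or fails.

    open_stream is a hypothetical factory. With the Kubernetes Python client
    it might wrap watch.Watch().stream(..., timeout_seconds=N) so that every
    watch is bounded in time (an assumption, not taken from this thread).

    Returns the number of times the stream was opened.
    """
    opened = 0
    # max_reconnects=None means "keep going forever".
    while max_reconnects is None or opened <= max_reconnects:
        opened += 1
        try:
            for event in open_stream():
                handle_event(event)
        except (ConnectionError, TimeoutError):
            # A dropped connection may surface as an error; fall through
            # and reopen instead of giving up.
            pass
        # The stream ended or failed: pause briefly, then open a new one.
        sleep(backoff_seconds)
    return opened
```

The point of the design is that the loop never trusts a single long-lived connection. Whether that would be sufficient behind the AKS proxies is untested here.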
@nbkrause and @richarddli Thanks for the update and work on this issue. I'm continuing to push on my support case with this issue. A problem with a proxy timeout makes me nervous about its effect on other services with similar hooks as Ambassador.
Testing seems to have shown that #1087 fixes this issue. Targeting for |
Describe the bug
After deploying Ambassador on an AKS cluster, service configuration changes stop updating after 5-10 minutes for the Ambassador service.
To Reproduce
Expected behavior
Service configurations continue to update.
Versions (please complete the following information):