Please describe your use case / problem.
When an Ambassador pod is unable to communicate with the Kubernetes API to monitor service changes, it remains healthy and in service even though its configuration may be stale or out of date. After some time it should report itself as unhealthy so that some form of self-healing can happen (restarting, becoming unready for a while, etc.). If a mapping is added or changed in the meantime, the faulty Ambassador pod will not pick it up, which can cause an outage.
Log from an Ambassador pod that is unable to communicate with the Kubernetes API yet remains in service:
2018-10-02 09:44:36 kubewatch 0.35.2 ERROR: could not watch for Kubernetes service changes
Traceback (most recent call last):
File "/ambassador/kubewatch.py", line 517, in main
watch_loop(restarter)
File "/ambassador/kubewatch.py", line 418, in watch_loop
for evt in watched:
File "/usr/lib/python3.6/site-packages/kubernetes/watch/watch.py", line 122, in stream
resp = func(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/kubernetes/client/apis/core_v1_api.py", line 14358, in list_service_for_all_namespaces
(data) = self.list_service_for_all_namespaces_with_http_info(**kwargs)
File "/usr/lib/python3.6/site-packages/kubernetes/client/apis/core_v1_api.py", line 14455, in list_service_for_all_namespaces_with_http_info
collection_formats=collection_formats)
File "/usr/lib/python3.6/site-packages/kubernetes/client/api_client.py", line 321, in call_api
_return_http_data_only, collection_formats, _preload_content, _request_timeout)
File "/usr/lib/python3.6/site-packages/kubernetes/client/api_client.py", line 155, in __call_api
_request_timeout=_request_timeout)
File "/usr/lib/python3.6/site-packages/kubernetes/client/api_client.py", line 342, in request
headers=headers)
File "/usr/lib/python3.6/site-packages/kubernetes/client/rest.py", line 231, in GET
query_params=query_params)
File "/usr/lib/python3.6/site-packages/kubernetes/client/rest.py", line 222, in request
raise ApiException(http_resp=r)
kubernetes.client.rest.ApiException: (500)
Reason: Internal Server Error
HTTP response headers: HTTPHeaderDict({'Content-Type': 'application/json', 'Date': 'Tue, 02 Oct 2018 09:44:36 GMT', 'Content-Length': '186'})
HTTP response body: b'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"resourceVersion: Invalid value: \\"None\\": strconv.ParseUint: parsing \\"None\\": invalid syntax","code":500}\n'
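For context, kubewatch's watch loop sits on top of the official kubernetes Python client, as the traceback shows. Below is a minimal sketch (my own illustration, not Ambassador's actual code) of a watch loop on that same client that survives an ApiException like the one above by restarting the watch instead of dying; the handle() stand-in and the 5-second backoff are assumptions.

# Minimal sketch, not Ambassador's kubewatch: restart the watch on ApiException
# instead of letting a single 500 (e.g. a bad resourceVersion) kill the loop.
import time
from kubernetes import client, config, watch
from kubernetes.client.rest import ApiException

config.load_incluster_config()          # kubewatch runs inside the cluster

v1 = client.CoreV1Api()


def handle(event):
    # Hypothetical stand-in for kubewatch's real work (rebuilding Envoy config).
    print(event["type"], event["object"].metadata.name)


while True:
    w = watch.Watch()
    try:
        # Omitting resource_version lets the API server choose a fresh one;
        # sending the literal string "None" is what yields the 500 shown above.
        for event in w.stream(v1.list_service_for_all_namespaces):
            handle(event)
    except ApiException as exc:
        print(f"watch failed ({exc.status}), retrying in 5s")
        time.sleep(5)                   # assumed backoff before resuming the watch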
Describe the solution you'd like
I would like the liveness or readiness probe of the Ambassador deployment to report unhealthy when the pod can't communicate with the Kubernetes API.
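One possible wiring for that, sketched under assumptions: suppose the watch loop touches a marker file on every successful pass (the path and staleness threshold below are hypothetical, not anything Ambassador ships). An exec liveness or readiness probe on the Deployment could then run a small script like this, which exits non-zero once the marker goes stale:

#!/usr/bin/env python3
# Hypothetical exec-probe script: fails when the (assumed) marker file that the
# watch loop refreshes has gone stale, so the pod is marked unhealthy/not ready.
import os
import sys
import time

MARKER = "/tmp/ambassador-kube-watch-ok"   # assumed path, touched by the watch loop
MAX_AGE_SECONDS = 120                      # assumed staleness threshold

try:
    age = time.time() - os.path.getmtime(MARKER)
except OSError:
    sys.exit(1)                            # marker never written: report unhealthy

sys.exit(0 if age < MAX_AGE_SECONDS else 1)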
Describe alternatives you've considered
The Ambassador pod could also treat this as a fatal error and simply exit, causing the pod to be restarted.
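A sketch of that alternative, again built on the kubernetes client from the traceback; the failure budget and the run_watch_loop() stand-in are illustrative only:

# Sketch of the "fail fast" alternative: after a few consecutive ApiExceptions,
# exit non-zero so the kubelet restarts the container with a clean state.
import sys
import time
from kubernetes.client.rest import ApiException

MAX_CONSECUTIVE_FAILURES = 5          # illustrative budget, not an Ambassador setting


def run_watch_loop():
    # Hypothetical stand-in for the kubewatch watch loop; assumed to raise
    # ApiException when the Kubernetes API becomes unreachable or errors out.
    ...


failures = 0
while True:
    try:
        run_watch_loop()
        failures = 0                  # a successful pass resets the failure count
    except ApiException as exc:
        failures += 1
        if failures >= MAX_CONSECUTIVE_FAILURES:
            print(f"giving up after {failures} consecutive API errors: {exc.reason}")
            sys.exit(1)               # non-zero exit -> Kubernetes restarts the pod
        time.sleep(5)                 # brief backoff before retrying in-process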
jvosantos changed the title from "Ambassador should report being unhealthy when unable to assure its configuration is not stale." to "Ambassador should report being unhealthy when unable to ensure its configuration is not stale." on Oct 5, 2018.
I'm seeing these errors constantly on GKE, and it's certainly troubling. I'm not sure why Ambassador is making a request that results in a 500 error in the first place. We're on Ambassador 0.40.2 and Kubernetes master version 1.11.3-gke.23.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.