This repository has been archived by the owner on Feb 27, 2023. It is now read-only.

Gimbal doesn't remove OpenStack endpoint from service with health check setting #208

Open
yutaokaz opened this issue Aug 3, 2018 · 6 comments
Labels: discoverer, kind/question (Categorizes an issue as a user question.)

Comments

@yutaokaz
Contributor

yutaokaz commented Aug 3, 2018

What steps did you take and what happened:

  1. I configured a route consisting of 1 OpenStack service (with 2 endpoints) and a health check in the ingressroute.yaml file, and deployed it.
  2. I started sending continuous requests to the defined route and got responses from both endpoints, as expected.
  3. I removed 1 OpenStack VM from LBaaS with the neutron lbaas-member-delete command.
  4. However, I still get responses from both endpoints.
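
For anyone reproducing steps 2 and 4, one way to see how many endpoints are answering is to tally the responses by backend. This is only a rough sketch, assuming each backend VM returns a body that identifies it (for example its hostname); `fetch_bodies`, `tally`, and the sample values are hypothetical names for illustration and are not part of Gimbal or Contour.

```python
from collections import Counter
from urllib.request import urlopen


def fetch_bodies(url, count):
    """Send `count` GET requests to `url` and return the response bodies.

    Assumes (hypothetically) that each backend identifies itself in the body.
    """
    bodies = []
    for _ in range(count):
        with urlopen(url, timeout=5) as resp:
            bodies.append(resp.read().decode().strip())
    return bodies


def tally(bodies):
    """Count how many responses came from each backend identifier."""
    return dict(Counter(bodies))


# Sample data standing in for real responses (hypothetical identifiers).
sample = ["vm-a", "vm-b", "vm-a", "vm-b", "vm-a"]
counts = tally(sample)
print(counts)       # {'vm-a': 3, 'vm-b': 2}
print(len(counts))  # 2 distinct endpoints are still responding
```

Against a live route, `tally(fetch_bodies(url, 100))` run before and after step 3 would show whether the number of distinct responders drops from 2 to 1.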

What did you expect to happen:

I expected to get responses only from the 1 remaining endpoint.

Environment:

  • Backend Cluster versions (OpenStack and/or Kubernetes):
    OpenStack. In terms of Kubernetes, I haven't checked.
  • Gimbal version (Gimbal git repository tag):
    https://github.com/heptio/gimbal/tree/release-0.3.0-beta.2
  • Contour version:
    gcr.io/heptio-images/contour:v0.6.0-beta.2
  • openstack-discoverer version (if applicable):
    gcr.io/heptio-images/gimbal-discoverer:v0.3.0-beta.2
@alexbrand
Contributor

Thank you @yutaokaz. Do you recall what reconciliation period was configured in the OpenStack discoverer? Did the OpenStack discoverer run a reconciliation after you deleted the LBaaS member?

My initial thinking around this issue is that the Endpoints object in the Gimbal cluster still had the IP addresses of the OpenStack VMs. Given that you did not destroy the VMs, but instead removed them from the load balancer, the VMs are still reachable for traffic.

We should expect the removed endpoint to stop responding if either a) the OpenStack discoverer reconciles the OpenStack state with the Gimbal state, or b) the VM itself is destroyed, causing the health check to fail.
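
To make case a) concrete, reconciliation can be thought of as a set difference between the members the backend reports and the addresses currently held in the Gimbal cluster's Endpoints object. The following is a simplified sketch of that idea, not the discoverer's actual code; the `reconcile` function and the IP addresses are hypothetical.

```python
def reconcile(backend_ips, gimbal_ips):
    """Return (to_add, to_remove) so the Gimbal-side Endpoints match the backend.

    backend_ips: addresses currently reported by the backend (e.g. LBaaS members)
    gimbal_ips:  addresses currently listed in the Gimbal cluster's Endpoints
    """
    backend = set(backend_ips)
    current = set(gimbal_ips)
    to_add = sorted(backend - current)      # present in the backend, missing in Gimbal
    to_remove = sorted(current - backend)   # stale in Gimbal, gone from the backend
    return to_add, to_remove


# Hypothetical addresses: one member was removed from the pool,
# but Gimbal still lists both.
to_add, to_remove = reconcile(["10.0.0.1"], ["10.0.0.1", "10.0.0.2"])
print(to_add, to_remove)  # [] ['10.0.0.2']
```

Until a reconciliation pass like this runs, the removed member's address remains in the Endpoints object, and because the VM is still up it keeps answering requests.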

@alexbrand alexbrand added kind/question Categorizes an issue as a user question. discoverer labels Aug 3, 2018
@yutaokaz
Contributor Author

yutaokaz commented Aug 3, 2018

Hi @alexbrand, thank you for your reply.
Yes, I confirmed that reconciliation ran: both kubectl get endpoints and the contour cli eds command return only the 1 remaining endpoint. In addition, when the service has no health check setting, this issue does not occur.
I also found that regenerating (deleting and recreating) the ingressroute, or applying it with the health check lines commented out, stops requests from going to the removed VM.
Were you able to reproduce this issue?

@alexbrand
Contributor

Thanks for the additional information @yutaokaz. I will try to reproduce this on our side.

@alexbrand
Contributor

@yutaokaz This seems to be an issue with Envoy, so I have opened projectcontour/contour#603.

@rosskukulinski rosskukulinski added this to the v0.4.0 milestone Aug 21, 2018
@rosskukulinski
Contributor

Contour will fix this in 0.7.0, moving to Gimbal 0.4.0 milestone.

@rosskukulinski
Contributor

Punted to Contour 0.9
