
Flannel should respond to k8s node delete events by cleaning up lease #954

Closed

davidmccormick opened this issue Feb 20, 2018 · 1 comment

@davidmccormick
Expected Behavior

When a node is purposely removed/deleted from a k8s cluster, we want its network lease to be freed immediately rather than having to wait for the lease to expire. We run a number of large clusters with a reduced pod network (/17), which means we run out of leases when we roll/upgrade them. Keeping leases around for ephemeral nodes that are never going to return is a waste when we need faster lease recycling.

Current Behavior

Tested on Kubernetes 1.8.4 with flannel 0.9.1. When I run `kubectl delete node ABC`, ABC is removed from Kubernetes, but checking the etcd contents shows the lease remains behind.

Possible Solution

Does Kubernetes pass a delete event through, or could flannel watch for node delete events the way an operator does?
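The suggestion above amounts to an event handler: when flannel (or a companion controller) observes a Kubernetes node delete event, it revokes that node's subnet lease immediately instead of letting it sit out the 24h TTL. A minimal stdlib-only Go sketch of that cleanup logic, under the assumption that leases can be looked up by node name (the real implementation would use a client-go node informer and delete the key under `/coreos.com/network/subnets` in etcd; `leaseStore` and `handleNodeDelete` are hypothetical names, not flannel APIs):

```go
package main

import "fmt"

// leaseStore models flannel's subnet leases keyed by node name.
// In reality these live under /coreos.com/network/subnets in etcd,
// with a 24h TTL on each entry.
type leaseStore map[string]string

// handleNodeDelete is a hypothetical handler that flannel could run
// on a Kubernetes node delete event: instead of waiting for the TTL
// to expire, it revokes the departed node's lease immediately.
// Returns true if a lease was found and revoked.
func handleNodeDelete(leases leaseStore, node string) bool {
	if _, ok := leases[node]; !ok {
		return false // node held no lease; nothing to clean up
	}
	delete(leases, node)
	return true
}

func main() {
	leases := leaseStore{
		"node-a": "10.244.1.0/24",
		"node-b": "10.244.2.0/24",
	}
	// Simulate `kubectl delete node node-a`: the /24 is freed at once
	// and can be handed to a replacement node during a rolling upgrade.
	fmt.Println(handleNodeDelete(leases, "node-a"))
	fmt.Println(len(leases))
}
```

The key design point is that the handler is idempotent: a repeated delete event for the same node is a no-op, which matters because Kubernetes watches can replay events.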

Steps to Reproduce (for bugs)

  1. Use `kubectl get nodes` to list the nodes.
  2. Use etcdctl (v2 API) to list the leases under `/coreos.com/network/subnets/`.
  3. Use `kubectl delete node ABC` to delete one of the nodes.
  4. Use etcdctl to list the leases again.
  5. Check that the lease for the deleted node has been removed.

Context

Rolling out updates to large clusters, or to clusters with few leases left, is problematic: nodes are terminated and replacements are spun up, but flannel runs out of leases to give out, so the new nodes are unusable while a number of leases remain blocked until their 24h expiration times out.

Your Environment

  • Flannel version: 0.9.1
  • Backend used: vxlan
  • Etcd version: v3.2.10
  • Kubernetes version (if used): 1.8.4.coreos
  • Operating System and version: CoreOS (beta branch)
  • Link to your project (optional):
mumoshu pushed a commit to kubernetes-retired/kube-aws that referenced this issue Apr 6, 2018
Deploy Calico and Flannel networking as a daemonset with Kubernetes API as the backing store.

Removes the need for nodes to connect to etcd and frees up node podCIDR leases faster, addressing the cluster roll issue: flannel-io/flannel#954.

This is an experimental feature, disabled by default.
Kubernetes controllers become responsible for allocating node CIDRs.
Switch between Calico+Flannel (Canal) or Flannel.

Fast roll out into existing clusters with minimal disruption.

Optional calico Typha service for easing load on apiservers in large clusters.

Resolves #909
@stale

stale bot commented Jan 26, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Jan 26, 2023
@stale stale bot closed this as completed Feb 16, 2023