Drain nodes when they terminate #7119
Comments
I would personally suggest starting with the easiest approach, which looks like the termination script, and iterating on it over time.
Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale
/remove-lifecycle stale
I did a detailed write-up on setting up LCH / SQS / node-drainer here: https://mofoseng.github.io/posts/eks-migration/#node-maintenance-autoscaling-and-lifecyclehooks
I compared the kube-aws drainer (which has a single-replica deployment updating a ConfigMap, and a DaemonSet on each node grepping the ConfigMap...) to a pure AWS Lambda based approach, and this SQS approach seemed the most robust. It's not kops specific, but it may help adoption. (I moved company, and now I'm back on kops after migrating away from it at my last company.)
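The moving parts of that SQS approach, as a rough bash sketch (the queue URL, node naming, and drain flags are illustrative assumptions; the linked write-up has the real details):

```bash
# Sketch of an SQS-driven drainer loop. A real implementation must also
# filter out autoscaling:TEST_NOTIFICATION messages and handle errors.
QUEUE_URL="https://sqs.us-east-1.amazonaws.com/123456789012/node-drainer"  # example

while true; do
  # Long-poll the queue for ASG lifecycle hook notifications.
  MSG=$(aws sqs receive-message --queue-url "$QUEUE_URL" \
        --wait-time-seconds 20 --query 'Messages[0]' --output json)
  if [ -z "$MSG" ] || [ "$MSG" = "null" ]; then continue; fi

  BODY=$(echo "$MSG" | jq -r '.Body')
  INSTANCE_ID=$(echo "$BODY" | jq -r '.EC2InstanceId')

  # Map the EC2 instance id to a node name (kops nodes usually register
  # under their private DNS name).
  NODE=$(aws ec2 describe-instances --instance-ids "$INSTANCE_ID" \
         --query 'Reservations[0].Instances[0].PrivateDnsName' --output text)

  kubectl drain "$NODE" --ignore-daemonsets --force --timeout=300s

  # Tell the ASG it may finish terminating the instance.
  aws autoscaling complete-lifecycle-action \
    --auto-scaling-group-name "$(echo "$BODY" | jq -r '.AutoScalingGroupName')" \
    --lifecycle-hook-name "$(echo "$BODY" | jq -r '.LifecycleHookName')" \
    --instance-id "$INSTANCE_ID" --lifecycle-action-result CONTINUE

  aws sqs delete-message --queue-url "$QUEUE_URL" \
    --receipt-handle "$(echo "$MSG" | jq -r '.ReceiptHandle')"
done
```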
FYI, we use a similar workflow, but instead of a ConfigMap we set a label on the node to mark it as draining... Also looking at moving to a single deployment to have fewer pods flying around.
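Sketched out, that label-based variant could look something like this (the label key and the split of responsibilities are illustrative guesses, not the actual implementation):

```bash
# Controller side: mark a node as draining (label name is hypothetical).
kubectl label node "$NODE" draining=true --overwrite

# Per-node agent side: poll our own node object and drain once marked.
while [ "$(kubectl get node "$NODE" -o jsonpath='{.metadata.labels.draining}')" != "true" ]; do
  sleep 10
done
kubectl drain "$NODE" --ignore-daemonsets --force
```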
@grosser - how did you manage LCH creation for kops currently? Do you generate TF and write config around that? Did you fork kops to add that functionality?
We are not doing that for kops yet, just for our old clusters that we create with CloudFormation; still in the process of figuring that out :(
Sorry for the off-topic: I'm also trying to taint spot instances in a mixedInstancesPolicy. For this I see two approaches (without modifying kops): tainting at bootstrap, before the kubelet registers (option 1), or tainting from a controller after nodes join (option 2).
The difference is that option 1 would ensure taints are set before node registration, while option 2 would only take effect at some later point in time (maybe after some workloads which should not tolerate the taint have already started running on the tainted node...). With the EKS bootstrap it is quite simple to ensure nodes are labeled/tainted properly before they register with the API (and kops only supports labels/taints at the instanceGroup level, not at the per-node level).
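For what it's worth, option 1 can be sketched in userdata roughly like this (the taint key is made up, and how the extra args reach the kubelet depends entirely on the bootstrap in use):

```bash
# Option 1 sketch: decide the taint at boot, before the kubelet registers.
LIFECYCLE=$(curl -sf http://169.254.169.254/latest/meta-data/instance-life-cycle)
if [ "$LIFECYCLE" = "spot" ]; then
  # --register-with-taints applies the taint atomically at registration,
  # so no workload can ever land on the node untainted.
  KUBELET_EXTRA_ARGS="--register-with-taints=spot=true:NoSchedule"
fi
```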
For now we implemented option 2: https://github.com/compareasiagroup/kube-node-lifecycle-labeller
Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale
/remove-lifecycle stale
Another way of handling node draining on termination, taken from Zalando's kubernetes-on-aws: https://github.com/zalando-incubator/kubernetes-on-aws/blob/449f8f3bf5c60e0d319be538460ff91266337abc/cluster/userdata-worker.yaml#L92-L120
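Roughly, that approach boils down to a unit whose ExecStop drains the node, so a clean shutdown blocks on the drain. A minimal sketch (unit name, kubeconfig path, and drain flags here are assumptions, not Zalando's exact unit):

```bash
# Install a drain-on-shutdown unit (all paths and names are illustrative).
cat <<'EOF' >/etc/systemd/system/node-drain.service
[Unit]
Description=Drain this node before shutdown
After=kubelet.service

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/true
# ExecStop runs at shutdown; %H expands to the hostname, which on kops
# nodes is usually also the node name.
ExecStop=/usr/local/bin/kubectl --kubeconfig /var/lib/node-drain/kubeconfig \
  drain %H --ignore-daemonsets --force --timeout=120s

[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload && systemctl enable --now node-drain.service
```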
I have just implemented node drain via systemd. The systemd unit gets provisioned via kops hooks. The kubeconfig is written to disk by a DaemonSet. Works like a charm. Thanks @thomaspeitz for your support.
@kforsthoevel, that sounds very interesting. Do you mind sharing the implementation details?
@paalkr I will write a little blog post about it and let you know.
The kops hook I saw by following links was a good proof of concept, but it had the problem that it assumed the container runtime was Docker.
@paalkr Here is the blog post: https://tech.ivx.com/how-to-drain-nodes-before-they-get-terminated-by-aws
Thx!
Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale
Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle rotten
/lifecycle frozen
Related Cluster API issues:
And things we may want to look at:
I've chatted with @olemarkus on Slack about adding aws-node-termination-handler (NTH) as an add-on for kops. NTH operates in two different modes: IMDS Processor (the old way) and Queue Processor. I think the queue-processor mode would be a good fit for kops since it's more resilient and can respond to more types of events (spot ITNs, spot rebalance recommendations, EC2 instance status changes, ASG termination lifecycle hooks, and more coming). NTH queue-processor mode works by listening to an SQS queue that is fed with events from Amazon EventBridge. kops can do a lot of the heavy lifting by setting up the Amazon EventBridge rules appropriately and creating an SQS queue.
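For illustration, the plumbing to provision looks roughly like this (queue and rule names are examples; the IAM and queue access policies are omitted):

```bash
# Create the queue NTH will poll. A queue policy allowing
# events.amazonaws.com to SendMessage is also required (omitted here).
QUEUE_URL=$(aws sqs create-queue --queue-name nth-events \
            --query QueueUrl --output text)
QUEUE_ARN=$(aws sqs get-queue-attributes --queue-url "$QUEUE_URL" \
            --attribute-names QueueArn \
            --query Attributes.QueueArn --output text)

# One EventBridge rule per event type; spot interruption warnings shown
# here, with similar rules for rebalance recommendations, instance state
# changes, and ASG lifecycle hook notifications.
aws events put-rule --name nth-spot-itn \
  --event-pattern '{"source":["aws.ec2"],"detail-type":["EC2 Spot Instance Interruption Warning"]}'
aws events put-targets --rule nth-spot-itn --targets "Id=1,Arn=$QUEUE_ARN"
```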
Sounds good. I have used CloudFormation to create the EventBridge rules and SQS queue for now, but integrating this more closely into kops would be a nice addition.
BTW, I ended up using the
So the first step would be to add support for provisioning SQS + EventBridge rules, then have cloudup provision those if NTH is enabled. There are a number of changes to the template that are needed too. I think it is probably best to just have two separate templates and pick one based on which mode NTH is set to use. I would run the NTH deployment on masters and use the master IAM role for authenticating to SQS.
Also, an ASG lifecycle hook needs to be added to each ASG, so that nodes stay in the ASG until NTH has finished draining the node and can send an ASG lifecycle continue signal. And the behavior of NTH needs to be coordinated with the kops rolling-update cluster command. Which component is responsible for sending the ASG lifecycle continue signal during a rolling update, NTH or kops? Both kops and NTH will also try to drain nodes during rolling updates.
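To make the hook concrete, a sketch of what would be registered per ASG (names, timeout, and instance id are examples):

```bash
# A termination lifecycle hook that holds each instance in
# Terminating:Wait until the drain is done.
aws autoscaling put-lifecycle-hook \
  --auto-scaling-group-name nodes.example.k8s.local \
  --lifecycle-hook-name drain-on-terminate \
  --lifecycle-transition autoscaling:EC2_INSTANCE_TERMINATING \
  --heartbeat-timeout 300 \
  --default-result CONTINUE

# Whichever component finishes the drain (NTH or kops) then releases it:
aws autoscaling complete-lifecycle-action \
  --auto-scaling-group-name nodes.example.k8s.local \
  --lifecycle-hook-name drain-on-terminate \
  --instance-id i-0123456789abcdef0 \
  --lifecycle-action-result CONTINUE
```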
I think kops should just let NTH do drain and terminate. I can see a problem when using (…). But we can implement some of this without the ASG lifecycle hooks too, and adding things incrementally may be a benefit here. It looks like you both have a much better overview of how to approach this than I do. I can definitely help out with how to add the various bits. Would either of you be able to try your hand at a PR for (parts of) this?
I can try to get something out. I'm not too familiar with the kops codebase, so I'll need some help as well. I'll be on vacation for much of December, so I probably can't do anything soon, but hopefully we can hammer out a plan and make sure this is going to work nicely. How do the other cloud providers handle draining? I was going to say that if NTH is the termination-handling component, then the kops controller should just let NTH do the draining, but I'm not sure that's really possible, since the kops controller would still need to finish the drain on the other providers. It's probably not the end of the world if both drain, since the eviction API calls are idempotent. Whoever completes the ASG lifecycle hook first would win, but it's kind of awkward.
Sounds good! The code that drains a node can be found here: https://github.com/kubernetes/kops/blob/master/pkg/instancegroups/instancegroups.go#L328
One bit in the code linked above that you need to take into consideration is that it deletes the k8s node object. That bit also needs to be skipped; I assume NTH also does that on its own. For the AWS provisioning piece, see https://github.com/kubernetes/kops/tree/master/pkg/model
Unfortunately I'm not a developer, but I can contribute with testing and discussing design and implementation specs in general.
Since draining is idempotent, it is not necessary to disable the drain code in rolling update. It is advantageous to do the bulk of the draining from rolling update, as that makes what is going on more visible in the rolling-update logs.
@bwagner5, have you had any time to look into this yet?
👋 Hey @olemarkus, I'm going to take this over from @bwagner5 as he's caught up with some other work for now. If I have questions, where's the best place to ask them? Here, Slack, maybe an incomplete/draft PR?
Hey @haugenj. Any way you like. I can imagine Slack is easiest in the beginning, and then a draft PR once you have a rough implementation.
Hi @haugenj. Have you had the chance to look into this yet? Anything I can help out with?
@olemarkus yeah, I've got the SQS provisioning done, but I'm still working on the EventBridge rules. Once I've got those done I'll open a draft PR, hopefully by the end of this week 🤞
Seems like #10995 resolved this issue. Can we close this?
yeah thx, that looks like it solves it :)
When the ASG scales down, the node should be drained and not just terminated (which kills pods without respecting PodDisruptionBudgets).
This could be done either via Amazon EC2 Auto Scaling lifecycle hooks (up to 60 min) or with a termination script like https://github.com/miglen/aws/blob/master/ec2/run-script-on-ec2-instance-termination.md (2 min max, so it will be tight).
The flow could be Scale-Down -> SQS -> Drainer, or Scale-Down -> SQS -> Node status -> https://github.com/planetlabs/draino
We might be able to contribute this, but need some "yes, that's a good idea" / "yes, we want this" first :)
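To make the termination-script option concrete, a rough sketch (the kubeconfig path, drain flags, and timings are assumptions; the script also needs credentials that are allowed to drain):

```bash
#!/usr/bin/env bash
# Poll instance metadata and drain ourselves when termination is signalled.
NODE=$(hostname)
while true; do
  # The spot interruption notice appears ~2 minutes before termination;
  # an ASG lifecycle hook would give considerably more headroom.
  if curl -sf http://169.254.169.254/latest/meta-data/spot/instance-action >/dev/null; then
    kubectl --kubeconfig /etc/kubernetes/drainer.kubeconfig drain "$NODE" \
      --ignore-daemonsets --force --grace-period=30 --timeout=110s
    break
  fi
  sleep 5
done
```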