locksmith: CoreOS autoupdate & Kubernetes node drain (klocksmith) #1274
Comments
Hey Mark! I have been wanting to write a design doc on this. Here is a first draft: https://docs.google.com/document/d/1DHiB2UDBYRU6QSa2e9mCNla1qBivZDqYjBVn_DvzDWc/edit#
Hi, thanks for the quick response!
Also, on Loop 1, step 2: did you mean to tag no more than N nodes with the ok-to-reboot tag? Mark PS: if you are still in Berlin, maybe we can have a quick chat?
@skinny Happy to chat. I am in Berlin until Saturday.
@skinny Still interested in working on this?
+1 for this!
Hi, this idea is very cool: locksmith would tell Kubernetes to evacuate containers before updating. Something like: CoreOS tells etcd "I want to restart"; etcd says "OK, hold on, I will inform k8s"; etcd tells k8s "hey, node 8 wants to restart, mark it as unschedulable and rolling-update/restart its containers"; k8s answers "Yes my master, done"; etcd tells CoreOS "restart". CoreOS then reports that it has come back to the cluster, and etcd unmarks the node so it is schedulable again.
This would help not only CoreOS with locksmith but also other distros to schedule updates on their infrastructure (VPS and bare metal) and decrease downtime for services.
We should probably not try and use the existing locksmith codebase and instead call this "klocksmith" or something. The deployment method (containers), backend (kubernetes), etc are all completely different here. |
@philips I read through your doc and it looks like a great idea. I'll throw this thought out here just in case; forgive me if I'm missing part of the picture. What about just modifying locksmith itself to support preStop hooks? It could optionally run a command or httpGet a URL, blocking the reboot signal until that command or URL returns. The command could obviously then be anybody's custom anything, and for the case of K8s, the command could be a simple bash script which runs `kubectl drain`.
Adding preStop hooks seems like a simple/quick solution to the problem at hand? |
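For illustration, here is a minimal Go sketch of that preStop-hook idea, assuming the hook simply shells out to `kubectl drain` and blocks the reboot until the drain succeeds. The hook wiring, the `NODE_NAME` environment variable, and the chosen flags are illustrative assumptions; locksmith does not actually expose this interface.

```go
// Hypothetical sketch of a pre-reboot hook: block the reboot until an
// operator-supplied command (here, `kubectl drain`) has finished.
package main

import (
	"fmt"
	"log"
	"os"
	"os/exec"
	"time"
)

// runPreRebootHook drains the node and blocks until the drain succeeds or
// kubectl gives up, mirroring the "block the reboot signal" behaviour above.
func runPreRebootHook(nodeName string, timeout time.Duration) error {
	cmd := exec.Command("kubectl", "drain", nodeName,
		"--ignore-daemonsets", "--delete-local-data", "--force",
		fmt.Sprintf("--timeout=%s", timeout))
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd.Run()
}

func main() {
	node := os.Getenv("NODE_NAME") // e.g. injected via the downward API (assumed)
	if err := runPreRebootHook(node, 5*time.Minute); err != nil {
		log.Fatalf("pre-reboot hook failed, holding the reboot: %v", err)
	}
	log.Println("node drained, reboot may proceed")
}
```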
I would also like to be able to prevent a node from rebooting if a ReplicaSet or StatefulSet is not running the desired number of replicas. This is to prevent downtime for an application (like some data stores) that requires a minimum number of nodes to be running.

The scenario goes like this: let's say that you're running something like Elasticsearch in a StatefulSet, and one of the StatefulSet's pods experiences a fatal event, like database or disk corruption or free-space exhaustion, and fails a liveness probe. It's broken and won't come up without manual intervention. The StatefulSet is now running with fewer than the desired number of replicas. If locksmith were to initiate a reboot on a node running a pod from this StatefulSet, it could compromise application/cluster availability. We should be able to prevent node reboots when there is a compromised ReplicaSet or StatefulSet. Maybe there's a way to do this already? I don't know, but this seems an appropriate place to mention it. We're working around this very same situation with Cassandra running under Fleet (obviously less than ideal).
+1, looks great. Currently our cluster reboots entirely way too fast, with no time for the applications to become available again. As this currently happens during nighttime it is not that big of a problem, but it could be better. @chrissnell for compromised ReplicaSets and StatefulSets, a PodDisruptionBudget would be the indicator. Can someone comment on the current status?
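As a rough illustration of the PodDisruptionBudget suggestion, the sketch below builds a minimal budget that keeps at least two pods of a hypothetical `app=elasticsearch` StatefulSet available, so a voluntary disruption such as a drain-before-reboot would be refused while the set is already degraded. The name, labels, and minAvailable value are assumptions, not anything from this thread.

```go
// Minimal PodDisruptionBudget sketch: require at least two matching pods to
// stay available during voluntary disruptions (drains, evictions).
package main

import (
	"encoding/json"
	"fmt"

	policyv1 "k8s.io/api/policy/v1" // policy/v1beta1 on older clusters
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

func main() {
	minAvailable := intstr.FromInt(2)
	pdb := policyv1.PodDisruptionBudget{
		TypeMeta:   metav1.TypeMeta{APIVersion: "policy/v1", Kind: "PodDisruptionBudget"},
		ObjectMeta: metav1.ObjectMeta{Name: "elasticsearch-pdb"},
		Spec: policyv1.PodDisruptionBudgetSpec{
			MinAvailable: &minAvailable,
			Selector: &metav1.LabelSelector{
				MatchLabels: map[string]string{"app": "elasticsearch"},
			},
		},
	}
	out, _ := json.MarshalIndent(pdb, "", "  ")
	fmt.Println(string(out)) // apply the resulting manifest with kubectl
}
```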
We are working on a kubernetes-aware version of locksmith (lovingly called "klocksmith"). The plan is to deploy this component (consisting of a daemon set and controller) onto the cluster and allow that to manage the reboots. We don't have anything to announce just yet, but we are getting close.
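For a feel of how such an agent/controller pair could coordinate (a sketch under assumptions, not necessarily the operator's actual implementation), the Go snippet below has the per-node agent request a reboot by annotating its Node object; a controller would watch these annotations, drain the node, and grant permission to one node at a time. The annotation keys are placeholders.

```go
// Rough sketch of the agent/controller handshake: the per-node agent (daemon
// set) marks its node as needing a reboot; the controller grants permission.
package main

import (
	"context"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

const (
	rebootNeededAnnotation = "example.com/reboot-needed" // placeholder key, set by the agent
	rebootOKAnnotation     = "example.com/reboot-ok"     // placeholder key, set by the controller
)

// requestReboot is what the agent would call once update_engine reports that
// a new OS image has been staged.
func requestReboot(ctx context.Context, client kubernetes.Interface, nodeName string) error {
	node, err := client.CoreV1().Nodes().Get(ctx, nodeName, metav1.GetOptions{})
	if err != nil {
		return err
	}
	if node.Annotations == nil {
		node.Annotations = map[string]string{}
	}
	node.Annotations[rebootNeededAnnotation] = "true"
	_, err = client.CoreV1().Nodes().Update(ctx, node, metav1.UpdateOptions{})
	return err
}

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)
	if err := requestReboot(context.Background(), client, "node-1"); err != nil {
		log.Fatal(err)
	}
	// The controller watches for reboot-needed annotations, drains the node,
	// sets reboot-ok on at most one node at a time, and the agent then reboots.
}
```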
For those following along, we released https://github.com/coreos/container-linux-update-operator which replaces Locksmith in Kubernetes clusters.
The Container Linux Update Operator (the new name for "klocksmith") is now deployed by default on Tectonic clusters. It should also function just as well on regular Kubernetes clusters. For specific enhancements related to it, please open additional issues here, against Tectonic, or against Kubernetes as appropriate.
@chrissnell Yep, for example, plain-old Kubernetes clusters like the Matchbox bootkube-install example cluster (non-Tectonic) now use the Container Linux Update Operator too: https://github.com/coreos/matchbox/blob/master/Documentation/cluster-addons.md
When running certain multi-pod applications (a Redis cluster in our case), during a CoreOS update run (installing & rebooting every node) the majority or all of the pods (3 in our example) sometimes end up on one physical machine. When that machine is rebooted, the Redis cluster is lost and requires (for now) manual intervention to get back up.
I learned that Kubernetes 1.2 introduced node-drain functionality; this would be a great feature to use before rebooting a Kubernetes-enabled CoreOS node.
Are there any plans on implementing this kind of behaviour (relocating all the pods before a reboot), or does anyone know another way of avoiding this kind of scenario?
Mark
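For the scenario above (all Redis pods landing on one physical machine), one possible mitigation independent of reboot coordination is pod anti-affinity, so replicas are required to land on different nodes. The sketch below builds such an affinity block; the `app=redis` label is an assumption, and this is not a complete pod spec.

```go
// Pod anti-affinity sketch: require that no two pods labelled app=redis share
// a node, so a single machine reboot cannot take out the whole Redis cluster.
package main

import (
	"encoding/json"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	affinity := corev1.Affinity{
		PodAntiAffinity: &corev1.PodAntiAffinity{
			RequiredDuringSchedulingIgnoredDuringExecution: []corev1.PodAffinityTerm{{
				LabelSelector: &metav1.LabelSelector{
					MatchLabels: map[string]string{"app": "redis"},
				},
				TopologyKey: "kubernetes.io/hostname",
			}},
		},
	}
	out, _ := json.MarshalIndent(affinity, "", "  ")
	fmt.Println(string(out)) // merge into the pod template's spec.affinity
}
```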