Inhibitor support for graceful node shutdown #3648

Closed
tanvp112 opened this issue Jun 6, 2024 · 12 comments
Labels
kind/support Categorizes issue or PR as a support question.

Comments

@tanvp112

tanvp112 commented Jun 6, 2024

Hi,

I wanted to run some tests for graceful shutdown. I noticed that systemd-inhibitor is not included in the kind image. The command systemd-inhibit --list returns an error because no systemd --user is running.

Is this currently not supported or am I missing any configuration here?

Thanks.

@tanvp112 tanvp112 added the kind/support Categorizes issue or PR as a support question. label Jun 6, 2024
@stmcginnis
Contributor

I'm not sure how that would work with kind.

Graceful shutdown happens when the host is shutting down. IIRC, kubelet gets registered with systemd-inhibitor so that when shutdown is happening, systemd will wait a given period of time for any registered processes to exit before continuing with the shutdown.
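For context, a minimal sketch of the time budget involved. It assumes the two kubelet settings described in the Kubernetes documentation, shutdownGracePeriod and shutdownGracePeriodCriticalPods (neither is named in this thread), where the critical-pod period is carved out of the total. The function name is hypothetical and only illustrates the arithmetic.

```python
from datetime import timedelta


def split_shutdown_budget(
    grace_period: timedelta, critical_period: timedelta
) -> tuple[timedelta, timedelta]:
    """Return (regular_pod_budget, critical_pod_budget).

    Assumption: the critical-pod period is part of the total grace period,
    so regular pods get whatever remains after reserving it.
    """
    if critical_period > grace_period:
        raise ValueError("critical period cannot exceed the total grace period")
    return grace_period - critical_period, critical_period


regular, critical = split_shutdown_budget(
    timedelta(seconds=30), timedelta(seconds=10)
)
# prints "regular pods: 20s, critical pods: 10s"
print(
    f"regular pods: {regular.total_seconds():.0f}s, "
    f"critical pods: {critical.total_seconds():.0f}s"
)
```

The inhibitor lock only helps if the shutdown is actually delayed for at least this long, which is the part in question for kind nodes.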

Since kind nodes are containers, they do not have their own systemd process, and shutting down is done by stopping the container. So it wouldn't ever hit this condition.

Can you explain a little more about what you are trying to test here?

@tanvp112
Author

tanvp112 commented Jun 6, 2024

They do have a systemd process, though it was slimmed down considerably. I reckon systemd-inhibitor-locks could be added similarly? https://github.com/kubernetes-sigs/kind/blob/main/images/base/Dockerfile#L49

I am guessing the system shutdown event can be handled? https://github.com/kubernetes-sigs/kind/blob/main/images/base/Dockerfile#L241

For example, I have a long-running pod on a spot node that is about to be reclaimed. I want to test that the pod can stop within the time limit, including the attached volume (not the default local storage).

@stmcginnis
Contributor

Ah, sorry, I should have done a little more research before commenting. You are totally right.

It may be possible to add systemd-logind to the base image. That should pull in systemd-inhibit. I'm not sure when I can get to it, but I will try later. You could also try building your own base image with that modification, then create your own node image.

@aojea
Contributor

aojea commented Jun 11, 2024

I don't know if kind is the right project to test node shutdown. @BenTheElder you are much more knowledgeable in this area, PTAL

@tanvp112
Author

Graceful or non-graceful shutdown of a node can be simulated with docker container stop kind-<name> --time <duration>, but it needs an inhibitor lock to take advantage of the time allowance. Given that the default KIND image already has systemd running, what's the reason KIND is not the right project to test node shutdown?
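A minimal sketch of the stop command described above, built as an argument list so it can be inspected before anything is run. The helper name and the container name "my-kind-node" are hypothetical placeholders; the actual node container name depends on the cluster.

```python
import shlex


def stop_command(container: str, timeout_seconds: int) -> list[str]:
    """Build a `docker container stop` invocation with a stop timeout.

    `--time` is how long Docker waits after sending the stop signal
    before it kills the container.
    """
    return [
        "docker", "container", "stop",
        "--time", str(timeout_seconds),
        container,
    ]


cmd = stop_command("my-kind-node", 60)  # hypothetical container name
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Without an inhibitor lock held inside the node, nothing delays the shutdown on the node's side, so the extra time goes unused.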

@aojea
Copy link
Contributor

aojea commented Jun 12, 2024

kind does a lot of hacks to emulate a VM, are you sure all the behaviors triggered by a VM shutdown can be simulated with docker container stop? ... kind does not have dbus, for example

@tanvp112
Author

Yes. But I think the bigger issue here is that DBus is needed for kubelet to obtain the lock inside the node.

@aojea
Contributor

aojea commented Jun 12, 2024

so, is it worth adding it to kind, or should this just be tested in a VM? the latter seems more appropriate to me

@tanvp112
Author

Why not? Note that it is not impossible to add dbus to a container; rke2 has had this working for some time already. Whether it is worth it, I guess, depends on how difficult it is to maintain in the long run, certainly not on dbus or systemd.

Feel free to close this request if graceful shutdown is too much trouble and KIND doesn't see the value in it. You are certainly right that there are many other options out there.

@aojea
Contributor

aojea commented Jun 13, 2024

We don't have lifecycle on the container, and the VM emulation fails in areas such as the abstraction of host resources... Adding something incomplete that will not provide full coverage seems like too much cost for a small ROI ...

@BenTheElder
Member

As a general statement: System integration / node level isolation is going to be problematic with kind, because fundamentally we're sharing a host kernel. It's pretty OK for testing distributed behavior, manifests, controller logic, etc, but when we get into kubelet/node stuff it may or may not be appropriate. We try to enable these but it's sometimes pretty messy and it hasn't been the main focus. (see e.g. #1963 and the problems there)

> Why not? Note that it is not impossible to add dbus to a container; rke2 has had this working for some time already. Whether it is worth it, I guess, depends on how difficult it is to maintain in the long run, certainly not on dbus or systemd.

I'm not that familiar with dbus, but this looks plausible to me. We'd have to investigate this more, it needs to be isolated from the other containers / host.

> We don't have lifecycle on the container,

I think this is the bigger problem. We usually have users simulate node addition/removal using taints, which is sufficient for testing most applications. For testing Kubernetes with disruptive node behaviors we typically still use cloud VMs, where we have better isolation and control.
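As a rough sketch of the taint approach, assuming kubectl is available and using a made-up taint key: a NoExecute taint evicts pods that do not tolerate it, which approximates a node going away, and the same spec with a trailing "-" removes the taint again. The helper name, node name, and key are hypothetical.

```python
VALID_EFFECTS = {"NoSchedule", "PreferNoSchedule", "NoExecute"}


def taint_command(
    node: str, key: str, value: str,
    effect: str = "NoExecute", remove: bool = False,
) -> list[str]:
    """Build a `kubectl taint nodes` invocation that adds or removes a taint."""
    if effect not in VALID_EFFECTS:
        raise ValueError(f"unknown taint effect: {effect}")
    spec = f"{key}={value}:{effect}"
    if remove:
        spec += "-"  # a trailing dash tells kubectl to remove the taint
    return ["kubectl", "taint", "nodes", node, spec]


# Hypothetical node name and taint key, for illustration only.
node, key = "my-worker-node", "example.com/simulated-removal"
print(" ".join(taint_command(node, key, "true")))               # "remove" the node
print(" ".join(taint_command(node, key, "true", remove=True)))  # bring it back
```

This exercises eviction and rescheduling, but not kubelet's own shutdown handling on the node.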

kind only has "create cluster" and "delete cluster" currently (and ... introducing other mutations would get pretty complex)

Inhibiting shutdown here would be somewhat at odds with any container deletion behavior (e.g. this thread #2272 (comment)), but that's not necessarily a deal-breaker for tests if we inhibit for less time than the timeout, or simulate shutdown signal without actually calling kind delete cluster (maybe we signal systemd instead?)

More importantly: If you're writing tests for graceful shutdown in github.com/kubernetes/kubernetes, I'd say these should probably be under node_e2e and SIG node pretty much only supports node_e2e with kube-up.sh currently, to my knowledge.

I'd love to enable these someday, but I'm not sure if that's even something the SIG node maintainers are interested in, overall, and if not then we should probably continue to focus on cluster e2e (test/e2e).

If I've misread and you have another use case please elaborate 😅 .

@BenTheElder
Member

(also this week is KEP freeze amongst other things so please bear with response times ...)

4 participants