New command: kube stop cluster and kube start cluster #1867

Closed
felipecrs opened this issue Sep 24, 2020 · 16 comments
Labels
kind/design Categorizes issue or PR as related to design.
kind/feature Categorizes issue or PR as related to a new feature.
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.
priority/backlog Higher priority than priority/awaiting-more-evidence.

Comments

@felipecrs
Contributor

felipecrs commented Sep 24, 2020

This command would simply run docker stop and docker start against all the nodes. Although I can do that myself, the nodes seem to restart automatically when stopped. Perhaps kind is creating the containers with --restart always instead of --restart unless-stopped?

It would be better for this to live in kind, since multiple clusters can co-exist and kind knows exactly which containers belong to a given cluster.

I use kind in my development environment, which has limited resources. I have a testing cluster set up that I would not like to lose, but I don't always use it, so I could simply start the cluster as needed and keep my laptop cool otherwise. :)
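
Roughly, the proposed commands could map to something like the following (a sketch only; the --name flag and the cluster name here are illustrative, though kind get nodes itself already exists today):

$ kind get nodes --name my-cluster | xargs docker stop    # stop all node containers
$ kind get nodes --name my-cluster | xargs docker start   # start them again later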

@felipecrs felipecrs added the kind/feature Categorizes issue or PR as related to a new feature. label Sep 24, 2020
@BenTheElder BenTheElder added the priority/backlog Higher priority than priority/awaiting-more-evidence. label Sep 24, 2020
@BenTheElder
Member

kind is not setting restart = always.

// https://docs.docker.com/engine/reference/commandline/run/#restart-policies---restart
//
// What we desire is:
// - restart on host / dockerd reboot
// - don't restart for any other reason
//
// This means:
// - no is out of the question ... it never restarts
// - always is a poor choice, we'll keep trying to restart nodes that were
// never going to work
// - unless-stopped will also retry failures indefinitely, similar to always
// except that it won't restart when the container is `docker stop`ed
// - on-failure is not great, we're only interested in restarting on
// reboots, not failures. *however* we can limit the number of retries
// *and* it forgets all state on dockerd restart and retries anyhow.
// - on-failure:0 is what we want .. restart on failures, except max
// retries is 0, so only restart on reboots.
// however this _actually_ means the same thing as always
// so the closest thing is on-failure:1, which will retry *once*
"--restart=on-failure:1",

You can trivially run start/stop against the node containers yourself (kind get nodes | xargs docker stop), but I think you'll find that doesn't behave as you'd expect because of the nested containers.

The podman backend also currently does not support restart.
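
As a quick check of the policy actually in effect, you can inspect a node container (the expected output, given the flag above, is on-failure with a maximum retry count of 1):

$ docker inspect -f '{{.HostConfig.RestartPolicy.Name}}:{{.HostConfig.RestartPolicy.MaximumRetryCount}}' kind-control-plane
on-failure:1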

@BenTheElder
Member

I use kind in my development environment, which has limited resources. I have a testing cluster set up that I would not like to lose, but I don't always use it, so I could simply start the cluster as needed and keep my laptop cool otherwise. :)

I would strongly prefer to improve the experience of creating new clusters instead, though. We do not want users becoming highly attached to their kind clusters. Testing should start from a clean state, and critical data should not be stored permanently in these clusters. They should start quickly and be disposable.

@felipecrs
Contributor Author

I wonder if it's possible to change the restart policy dynamically.

@BenTheElder
Member

BenTheElder commented Sep 25, 2020 via email

@felipecrs
Contributor Author

Actually, it's doable:

docker update --restart=unless-stopped kind-control-plane

https://docs.docker.com/engine/reference/commandline/update/#update-a-containers-restart-policy

But for some reason it fails:

$ docker update --restart=unless-stopped kind-control-plane
Error response from daemon: Cannot update container 2912eb63c333ce9395d428452474e7d7109167aeffb47196a6484c741705cfcd: runc did not terminate sucessfully: failed to write "a *:* rwm" to "/sys/fs/cgroup/devices/docker/2912eb63c333ce9395d428452474e7d7109167aeffb47196a6484c741705cfcd/devices.allow": write /sys/fs/cgroup/devices/docker/2912eb63c333ce9395d428452474e7d7109167aeffb47196a6484c741705cfcd/devices.allow: invalid argument
: unknown

@BenTheElder BenTheElder added the kind/design Categorizes issue or PR as related to design. label Sep 25, 2020
@BenTheElder
Member

In any case, on-failure should only restart the container when it exits uncleanly, and with a retry limit of 1 it should only restart once. unless-stopped is not a desirable policy; see the code comment above.

It also starts on bootup. If it stops and you reboot, it will start again, but that's the case with all restart policies except no.

@felipecrs
Contributor Author

You're right. unless-stopped does not help anyway (the container is indeed restarted when the system starts).

So, for kind stop cluster, we would need to first call docker update --restart=no kind-control-plane and then docker stop kind-control-plane.

For kind start cluster: docker update --restart=on-failure:1 kind-control-plane and then docker start kind-control-plane.
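
Applied to every node of a cluster, that flow would look roughly like this (a sketch only; kind get nodes --name is used here just to enumerate the node containers, and as the next comment shows, the update step currently fails on kind nodes):

$ # stop the cluster
$ kind get nodes --name my-cluster | xargs docker update --restart=no
$ kind get nodes --name my-cluster | xargs docker stop
$ # start it again later
$ kind get nodes --name my-cluster | xargs docker update --restart=on-failure:1
$ kind get nodes --name my-cluster | xargs docker start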

@felipecrs
Contributor Author

However, for some reason docker update does not work with the kind node containers (it works on other containers, though):

$ docker update --restart=no test
test
$ docker update --restart=no kind-control-plane
Error response from daemon: Cannot update container 2912eb63c333ce9395d428452474e7d7109167aeffb47196a6484c741705cfcd: runc did not terminate sucessfully: failed to write "a *:* rwm" to "/sys/fs/cgroup/devices/docker/2912eb63c333ce9395d428452474e7d7109167aeffb47196a6484c741705cfcd/devices.allow": write /sys/fs/cgroup/devices/docker/2912eb63c333ce9395d428452474e7d7109167aeffb47196a6484c741705cfcd/devices.allow: invalid argument
: unknown

@BenTheElder
Member

BenTheElder commented Sep 25, 2020 via email

@BenTheElder
Member

xref: #1913

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 1, 2021
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 3, 2021
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-contributor-experience at kubernetes/community.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-contributor-experience at kubernetes/community.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@DylanBowden

DylanBowden commented Jul 1, 2022

FYI, running things the other way around works for me:

  • docker stop kind-control-plane, then
  • docker update --restart=no kind-control-plane
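
For a multi-node cluster, that same order could be applied to every node container (a sketch; kind get nodes --name is assumed here only to list the cluster's nodes):

$ kind get nodes --name my-cluster | xargs docker stop
$ kind get nodes --name my-cluster | xargs docker update --restart=no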

@BenTheElder
Member

xref: #2715

There has been more work on restarts for docker (podman is still lacking some functionality); there is more recent discussion in #2715.
