Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] k3d IPAM (to prevent etcd failures) #550

Closed
mindkeep opened this issue Apr 7, 2021 · 26 comments
Closed

[FEATURE] k3d IPAM (to prevent etcd failures) #550

mindkeep opened this issue Apr 7, 2021 · 26 comments
Assignees
Labels
enhancement New feature or request priority/high runtime Issue with the container runtime (docker)
Milestone

Comments

@mindkeep
Copy link

mindkeep commented Apr 7, 2021

Feature Request

  • IPAM to keep static IPs at least for the server node IPs
  • Ensure that they stay static across cluster, container and host restarts

Original Bug Report

What did you do

I was toying with system resiliency around restarting servers and agents, and found that the Init server didn't come back after the following sequence:

% cat break-k3d.sh
#!/bin/bash
set -x
k3d cluster create --servers 3 --agents 3 test-k3d
docker stop k3d-test-k3d-server-0
docker stop k3d-test-k3d-agent-1
docker start k3d-test-k3d-agent-1
docker start k3d-test-k3d-server-0
sleep 120
docker logs k3d-test-k3d-server-0
%

where we wind up with this message repeated:

time="2021-04-06T21:20:30.180484636Z" level=info msg="Failed to test data store connection: this server is a not a member of the etcd cluster. Found [k3d-test-k3d-server-1-78ae1551=https://172.20.0.3:2380 k3d-test-k3d-server-0-3e94d5e1=https://172.20.0.2:2380 k3d-test-k3d-server-2-31d30714=https://172.20.0.4:2380], expect: k3d-test-k3d-server-0-3e94d5e1=172.20.0.6"

As it turns out the IP for k3d-test-k3d-server-0 moves from 172.20.0.2 to 172.20.0.6 (presumably swapping with k3d-test-k3d-agent-1). Is there any way to lock this down a bit so docker doesn't accidentally flip things around?

What did you expect to happen

I expected the IPs to be maintained...

Which OS & Architecture

  • Linux amd64

Which version of k3d

  • output of k3d version
% k3d version
k3d version v4.4.1
k3s version v1.20.5-k3s1 (default)

Which version of docker

  • output of docker version and docker info
% docker version
Client:
 Version:           19.03.15
 API version:       1.40
 Go version:        go1.15.7
 Git commit:        420b1d3625
 Built:             Thu Feb 11 18:16:00 2021
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          19.03.15
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.15.7
  Git commit:       420b1d3625
  Built:            Thu Feb 11 18:14:30 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.3.9
  GitCommit:        ea765ab
 runc:
  Version:          1.0.0-rc10
  GitCommit:        dc9208a3303feef5b3839f4323d9beb36df0a9dd
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683b971d9c3ef73f284f176672c44b448662

% docker info
Client:
 Debug Mode: false

Server:
 Containers: 8
  Running: 7
  Paused: 0
  Stopped: 1
 Images: 3
 Server Version: 19.03.15
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: ea765ab
 runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
 init version: fec3683b971d9c3ef73f284f176672c44b448662
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 5.4.97-gentoo
 Operating System: Gentoo/Linux
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 15.62GiB
 Name: crossbow
 ID: BXTV:WYUT:UXM3:IW4V:4AUA:JU6Q:ZYSE:APB4:JWOZ:UTUS:ZKRK:SEDG
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
@mindkeep mindkeep added the bug Something isn't working label Apr 7, 2021
@iwilltry42 iwilltry42 self-assigned this Apr 7, 2021
@iwilltry42 iwilltry42 added this to the v4.4.2 milestone Apr 7, 2021
@iwilltry42 iwilltry42 added priority/high runtime Issue with the container runtime (docker) labels Apr 7, 2021
@iwilltry42
Copy link
Member

Hi @mindkeep , thanks for opening this issue!
You're right there, we saw this issue already in the context of #262 (#262 (comment)).
I'll investigate a bit on what we can do there 👍
(FWIW: old related moby/docker issue: moby/moby#2801)

@iwilltry42
Copy link
Member

Log

If there's no flag for k3s/etcd that we could use to work around this, we'll have to check how feasible it is to manage static IPs for k3d nodes (i.e. similar to using the --ip flag in docker).

@mindkeep
Copy link
Author

mindkeep commented Apr 7, 2021

Worth noting, if I shut down both containers and restart k3d-test-k3d-server-0 first, we get the original IP ordering and everything comes back up.

So it's purely a docker IP ordering problem.

Given that we already spin up a new docker network, could we explore assigning more static IPs within that network?

@iwilltry42
Copy link
Member

So it's purely a docker IP ordering problem.

Yep..

could we explore assigning more static IPs

Implementing some IPAM in k3d would be the only possible solution, yep 🤔

Let's turn this into a feature request 👍

@iwilltry42 iwilltry42 modified the milestones: v4.4.2, v4.5.0 Apr 7, 2021
@iwilltry42 iwilltry42 added enhancement New feature or request and removed bug Something isn't working labels Apr 7, 2021
@iwilltry42 iwilltry42 changed the title [BUG] Inconsistent IP addresses after starting and stopping containers causes etcd errors [FEATURE] k3d IPAM (to prevent etcd failures) Apr 7, 2021
@iwilltry42
Copy link
Member

iwilltry42 commented Apr 14, 2021

Hi @mindkeep, while I'm still trying to figure out the best way to cope with this, I already did some work on this.
Please give https://github.com/rancher/k3d/releases/tag/v4.5.0-dev.0 a try and start your cluster with --subnet auto or --subnet 172.45.0.0/16 (or whatever subnet it should use).
With --subnet auto, k3d will create a fake docker network to get a subnet auto-assigned by docker that it can use.
Also please make use of the --trace flag for verbose logs.
Awaiting your feedback :)

@bukowa
Copy link

bukowa commented Apr 14, 2021

@iwilltry42

First try

$ k3d2 cluster create test25 --servers=3 --subnet=auto
←[36mINFO←[0m[0000] Prep: Network
←[36mINFO←[0m[0000] Created network 'k3d-test25' (3ad95731bfce7029df8a7f7540daaa02c2fb305a10e24d531e0ed6e6756d08b2)
←[36mINFO←[0m[0000] Created volume 'k3d-test25-images'
←[36mINFO←[0m[0000] Creating initializing server node
←[36mINFO←[0m[0000] Creating node 'k3d-test25-server-0'
←[36mINFO←[0m[0002] Pulling image 'docker.io/rancher/k3s:v1.20.5-k3s1'
←[36mINFO←[0m[0011] Creating node 'k3d-test25-server-1'
←[36mINFO←[0m[0012] Creating node 'k3d-test25-server-2'
←[36mINFO←[0m[0012] Creating LoadBalancer 'k3d-test25-serverlb'
←[36mINFO←[0m[0014] Pulling image 'docker.io/rancher/k3d-proxy:v4.5.0-dev.0'
←[36mINFO←[0m[0019] Starting cluster 'test25'
←[36mINFO←[0m[0019] Starting the initializing server...
←[36mINFO←[0m[0019] Starting Node 'k3d-test25-server-0'
←[36mINFO←[0m[0020] Starting servers...
←[36mINFO←[0m[0020] Starting Node 'k3d-test25-server-1'
←[31mERRO←[0m[0020] Failed to start node 'k3d-test25-server-1'
←[31mERRO←[0m[0020] Failed Cluster Start: Failed to start server k3d-test25-server-1: Error response from daemon: Address already in use
←[31mERRO←[0m[0020] Failed to create cluster >>> Rolling Back
←[36mINFO←[0m[0020] Deleting cluster 'test25'
←[36mINFO←[0m[0020] Deleted k3d-test25-server-0
←[36mINFO←[0m[0020] Deleted k3d-test25-server-1
←[36mINFO←[0m[0020] Deleted k3d-test25-server-2
←[36mINFO←[0m[0020] Deleted k3d-test25-serverlb
←[36mINFO←[0m[0020] Deleting cluster network 'k3d-test25'
←[36mINFO←[0m[0021] Deleting image volume 'k3d-test25-images'
←[31mFATA←[0m[0021] Cluster creation FAILED, all changes have been rolled back!
$ k3d2 cluster create test25 --servers=3 --subnet=172.45.0.0/16
←[36mINFO←[0m[0000] Prep: Network
←[36mINFO←[0m[0000] Created network 'k3d-test25' (f7900949f4b380af681d7a9f1b39992b971cbbffc94bd57ee254696dc9812fbb)
←[36mINFO←[0m[0000] Created volume 'k3d-test25-images'
←[36mINFO←[0m[0000] Creating initializing server node
←[36mINFO←[0m[0000] Creating node 'k3d-test25-server-0'
←[36mINFO←[0m[0001] Creating node 'k3d-test25-server-1'
←[36mINFO←[0m[0002] Creating node 'k3d-test25-server-2'
←[36mINFO←[0m[0002] Creating LoadBalancer 'k3d-test25-serverlb'
←[36mINFO←[0m[0002] Starting cluster 'test25'
←[36mINFO←[0m[0002] Starting the initializing server...
←[36mINFO←[0m[0002] Starting Node 'k3d-test25-server-0'
←[36mINFO←[0m[0003] Starting servers...
←[36mINFO←[0m[0003] Starting Node 'k3d-test25-server-1'
←[31mERRO←[0m[0003] Failed to start node 'k3d-test25-server-1'
←[31mERRO←[0m[0003] Failed Cluster Start: Failed to start server k3d-test25-server-1: Error response from daemon: Address already in use
←[31mERRO←[0m[0003] Failed to create cluster >>> Rolling Back
←[36mINFO←[0m[0003] Deleting cluster 'test25'
←[36mINFO←[0m[0003] Deleted k3d-test25-server-0
←[36mINFO←[0m[0003] Deleted k3d-test25-server-1
←[36mINFO←[0m[0003] Deleted k3d-test25-server-2
←[36mINFO←[0m[0003] Deleted k3d-test25-serverlb
←[36mINFO←[0m[0003] Deleting cluster network 'k3d-test25'
←[36mINFO←[0m[0003] Deleting image volume 'k3d-test25-images'
←[31mFATA←[0m[0003] Cluster creation FAILED, all changes have been rolled back!

@iwilltry42
Copy link
Member

@bukowa ... well.. crap 😁
Forgot to add a case before building. 🙄

@bukowa
Copy link

bukowa commented Apr 14, 2021

@iwilltry42

After restarting docker and deleting that cluster, my old clusters with 1 server won't bootup :(

E0414 11:40:13.438977 7 controller.go:116] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: error trying to reach service: dial tcp 10.43.239.9:443: connect: no route to host

, Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]

I0414 11:40:13.439010 7 controller.go:129] OpenAPI AggregationController: action for item v1beta1.metrics.k8s.io: Rate Limited Requeue.

E0414 11:40:13.440809 7 available_controller.go:508] v1beta1.metrics.k8s.io failed with: Operation cannot be fulfilled on apiservices.apiregistration.k8s.io "v1beta1.metrics.k8s.io": the object has been modified; please apply your changes to the latest version and try again

I0414 11:40:14.147032 7 range_allocator.go:82] Sending events to api server.

I0414 11:40:14.147149 7 range_allocator.go:110] No Service CIDR provided. Skipping filtering out service addresses.

I0414 11:40:14.147175 7 range_allocator.go:116] No Secondary Service CIDR provided. Skipping filtering out secondary service addresses.

I0414 11:40:14.147195 7 controllermanager.go:554] Started "nodeipam"

I0414 11:40:14.147300 7 node_ipam_controller.go:159] Starting ipam controller

I0414 11:40:14.147315 7 shared_informer.go:240] Waiting for caches to sync for node

I0414 11:40:14.149398 7 controllermanager.go:554] Started "persistentvolume-expander"

I0414 11:40:14.149475 7 expand_controller.go:310] Starting expand controller

I0414 11:40:14.149482 7 shared_informer.go:240] Waiting for caches to sync for expand

I0414 11:40:14.151273 7 controllermanager.go:554] Started "clusterrole-aggregation"

I0414 11:40:14.151297 7 clusterroleaggregation_controller.go:149] Starting ClusterRoleAggregator

I0414 11:40:14.151304 7 shared_informer.go:240] Waiting for caches to sync for ClusterRoleAggregator

I0414 11:40:14.152995 7 controllermanager.go:554] Started "job"

I0414 11:40:14.153102 7 job_controller.go:148] Starting job controller

I0414 11:40:14.153113 7 shared_informer.go:240] Waiting for caches to sync for job

I0414 11:40:14.154074 7 shared_informer.go:240] Waiting for caches to sync for resource quota

W0414 11:40:14.156670 7 actual_state_of_world.go:534] Failed to update statusUpdateNeeded field in actual state of world: Failed to set statusUpdateNeeded to needed true, because nodeName="k3d-k3s-default-server-0" does not exist

I0414 11:40:14.158994 7 shared_informer.go:247] Caches are synced for persistent volume

E0414 11:40:14.160100 7 memcache.go:196] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request

I0414 11:40:14.161068 7 shared_informer.go:247] Caches are synced for attach detach

I0414 11:40:14.165053 7 shared_informer.go:247] Caches are synced for TTL

I0414 11:40:14.166524 7 shared_informer.go:247] Caches are synced for PVC protection

I0414 11:40:14.169107 7 shared_informer.go:247] Caches are synced for endpoint

I0414 11:40:14.181276 7 shared_informer.go:247] Caches are synced for HPA

I0414 11:40:14.187013 7 shared_informer.go:247] Caches are synced for PV protection

I0414 11:40:14.187210 7 event.go:291] "Event occurred" object="lens-metrics/kube-state-metrics" kind="Endpoints" apiVersion="v1" type="Warning" reason="FailedToUpdateEndpoint" message="Failed to update endpoint lens-metrics/kube-state-metrics: Operation cannot be fulfilled on endpoints \"kube-state-metrics\": the object has been modified; please apply your changes to the latest version and try again"

I0414 11:40:14.195398 7 shared_informer.go:247] Caches are synced for ReplicationController

I0414 11:40:14.198810 7 shared_informer.go:247] Caches are synced for GC

I0414 11:40:14.218147 7 shared_informer.go:247] Caches are synced for namespace

I0414 11:40:14.218332 7 shared_informer.go:247] Caches are synced for endpoint_slice_mirroring

I0414 11:40:14.221166 7 shared_informer.go:247] Caches are synced for disruption

I0414 11:40:14.221254 7 disruption.go:339] Sending events to api server.

I0414 11:40:14.224425 7 shared_informer.go:247] Caches are synced for certificate-csrsigning-kubelet-serving

I0414 11:40:14.224557 7 shared_informer.go:247] Caches are synced for certificate-csrsigning-kubelet-client

I0414 11:40:14.224609 7 shared_informer.go:247] Caches are synced for certificate-csrsigning-legacy-unknown

I0414 11:40:14.224726 7 shared_informer.go:247] Caches are synced for certificate-csrsigning-kube-apiserver-client

I0414 11:40:14.226641 7 shared_informer.go:247] Caches are synced for endpoint_slice

I0414 11:40:14.228199 7 shared_informer.go:247] Caches are synced for service account

I0414 11:40:14.230801 7 shared_informer.go:247] Caches are synced for ReplicaSet

I0414 11:40:14.234906 7 shared_informer.go:247] Caches are synced for certificate-csrapproving

I0414 11:40:14.234977 7 shared_informer.go:247] Caches are synced for taint

I0414 11:40:14.235281 7 node_lifecycle_controller.go:1429] Initializing eviction metric for zone:

W0414 11:40:14.235413 7 node_lifecycle_controller.go:1044] Missing timestamp for Node k3d-k3s-default-server-0. Assuming now as a timestamp.

I0414 11:40:14.238699 7 taint_manager.go:187] Starting NoExecuteTaintManager

I0414 11:40:14.238856 7 event.go:291] "Event occurred" object="k3d-k3s-default-server-0" kind="Node" apiVersion="v1" type="Normal" reason="RegisteredNode" message="Node k3d-k3s-default-server-0 event: Registered Node k3d-k3s-default-server-0 in Controller"

I0414 11:40:14.239237 7 node_lifecycle_controller.go:1245] Controller detected that zone is now in state Normal.

I0414 11:40:14.247401 7 shared_informer.go:247] Caches are synced for node

I0414 11:40:14.247602 7 range_allocator.go:172] Starting range CIDR allocator

I0414 11:40:14.247663 7 shared_informer.go:240] Waiting for caches to sync for cidrallocator

I0414 11:40:14.247703 7 shared_informer.go:247] Caches are synced for cidrallocator

I0414 11:40:14.249530 7 shared_informer.go:247] Caches are synced for deployment

I0414 11:40:14.250710 7 shared_informer.go:247] Caches are synced for expand

I0414 11:40:14.251616 7 shared_informer.go:247] Caches are synced for ClusterRoleAggregator

I0414 11:40:14.272632 7 shared_informer.go:247] Caches are synced for daemon sets

I0414 11:40:14.284690 7 shared_informer.go:247] Caches are synced for stateful set

I0414 11:40:14.392178 7 shared_informer.go:247] Caches are synced for crt configmap

I0414 11:40:14.453240 7 shared_informer.go:247] Caches are synced for job

I0414 11:40:14.454227 7 shared_informer.go:247] Caches are synced for resource quota

I0414 11:40:14.504974 7 shared_informer.go:247] Caches are synced for resource quota

E0414 11:40:14.804091 7 memcache.go:101] couldn't get resource list for metrics.k8s.io/v1beta1: the server is currently unable to handle the request

I0414 11:40:14.806192 7 shared_informer.go:240] Waiting for caches to sync for garbage collector

I0414 11:40:14.906439 7 shared_informer.go:247] Caches are synced for garbage collector

I0414 11:40:14.949133 7 shared_informer.go:247] Caches are synced for garbage collector

I0414 11:40:14.949164 7 garbagecollector.go:151] Garbage collector: all resource monitors have synced. Proceeding to collect garbage

E0414 11:40:16.558960 7 available_controller.go:508] v1beta1.metrics.k8s.io failed with: failing or missing response from https://10.43.239.9:443/apis/metrics.k8s.io/v1beta1: Get "https://10.43.239.9:443/apis/metrics.k8s.io/v1beta1": dial tcp 10.43.239.9:443: connect: no route to host

W0414 11:40:17.065119 7 lease.go:233] Resetting endpoints for master service "kubernetes" to [172.18.0.3]

time="2021-04-14T11:40:17.067400700Z" level=info msg="Stopped tunnel to 172.18.0.2:6443"

time="2021-04-14T11:40:17.067465300Z" level=info msg="Proxy done" err="context canceled" url="wss://172.18.0.2:6443/v1-k3s/connect"

time="2021-04-14T11:40:17.067664600Z" level=info msg="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"

I0414 11:44:52.803254 7 network_policy_controller.go:157] Shutting down network policies full sync goroutine

time="2021-04-14T11:44:52.805001100Z" level=info msg="Shutting down batch/v1, Kind=Job workers"

time="2021-04-14T11:44:52.805117200Z" level=info msg="Shutting down helm.cattle.io/v1, Kind=HelmChartConfig workers"

time="2021-04-14T11:44:52.805641500Z" level=info msg="Shutting down /v1, Kind=Endpoints workers"

time="2021-04-14T11:44:52.807819300Z" level=info msg="Shutting down /v1, Kind=Node workers"

@iwilltry42
Copy link
Member

@bukowa can you inspect those node containers?
And can you check if there's an overlapping docker network (docker network ls & docker network inspect)?

@bukowa
Copy link

bukowa commented Apr 14, 2021

@iwilltry42 no overlapping networks + all nodes started right now after another docker restart, strange :) before they were in exit loop

@iwilltry42
Copy link
Member

@bukowa
Copy link

bukowa commented Apr 15, 2021

@iwilltry42 i made a small github action that may be usefull to test this https://github.com/bukowa/k3dtest/runs/2352095712?check_suite_focus=true

@iwilltry42
Copy link
Member

Thanks for that @bukowa , I forked it and added kubectl to check that output.
It seems to work, right?
E.g. https://github.com/iwilltry42/k3dtest/runs/2352325264

@bukowa
Copy link

bukowa commented Apr 15, 2021

@iwilltry42 seems to work but not on my machine ^^

ff192959bbdd   rancher/k3d-proxy:v4.5.0-dev.1   "/bin/sh -c nginx-pr…"   25 minutes ago   Restarting (1) 39 seconds ago                                                                                 k3d-test55-serverlb
db451387ad46   rancher/k3s:v1.20.5-k3s1         "/bin/k3s server --t…"   25 minutes ago   Exited (255) 23 minutes ago                                                                                   k3d-test55-server-2
10ec4e821999   rancher/k3s:v1.20.5-k3s1         "/bin/k3s server --t…"   25 minutes ago   Up 21 minutes                                                                                                 k3d-test55-server-1
ff7e15fb60bb   rancher/k3s:v1.20.5-k3s1         "/bin/k3s server --c…"   25 minutes ago   Exited (255) 23 minutes ago                                                                                   k3d-test55-server-0

server0.txt
server1.txt
server2.txt
serverlb.txt

@bukowa
Copy link

bukowa commented Apr 15, 2021

@iwilltry42 i think ive got something, because i still have old clusters running:

ff192959bbdd   rancher/k3d-proxy:v4.5.0-dev.1   "/bin/sh -c nginx-pr…"   33 minutes ago   Up 20 seconds                  80/tcp, 0.0.0.0:52232->6443/tcp                                               k3d-test55-serverlb
db451387ad46   rancher/k3s:v1.20.5-k3s1         "/bin/k3s server --t…"   33 minutes ago   Up 21 seconds                                                                                                k3d-test55-server-2
10ec4e821999   rancher/k3s:v1.20.5-k3s1         "/bin/k3s server --t…"   33 minutes ago   Up 20 seconds                                                                                                k3d-test55-server-1
ff7e15fb60bb   rancher/k3s:v1.20.5-k3s1         "/bin/k3s server --c…"   33 minutes ago   Up 20 seconds                                                                                                k3d-test55-server-0
4e8e8de805e9   rancher/k3s:v1.20.4-k3s1         "/bin/k3s server --t…"   2 weeks ago      Exited (255) 23 seconds ago    0.0.0.0:32080->32080/tcp, 0.0.0.0:32443->32443/tcp, 0.0.0.0:65336->6443/tcp   k3d-wpenv-server-0
afee5e598565   rancher/k3d-proxy:v4.3.0         "/bin/sh -c nginx-pr…"   4 weeks ago      Restarting (1) 2 seconds ago                                                                                 k3d-k3s-default-serverlb
0bdf54e2bf41   rancher/k3s:v1.20.4-k3s1         "/bin/k3s server --t…"   4 weeks ago      Exited (255) 23 seconds ago                                                                                  k3d-k3s-default-server-0

now each time i restart docker looks like alive nodes are shifting, take a look:

restart 1:

ff192959bbdd   rancher/k3d-proxy:v4.5.0-dev.1   "/bin/sh -c nginx-pr…"   36 minutes ago   Up 3 seconds                  80/tcp, 0.0.0.0:52232->6443/tcp                                               k3d-test55-serverlb
db451387ad46   rancher/k3s:v1.20.5-k3s1         "/bin/k3s server --t…"   36 minutes ago   Exited (255) 47 seconds ago                                                                                 k3d-test55-server-2
10ec4e821999   rancher/k3s:v1.20.5-k3s1         "/bin/k3s server --t…"   36 minutes ago   Exited (255) 47 seconds ago                                                                                 k3d-test55-server-1
ff7e15fb60bb   rancher/k3s:v1.20.5-k3s1         "/bin/k3s server --c…"   36 minutes ago   Exited (255) 47 seconds ago                                                                                 k3d-test55-server-0
4e8e8de805e9   rancher/k3s:v1.20.4-k3s1         "/bin/k3s server --t…"   2 weeks ago      Up 45 seconds                 0.0.0.0:32080->32080/tcp, 0.0.0.0:32443->32443/tcp, 0.0.0.0:65336->6443/tcp   k3d-wpenv-server-0
afee5e598565   rancher/k3d-proxy:v4.3.0         "/bin/sh -c nginx-pr…"   4 weeks ago      Up 45 seconds                 80/tcp, 0.0.0.0:50670->6443/tcp                                               k3d-k3s-default-serverlb
0bdf54e2bf41   rancher/k3s:v1.20.4-k3s1         "/bin/k3s server --t…"   4 weeks ago      Up 45 seconds                                                                                               k3d-k3s-default-server-0

restart 2:

ff192959bbdd   rancher/k3d-proxy:v4.5.0-dev.1   "/bin/sh -c nginx-pr…"   37 minutes ago   Up 1 second                       80/tcp, 0.0.0.0:52232->6443/tcp                                               k3d-test55-serverlb
db451387ad46   rancher/k3s:v1.20.5-k3s1         "/bin/k3s server --t…"   37 minutes ago   Up 23 seconds                                                                                                   k3d-test55-server-2
10ec4e821999   rancher/k3s:v1.20.5-k3s1         "/bin/k3s server --t…"   37 minutes ago   Up 23 seconds                                                                                                   k3d-test55-server-1
ff7e15fb60bb   rancher/k3s:v1.20.5-k3s1         "/bin/k3s server --c…"   37 minutes ago   Exited (255) About a minute ago                                                                                 k3d-test55-server-0
4e8e8de805e9   rancher/k3s:v1.20.4-k3s1         "/bin/k3s server --t…"   2 weeks ago      Exited (255) 25 seconds ago       0.0.0.0:32080->32080/tcp, 0.0.0.0:32443->32443/tcp, 0.0.0.0:65336->6443/tcp   k3d-wpenv-server-0
afee5e598565   rancher/k3d-proxy:v4.3.0         "/bin/sh -c nginx-pr…"   4 weeks ago      Up 2 seconds                      80/tcp, 0.0.0.0:50670->6443/tcp                                               k3d-k3s-default-serverlb
0bdf54e2bf41   rancher/k3s:v1.20.4-k3s1         "/bin/k3s server --t…"   4 weeks ago      Exited (255) 25 seconds ago                                                                                     k3d-k3s-default-server-0

restart 3:

ff192959bbdd   rancher/k3d-proxy:v4.5.0-dev.1   "/bin/sh -c nginx-pr…"   38 minutes ago   Up 2 seconds                  80/tcp, 0.0.0.0:52232->6443/tcp                                               k3d-test55-serverlb
db451387ad46   rancher/k3s:v1.20.5-k3s1         "/bin/k3s server --t…"   38 minutes ago   Exited (255) 26 seconds ago                                                                                 k3d-test55-server-2
10ec4e821999   rancher/k3s:v1.20.5-k3s1         "/bin/k3s server --t…"   38 minutes ago   Exited (255) 26 seconds ago                                                                                 k3d-test55-server-1
ff7e15fb60bb   rancher/k3s:v1.20.5-k3s1         "/bin/k3s server --c…"   38 minutes ago   Exited (255) 2 minutes ago                                                                                  k3d-test55-server-0
4e8e8de805e9   rancher/k3s:v1.20.4-k3s1         "/bin/k3s server --t…"   2 weeks ago      Up 25 seconds                 0.0.0.0:32080->32080/tcp, 0.0.0.0:32443->32443/tcp, 0.0.0.0:65336->6443/tcp   k3d-wpenv-server-0
afee5e598565   rancher/k3d-proxy:v4.3.0         "/bin/sh -c nginx-pr…"   4 weeks ago      Up 24 seconds                 80/tcp, 0.0.0.0:50670->6443/tcp                                               k3d-k3s-default-serverlb
0bdf54e2bf41   rancher/k3s:v1.20.4-k3s1         "/bin/k3s server --t…"   4 weeks ago      Up 23 seconds                                                                                               k3d-k3s-default-server-0

restart 4:

ff192959bbdd   rancher/k3d-proxy:v4.5.0-dev.1   "/bin/sh -c nginx-pr…"   40 minutes ago   Restarting (1) 7 seconds ago                                                                                    k3d-test55-serverlb
db451387ad46   rancher/k3s:v1.20.5-k3s1         "/bin/k3s server --t…"   40 minutes ago   Exited (255) 3 minutes ago                                                                                      k3d-test55-server-2
10ec4e821999   rancher/k3s:v1.20.5-k3s1         "/bin/k3s server --t…"   40 minutes ago   Up About a minute                                                                                               k3d-test55-server-1
ff7e15fb60bb   rancher/k3s:v1.20.5-k3s1         "/bin/k3s server --c…"   40 minutes ago   Up About a minute                                                                                               k3d-test55-server-0
4e8e8de805e9   rancher/k3s:v1.20.4-k3s1         "/bin/k3s server --t…"   2 weeks ago      Exited (255) About a minute ago   0.0.0.0:32080->32080/tcp, 0.0.0.0:32443->32443/tcp, 0.0.0.0:65336->6443/tcp   k3d-wpenv-server-0
afee5e598565   rancher/k3d-proxy:v4.3.0         "/bin/sh -c nginx-pr…"   4 weeks ago      Restarting (1) 6 seconds ago                                                                                    k3d-k3s-default-serverlb
0bdf54e2bf41   rancher/k3s:v1.20.4-k3s1         "/bin/k3s server --t…"   4 weeks ago      Exited (255) About a minute ago                                                                                 k3d-k3s-default-server-0
$ docker network ls
NETWORK ID     NAME              DRIVER    SCOPE
b436e5a0149c   bridge            bridge    local
e05bb6c1420e   host              host      local
56bb9cb725d6   k3d-k3s-default   bridge    local
ed22e183cae8   k3d-test55        bridge    local
5312bd5f60bd   k3d-wpenv         bridge    local
2208d8a45078   none              null      local
$ docker network inspect k3d-k3s-default
[
    {
        "Name": "k3d-k3s-default",
        "Id": "56bb9cb725d6eeb944f529cd964814845d5e5142029f83e06bfc85d584beb4aa",
        "Created": "2021-03-12T00:27:05.3874863Z",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.18.0.0/16",
                    "Gateway": "172.18.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "afee5e59856529cb6e304a423b93a79ff59b7439e4c4c8a5fc7b5be0a3f4c5b6": {
                "Name": "k3d-k3s-default-serverlb",
                "EndpointID": "7a9659b83a4c9d76b291a4461abeea0e8411af01cef9f2c85efae474aa221a38",
                "MacAddress": "02:42:ac:12:00:02",
                "IPv4Address": "172.18.0.2/16",
                "IPv6Address": ""
            }
        },
        "Options": {},
        "Labels": {
            "app": "k3d"
        }
    }
]
$ docker network inspect k3d-test55
[
    {
        "Name": "k3d-test55",
        "Id": "ed22e183cae8b7f29442f03cf95b81dd3ba12523e355aa8b9acfb9af84b281f6",
        "Created": "2021-04-15T11:45:01.0438683Z",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.20.0.0/16",
                    "Gateway": "172.20.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "10ec4e821999abf5a2c7a501cecafd8f13daea4842a9873f96985d36a8d7ef3f": {
                "Name": "k3d-test55-server-1",
                "EndpointID": "a130a1d83af61109579915e2b32fe898e1417e8ad65cd1eb563a578eda2d82ad",
                "MacAddress": "02:42:ac:14:00:03",
                "IPv4Address": "172.20.0.3/16",
                "IPv6Address": ""
            },
            "ff192959bbdd6fa5bd35e7fb3ba850b2156760c3cd6765159d311141cda3b885": {
                "Name": "k3d-test55-serverlb",
                "EndpointID": "a05b53563a77e22ba2a79f0560c01c260833cd4cdf9f1f59b270434a3814f42a",
                "MacAddress": "02:42:ac:14:00:04",
                "IPv4Address": "172.20.0.4/16",
                "IPv6Address": ""
            },
            "ff7e15fb60bb195de04aac993dc7456abcc3ea2adacd78a7bb4ee62ce278e098": {
                "Name": "k3d-test55-server-0",
                "EndpointID": "f84d9ba79007959197f4b5af7f1873f7a1e85da5f5aa23840963b9e45aea2f3b",
                "MacAddress": "02:42:ac:14:00:02",
                "IPv4Address": "172.20.0.2/16",
                "IPv6Address": ""
            }
        },
        "Options": {},
        "Labels": {
            "app": "k3d"
        }
    }
]
$ docker network inspect k3d-wpenv
[
    {
        "Name": "k3d-wpenv",
        "Id": "5312bd5f60bd523d046f919bf42c2b25c75463cc851aeb64e3bfff9b52583a11",
        "Created": "2021-03-28T22:57:14.5128859Z",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.19.0.0/16",
                    "Gateway": "172.19.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {},
        "Options": {},
        "Labels": {
            "app": "k3d"
        }
    }
]

@iwilltry42
Copy link
Member

@bukowa , I am like absolutely lost there 🤔
The network settings look fine to me and I cannot think of any way, that the clusters could interfere.. 🤔

@bukowa
Copy link

bukowa commented Apr 15, 2021

@iwilltry42 ok maybe this finding can become useful when someone encounters similar issue

@renepardon
Copy link

I tried to do k3d cluster stop dev ; k3d cluster start dev after I received the error from issue #262

Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes)

Now it hangs here:

INFO[0000] Stopping cluster 'dev'                       
INFO[0000] Starting cluster 'dev'                       
INFO[0000] Starting the initializing server...          
INFO[0000] Starting Node 'k3d-dev-server-0'             
INFO[0001] Starting servers...                          
INFO[0001] Starting Node 'k3d-dev-server-1'             

The cluster was initially created with:

k3d registry create registry.localhost --port 5000
k3d cluster create dev \
    --k3s-server-arg "--no-deploy=traefik" \
    --registry-use k3d-registry.localhost:5000 \
    --port 80:80@loadbalancer \
    --port 443:443@loadbalancer \
    --api-port 6443 --servers 3 --agents 3

@iwilltry42
Copy link
Member

Hi @renepardon , can you please share the logs of the k3d-dev-server-0 and k3d-dev-server-1 containers?
Also, which version of k3d and k3s are you using and what is your environment?

@renepardon
Copy link

renepardon commented May 27, 2021 via email

@renepardon
Copy link

@iwilltry42

k3d version v4.4.3
k3s version v1.20.6-k3s1 (default)

But I can't find the log files. Neither in my home directory, nor in /var/log

@iwilltry42
Copy link
Member

@renepardon , if you were to search for actual log files, they'd be in /var/lib/docker/containers/<ID>/<ID>.log. But you can just do e.g. docker logs k3d-dev-server-0 to get them 👍
Also, please try k3d v4.4.4 and paste your kernel version (uname -r).

@renepardon
Copy link

@iwilltry42

k3d-dev-server-0 log repeats infinite like this:

time="2021-06-07T06:42:26.583579237Z" level=info msg="Failed to test data store connection: this server is a not a member of the etcd cluster. Found [k3d-dev-server-1-b8022d18=https://172.21.0.4:2380 k3d-dev-server-2-bb40fdf0=https://172.21.0.5:2380 k3d-dev-server-0-f3a8f84f=https://172.21.0.3:2380], expect: k3d-dev-server-0-f3a8f84f=172.21.0.7"
time="2021-06-07T06:42:26.881852734Z" level=info msg="certificate CN=k3s,O=k3s signed by CN=k3s-server-ca@1622098330: notBefore=2021-05-27 06:52:10 +0000 UTC notAfter=2022-06-07 06:42:26 +0000 UTC"
time="2021-06-07T06:42:26.882548039Z" level=info msg="Cluster-Http-Server 2021/06/07 06:42:26 http: TLS handshake error from 172.21.0.6:48878: remote error: tls: bad certificate"
time="2021-06-07T06:42:26.883032016Z" level=info msg="certificate CN=k3s,O=k3s signed by CN=k3s-server-ca@1622098330: notBefore=2021-05-27 06:52:10 +0000 UTC notAfter=2022-06-07 06:42:26 +0000 UTC"
time="2021-06-07T06:42:27.350483832Z" level=info msg="certificate CN=k3s,O=k3s signed by CN=k3s-server-ca@1622098330: notBefore=2021-05-27 06:52:10 +0000 UTC notAfter=2022-06-07 06:42:27 +0000 UTC"
time="2021-06-07T06:42:27.351461555Z" level=info msg="Cluster-Http-Server 2021/06/07 06:42:27 http: TLS handshake error from 172.21.0.2:35352: remote error: tls: bad certificate"

and k3d-dev-server-1:

{"level":"info","ts":"2021-06-07T06:43:10.559Z","caller":"mvcc/index.go:189","msg":"compact tree index","revision":427102}
{"level":"info","ts":"2021-06-07T06:43:10.572Z","caller":"mvcc/kvstore_compaction.go:56","msg":"finished scheduled compaction","compact-revision":427102,"took":"12.938268ms"}
W0607 06:43:13.421471       8 handler_proxy.go:102] no RequestInfo found in the context
E0607 06:43:13.421512       8 controller.go:116] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
, Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]
I0607 06:43:13.421518       8 controller.go:129] OpenAPI AggregationController: action for item v1beta1.metrics.k8s.io: Rate Limited Requeue.
{"level":"warn","ts":"2021-06-07T06:43:13.504Z","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"be99668240cb5dc7","rtt":"0s","error":"dial tcp 172.21.0.3:2380: connect: connection refused"}
{"level":"warn","ts":"2021-06-07T06:43:13.504Z","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"be99668240cb5dc7","rtt":"0s","error":"dial tcp 172.21.0.3:2380: connect: connection refused"}
{"level":"warn","ts":"2021-06-07T06:43:18.504Z","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"be99668240cb5dc7","rtt":"0s","error":"dial tcp 172.21.0.3:2380: connect: connection refused"}
{"level":"warn","ts":"2021-06-07T06:43:18.504Z","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"be99668240cb5dc7","rtt":"0s","error":"dial tcp 172.21.0.3:2380: connect: connection refused"}
{"level":"warn","ts":"2021-06-07T06:43:23.504Z","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"be99668240cb5dc7","rtt":"0s","error":"dial tcp 172.21.0.3:2380: connect: connection refused"}
{"level":"warn","ts":"2021-06-07T06:43:23.504Z","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"be99668240cb5dc7","rtt":"0s","error":"dial tcp 172.21.0.3:2380: connect: connection refused"}
{"level":"warn","ts":"2021-06-07T06:43:28.504Z","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"be99668240cb5dc7","rtt":"0s","error":"dial tcp 172.21.0.3:2380: connect: connection refused"}
{"level":"warn","ts":"2021-06-07T06:43:28.504Z","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"be99668240cb5dc7","rtt":"0s","error":"dial tcp 172.21.0.3:2380: connect: connection refused"}
{"level":"warn","ts":"2021-06-07T06:43:33.504Z","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"be99668240cb5dc7","rtt":"0s","error":"dial tcp 172.21.0.3:2380: connect: connection refused"}
{"level":"warn","ts":"2021-06-07T06:43:33.504Z","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"be99668240cb5dc7","rtt":"0s","error":"dial tcp 172.21.0.3:2380: connect: connection refused"}
{"level":"warn","ts":"2021-06-07T06:43:38.505Z","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"be99668240cb5dc7","rtt":"0s","error":"dial tcp 172.21.0.3:2380: connect: connection refused"}
{"level":"warn","ts":"2021-06-07T06:43:38.505Z","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"be99668240cb5dc7","rtt":"0s","error":"dial tcp 172.21.0.3:2380: connect: connection refused"}
W0607 06:43:43.357254       8 info.go:53] Couldn't collect info from any of the files in "/etc/machine-id,/var/lib/dbus/machine-id"
{"level":"warn","ts":"2021-06-07T06:43:43.505Z","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"be99668240cb5dc7","rtt":"0s","error":"dial tcp 172.21.0.3:2380: connect: connection refused"}
{"level":"warn","ts":"2021-06-07T06:43:43.505Z","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"be99668240cb5dc7","rtt":"0s","error":"dial tcp 172.21.0.3:2380: connect: connection refused"}
{"level":"warn","ts":"2021-06-07T06:43:48.505Z","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"be99668240cb5dc7","rtt":"0s","error":"dial tcp 172.21.0.3:2380: connect: connection refused"}
{"level":"warn","ts":"2021-06-07T06:43:48.505Z","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"be99668240cb5dc7","rtt":"0s","error":"dial tcp 172.21.0.3:2380: connect: connection refused"}
{"level":"warn","ts":"2021-06-07T06:43:53.505Z","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"be99668240cb5dc7","rtt":"0s","error":"dial tcp 172.21.0.3:2380: connect: connection refused"}
{"level":"warn","ts":"2021-06-07T06:43:53.505Z","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"be99668240cb5dc7","rtt":"0s","error":"dial tcp 172.21.0.3:2380: connect: connection refused"}
{"level":"warn","ts":"2021-06-07T06:43:58.505Z","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"be99668240cb5dc7","rtt":"0s","error":"dial tcp 172.21.0.3:2380: connect: connection refused"}
{"level":"warn","ts":"2021-06-07T06:43:58.506Z","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"be99668240cb5dc7","rtt":"0s","error":"dial tcp 172.21.0.3:2380: connect: connection refused"}
{"level":"warn","ts":"2021-06-07T06:44:03.505Z","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"be99668240cb5dc7","rtt":"0s","error":"dial tcp 172.21.0.3:2380: connect: connection refused"}
{"level":"warn","ts":"2021-06-07T06:44:03.506Z","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"be99668240cb5dc7","rtt":"0s","error":"dial tcp 172.21.0.3:2380: connect: connection refused"}
{"level":"warn","ts":"2021-06-07T06:44:08.505Z","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"be99668240cb5dc7","rtt":"0s","error":"dial tcp 172.21.0.3:2380: connect: connection refused"}
{"level":"warn","ts":"2021-06-07T06:44:08.506Z","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"be99668240cb5dc7","rtt":"0s","error":"dial tcp 172.21.0.3:2380: connect: connection refused"}

@iwilltry42
Copy link
Member

@renepardon , I'll follow up on the similar issue in #619 👍

@moio
Copy link

moio commented Mar 30, 2023

@iwilltry42 I can still reproduce that IPs are occasionally shuffled on restart on k3d 5.4.9 - k3s v1.24.12+k3s1 (especially after a host reboot).

Does it help if I open a new issue?

@iwilltry42 iwilltry42 moved this to Done in Networking Sep 27, 2023
@p-se
Copy link

p-se commented Jun 25, 2024

@iwilltry42 I can still reproduce that IPs are occasionally shuffled on restart on k3d 5.4.9 - k3s v1.24.12+k3s1 (especially after a host reboot).

Does it help if I open a new issue?

I'm experiencing the same issue. After rebooting the host, the cluster becomes irrecoverably lost. k3d v5.6.3 - k3s v1.28.8-k3s1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request priority/high runtime Issue with the container runtime (docker)
Projects
Status: Done
Development

No branches or pull requests

6 participants