[MacOS-intel] Ports are not reachable after a docker restart #2824

Open
giejay opened this issue Jul 5, 2022 · 9 comments
Labels
area/provider/docker Issues or PRs related to docker kind/bug Categorizes issue or PR as related to a bug.

Comments

@giejay

giejay commented Jul 5, 2022

What happened:
When I restart Docker (or the MacBook itself), the ports I configured in the extraPortMappings section are no longer reachable from the host.

What you expected to happen:
Ports should remain exposed after a Docker restart.

How to reproduce it (as minimally and precisely as possible):
I run a Cassandra StatefulSet with a Service that exposes port 9042 on NodePort 30000:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
  labels:
    app: cassandra
spec:
  serviceName: cassandra
  replicas: 1
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      terminationGracePeriodSeconds: 1800
      containers:
        - name: cassandra
          image: cassandra:3.0.21
          imagePullPolicy: Always
          ports:
            - containerPort: 7000
              name: intra-node
            - containerPort: 7001
              name: tls-intra-node
            - containerPort: 7199
              name: jmx
            - containerPort: 9042
              name: cql
          resources:
            limits:
              cpu: "2000m"
              memory: 6Gi
            requests:
              cpu: "1000m"
              memory: 2Gi
          securityContext:
            capabilities:
              add:
                - IPC_LOCK
          lifecycle:
            preStop:
              exec:
                command:
                  - /bin/sh
                  - -c
                  - nodetool drain
          env:
            - name: MAX_HEAP_SIZE
              value: 512M
            - name: HEAP_NEWSIZE
              value: 100M
            - name: CASSANDRA_SEEDS
              value: "cassandra-0.cassandra.default.svc.cluster.local"
            - name: CASSANDRA_CLUSTER_NAME
              value: "K8Demo"
            - name: CASSANDRA_DC
              value: "DC1-K8Demo"
            - name: CASSANDRA_RACK
              value: "Rack1-K8Demo"
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: cassandra
  name: cassandra
spec:
  type: NodePort
  ports:
    - nodePort: 30000
      port: 9042
      protocol: TCP
      targetPort: 9042
  selector:
    app: cassandra

That NodePort is then mapped back to port 9042 on the host:

apiVersion: kind.x-k8s.io/v1alpha4
kind: Cluster
nodes:
  - role: control-plane
  - role: worker
    extraPortMappings:
      # cassandra
      - containerPort: 30000
        hostPort: 9042

Anything else we need to know?:
After a Docker restart, curl commands and my DevCenter time out on host port 9042. Before the restart, everything is fine. The pods restart successfully after the Docker restart. When I exec into the Cassandra pod, I can curl the internal 9042 port just fine.
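
For anyone trying to reproduce this, the host-side symptom can be checked with a small script instead of curl or DevCenter. This is a minimal sketch, assuming the hostPort 9042 mapping from the config above; the function name port_reachable is made up for illustration. Note that a successful connect only shows that something accepts TCP connections on the host port, not that Cassandra itself answers.

```python
import socket


def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 9042 is the hostPort from the extraPortMappings section (assumption:
    # the mapping is published on the loopback-reachable host interface).
    print("host port 9042 reachable:", port_reachable("127.0.0.1", 9042))
```

Running it once before and once after the Docker restart should show whether the host-side mapping is the part that stops working.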

Environment:
MacOS Monterey

  • kind version: (use kind version): kind v0.14.0 go1.18.2 darwin/amd64

  • Kubernetes version: (use kubectl version):

Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.2", GitCommit:"f66044f4361b9f1f96f0053dd46cb7dce5e990a8", GitTreeState:"clean", BuildDate:"2022-06-15T14:22:29Z", GoVersion:"go1.18.3", Compiler:"gc", Platform:"darwin/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.0", GitCommit:"4ce5a8954017644c5420bae81d72b09b735c21f0", GitTreeState:"clean", BuildDate:"2022-05-19T15:39:43Z", GoVersion:"go1.18.1", Compiler:"gc", Platform:"linux/amd64"}
  • Docker version: (use docker info):
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Build with BuildKit (Docker Inc., v0.6.1-docker)
  compose: Docker Compose (Docker Inc., v2.0.0-rc.1)
  scan: Docker Scan (Docker Inc., v0.8.0)

Server:
 Containers: 5
  Running: 2
  Paused: 0
  Stopped: 3
 Images: 44
 Server Version: 20.10.8
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: e25210fe30a0a703442421b0f60afac609f950a3
 runc version: v1.0.1-0-g4144b63
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 5.10.47-linuxkit
 Operating System: Docker Desktop
 OSType: linux
 Architecture: x86_64
 CPUs: 10
 Total Memory: 13.67GiB
 Name: docker-desktop
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 HTTP Proxy: http.docker.internal:3128
 HTTPS Proxy: http.docker.internal:3128
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
  • OS (e.g. from /etc/os-release):
    MacOS Monterey
giejay added the kind/bug label Jul 5, 2022
@tarikmehinagic

I'm having a similar issue on Linux. Did you try to create new pods? Creating new pods also doesn't work for me after a Docker restart.
Once the Docker service has gone down, whether from manually restarting Docker or rebooting my machine, the cluster is basically dead.
I cannot create new pods and ingress is not working, but if I port-forward services they work just fine.

@giejay
Author

giejay commented Jul 5, 2022

Indeed. Applying a new yaml with for example an Oracle container does nothing. Also removing the current Cassandra pod is not working. Only a cluster delete and create will fix it.
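
The delete-and-create workaround can be scripted. This is a rough sketch, not a fix: it assumes the cluster uses kind's default name "kind" and that the cluster config is saved as kind-config.yaml (a hypothetical filename). It throws away all cluster state, so the manifests have to be re-applied afterwards.

```python
import subprocess
from typing import List


def recreate_commands(config_path: str, name: str = "kind") -> List[List[str]]:
    """Build the kind CLI invocations that delete and recreate a cluster."""
    return [
        ["kind", "delete", "cluster", "--name", name],
        ["kind", "create", "cluster", "--name", name, "--config", config_path],
    ]


def recreate_cluster(config_path: str, name: str = "kind") -> None:
    """Run the commands in order; raises CalledProcessError if one fails."""
    for cmd in recreate_commands(config_path, name):
        subprocess.run(cmd, check=True)


# Usage (hypothetical config path; destroys the existing cluster):
# recreate_cluster("kind-config.yaml")
```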

@tarikmehinagic

Indeed. Applying a new yaml with for example an Oracle container does nothing. Also removing the current Cassandra pod is not working. Only a cluster delete and create will fix it.

I've seen more people with the same issue; hopefully it'll be fixed soon.

@BenTheElder
Member

Try the latest code at HEAD: you can clone the repo, run make build, then use ./bin/kind

There's a major fix for reboot networking within the cluster coming in the next release.

However the port forwards from the host to the container are up to docker.
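
To see what Docker itself reports for the node container after a restart, the output of docker port can be inspected. This is a sketch under assumptions: the container name kind-worker is what kind would normally call the worker node of a cluster with the default name, so confirm it with docker ps first. The helper names parse_docker_port and published_ports are made up for illustration.

```python
import subprocess
from typing import List, Tuple

Mapping = Tuple[int, str, str, int]  # (container_port, protocol, host_ip, host_port)


def parse_docker_port(output: str) -> List[Mapping]:
    """Parse `docker port <container>` output lines into tuples."""
    mappings = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Expected line shape: "30000/tcp -> 0.0.0.0:9042"
        container_side, _, host_side = line.partition(" -> ")
        port_str, _, proto = container_side.partition("/")
        # rpartition keeps bracketed IPv6 hosts such as "[::]" intact.
        host_ip, _, host_port = host_side.rpartition(":")
        mappings.append((int(port_str), proto, host_ip, int(host_port)))
    return mappings


def published_ports(container: str) -> List[Mapping]:
    """Ask Docker which ports it currently publishes for a container."""
    result = subprocess.run(
        ["docker", "port", container],
        capture_output=True, text=True, check=True,
    )
    return parse_docker_port(result.stdout)


# Usage (assumed container name):
# print(published_ports("kind-worker"))
```

If the 30000 to 9042 mapping is still listed here but the host port does not respond, that points at Docker's forwarding rather than at anything inside the cluster.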

Also, it is extremely unusual to need a single control plane and worker node for testing. I recommend using a single node instead for now, which has a better chance of not breaking until the fix is released.

If you search for multi-node reboot in the issue tracker you can find past discussions if you're curious. But the patch for those is in the latest sources.

I'm on vacation until late July but may intermittently pop up. Antonio is also out this week and next I think.

@giejay
Author

giejay commented Jul 5, 2022

@BenTheElder I will try to build it using the latest HEAD, thanks.

What I don't understand from your comment: "Also, it is extremely unusual to need a single control plane and worker node for testing. I recommend using a single node instead for now, which has a better chance of not breaking until the fix is released." How do I simply "use a single node"? I thought I was using a single node ;) Or do you mean I should use minikube and not use kind at all?

apiVersion: kind.x-k8s.io/v1alpha4
kind: Cluster
nodes:
  - role: control-plane
  - role: worker
    extraPortMappings:
      # cassandra
      - containerPort: 30000
        hostPort: 9042

@BenTheElder
Member

How do I simply "use a single node"? I thought I was using a single node;)

Your config file has two nodes:

apiVersion: kind.x-k8s.io/v1alpha4
kind: Cluster
nodes:
  - role: control-plane
  - role: worker
    extraPortMappings:
      # cassandra
      - containerPort: 30000
        hostPort: 9042

A single-node equivalent would be:

apiVersion: kind.x-k8s.io/v1alpha4
kind: Cluster
nodes:
  - extraPortMappings:
      # cassandra
      - containerPort: 30000
        hostPort: 9042

@BenTheElder
Member

We just released v0.15, which has some important changes related to restarts.

@giejay
Author

giejay commented Sep 2, 2022

Thanks for the heads-up @BenTheElder, I will check it out

BenTheElder added the area/provider/docker label Sep 9, 2022
@BenTheElder
Member

Is this still an issue? I suspect it was a problem in docker but ...

3 participants