Azuredisk container constantly failed after update from 1.29.1 #2314

Closed · ozlotusflare opened this issue May 3, 2024 · 1 comment · Fixed by #2315

ozlotusflare commented May 3, 2024

What happened:

After updating the controller from v1.29.1 to v1.30.1, we hit a problem where one container restarts constantly and eventually ends up in CrashLoopBackOff. We also tested v1.29.5, with the same result.

What you expected to happen:

The controller container stays in a working state after the upgrade.

How to reproduce it:

Upgrade the Helm chart from v1.29.1 to v1.30.1 (see the sketch below).
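
A minimal upgrade sketch using the upstream chart repo; the release name and namespace are assumptions based on the standard install instructions and may differ in your installation:

# Sketch only: release name, namespace, and chart version are taken from the upstream install docs; adjust as needed
helm repo add azuredisk-csi-driver https://raw.githubusercontent.com/kubernetes-sigs/azuredisk-csi-driver/master/charts
helm repo update
helm upgrade azuredisk-csi-driver azuredisk-csi-driver/azuredisk-csi-driver \
  --namespace kube-system --version v1.30.1 --reuse-values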

Anything else we need to know?:

Environment:

  • CSI Driver version: v1.29.5, v1.30.0 and v1.30.1
  • Kubernetes version (use kubectl version): 1.26
  • OS (e.g. from /etc/os-release): Ubuntu 22.04.2 LTS
  • Kernel (e.g. uname -a): 6.2.0-1017-azure
  • Install tools: Helm
  • Others:
# kubectl describe pod
....
  liveness-probe:
    Container ID:  containerd://24d24a66569cdb80d1b73affecb787ec27e2bec190ec4fc2d230d75e158539f1
    Image:         mcr.microsoft.com/oss/kubernetes-csi/livenessprobe:v2.12.0
    Image ID:      mcr.microsoft.com/oss/kubernetes-csi/livenessprobe@sha256:c762188c45d1b9bc9144b694b85313d5e49c741935a81d5b94fd7db978a40ae1
    Port:          <none>
    Host Port:     <none>
    Args:
      --csi-address=/csi/csi.sock
      --probe-timeout=3s
      --http-endpoint=localhost:29602
      --v=2
    State:          Running
      Started:      Fri, 03 May 2024 13:15:37 +0200
    Ready:          True
    Restart Count:  0
    Limits:
      memory:  128Mi
    Requests:
      cpu:        10m
      memory:     32Mi
    Environment:  <none>
    Mounts:
      /csi from socket-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-9mrk7 (ro)
  azuredisk:
    Container ID:  containerd://3192c0f76dc21829d15559c2a4f695a1663f8190e8b5d817abc0dc9a3ff9abd2
    Image:         mcr.microsoft.com/oss/kubernetes-csi/azuredisk-csi:v1.30.1
    Image ID:      mcr.microsoft.com/oss/kubernetes-csi/azuredisk-csi@sha256:d965e110bddfbc3d894220ce82487b087e03b3d39f1a412e77e0e129a801af3f
    Port:          29604/TCP
    Host Port:     0/TCP
    Args:
      --v=5
      --endpoint=$(CSI_ENDPOINT)
      --metrics-address=0.0.0.0:29604
      --disable-avset-nodes=true
      --vm-type=
      --drivername=disk.csi.azure.com
      --cloud-config-secret-name=azure-cloud-provider
      --cloud-config-secret-namespace=kube-system
      --custom-user-agent=
      --user-agent-suffix=OSS-helm
      --allow-empty-cloud-config=false
      --vmss-cache-ttl-seconds=-1
      --enable-traffic-manager=false
      --traffic-manager-port=7788
      --enable-otel-tracing=false
      --check-disk-lun-collision=true
    State:          Running
      Started:      Fri, 03 May 2024 13:18:36 +0200
    Last State:     Terminated
      Reason:       Error  # <--
      Exit Code:    2   # <--
# Also Events
...
 Warning  Unhealthy        73s (x5 over 3m13s)  kubelet            Liveness probe failed: Get "http://localhost:29602/healthz": dial tcp 127.0.0.1:29602: connect: connection refused
  Normal   Killing          73s                  kubelet            Container azuredisk failed liveness probe, will be restarted
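
To see why the azuredisk container exits with code 2, the logs of the previous (crashed) container instance can be pulled; the pod name below is a placeholder for the actual controller pod:

# Placeholder pod name; look it up with: kubectl -n kube-system get pods | grep csi-azuredisk-controller
kubectl -n kube-system logs <csi-azuredisk-controller-pod> -c azuredisk --previous
# Watch the restart counter climb toward CrashLoopBackOff
kubectl -n kube-system get pod <csi-azuredisk-controller-pod> -w
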
andyzhangx (Member) commented:

Have you set hostNetwork: true in the azure disk controller?
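
One way to check the current setting and, if needed, enable it through Helm; the deployment name is the chart default and the controller.hostNetwork key is an assumption that should be verified against the chart's values.yaml for your version:

# Check whether the controller deployment currently runs with hostNetwork
# (csi-azuredisk-controller is the chart's default deployment name; adjust if yours differs)
kubectl -n kube-system get deploy csi-azuredisk-controller -o jsonpath='{.spec.template.spec.hostNetwork}'

# Hypothetical values key; confirm against the chart's values.yaml before applying
helm upgrade azuredisk-csi-driver azuredisk-csi-driver/azuredisk-csi-driver \
  --namespace kube-system --reuse-values --set controller.hostNetwork=true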
