This repository has been archived by the owner on Jan 11, 2023. It is now read-only.

Pod unable to attach PV after being deleted (Wait for attach expect device path as a lun number, instead got: /dev/disk/azure/scsi1/lun0 (strconv.Atoi: parsing "/dev/disk/azure/scsi1/lun0": invalid syntax) #2906

Closed
ghost opened this issue May 10, 2018 · 10 comments

Comments

@ghost

ghost commented May 10, 2018

Is this a request for help?:
Yes

Is this an ISSUE or FEATURE REQUEST? (choose one):
Issue

What version of acs-engine?:
v0.15.1

Orchestrator and version (e.g. Kubernetes, DC/OS, Swarm)
Kubernetes

What happened:
The Jenkins pod was misbehaving, so I deleted it manually, something I've done many times before on other clusters. When the replacement pod (under a new name) tried to start, I saw the following errors:

  Warning  FailedMount  12m (x210 over 8h)  kubelet, k8s-linuxpool1-10932256-4  Unable to mount volumes for pod "jenkins-5886fd98c-wnj4w_jenkins(8d4f7901-5411-11e8-8aef-000d3a338ebb)": timeout expired waiting for volumes to attach or mount for pod "jenkins"/"jenkins-5886fd98c-wnj4w". list of unmounted volumes=[jenkins-home]. list of unattached volumes=[jenkins-config plugin-dir secrets-dir jenkins-home jenkins-token-vb2tl]
  Warning  FailedMount  1m (x250 over 8h)   kubelet, k8s-linuxpool1-10932256-4  MountVolume.WaitForAttach failed for volume "pvc-4f3c835c-3d41-11e8-8112-000d3a338ebb" : azureDisk - Wait for attach expect device path as a lun number, instead got: /dev/disk/azure/scsi1/lun0 (strconv.Atoi: parsing "/dev/disk/azure/scsi1/lun0": invalid syntax)

What you expected to happen:
Pod to restart and attach to the PV.

How to reproduce it (as minimally and precisely as possible):
Deploy a pod using a PV. Delete the pod.

Anything else we need to know:
@andyzhangx Sounds like this may be related to andyzhangx/kubernetes@e287390 ?

kubectl get no -o yaml | grep "volumesAttached"
    volumesAttached:
    volumesAttached:
    volumesAttached:
    volumesAttached:
I0423 21:55:12.791570       1 attacher.go:145] azureDisk - VolumesAreAttached: check volume "dfs-cluster-group-dev-dynamic-pvc-7569faa7-473f-11e8-8aef-000d3a338ebb" (specName: "pvc-7569faa7-473f-11e8-8aef-000d3a338ebb") is no longer attached
I0423 21:55:12.830074       1 attacher.go:145] azureDisk - VolumesAreAttached: check volume "dfs-cluster-group-dev-dynamic-pvc-757de46a-473f-11e8-8aef-000d3a338ebb" (specName: "pvc-757de46a-473f-11e8-8aef-000d3a338ebb") is no longer attached
I0423 21:55:22.117927       1 attacher.go:291] azureDisk - disk:https://ds96201c813c2e11e8935d0.blob.core.windows.net/vhds/dfs-cluster-group-dev-dynamic-pvc-757de46a-473f-11e8-8aef-000d3a338ebb.vhd was detached from node:k8s-linuxpool1-10932256-1
I0423 21:55:36.414797       1 attacher.go:291] azureDisk - disk:https://ds96201c813c2e11e8935d0.blob.core.windows.net/vhds/dfs-cluster-group-dev-dynamic-pvc-7569faa7-473f-11e8-8aef-000d3a338ebb.vhd was detached from node:k8s-linuxpool1-10932256-3
I0425 03:31:55.273343       1 attacher.go:291] azureDisk - disk:https://ds96201c813c2e11e8935d0.blob.core.windows.net/vhds/dfs-cluster-group-dev-dynamic-pvc-66efdc72-4838-11e8-8aef-000d3a338ebb.vhd was detached from node:k8s-linuxpool1-10932256-2
I0425 03:43:42.690322       1 attacher.go:291] azureDisk - disk:https://ds96201c813c2e11e8935d0.blob.core.windows.net/vhds/dfs-cluster-group-dev-dynamic-pvc-1009cdcb-483a-11e8-8aef-000d3a338ebb.vhd was detached from node:k8s-linuxpool1-10932256-1
I0425 04:05:33.645047       1 attacher.go:291] azureDisk - disk:https://ds96201c813c2e11e8935d0.blob.core.windows.net/vhds/dfs-cluster-group-dev-dynamic-pvc-1d5bd2fd-483d-11e8-8aef-000d3a338ebb.vhd was detached from node:k8s-linuxpool1-10932256-2
I0425 04:39:48.223383       1 attacher.go:291] azureDisk - disk:https://ds96201c813c2e11e8935d0.blob.core.windows.net/vhds/dfs-cluster-group-dev-dynamic-pvc-82150073-483f-11e8-8aef-000d3a338ebb.vhd was detached from node:k8s-linuxpool1-10932256-2
I0425 04:57:00.824960       1 attacher.go:291] azureDisk - disk:https://ds96201c813c2e11e8935d0.blob.core.windows.net/vhds/dfs-cluster-group-dev-dynamic-pvc-83338560-4843-11e8-8aef-000d3a338ebb.vhd was detached from node:k8s-linuxpool1-10932256-1
I0425 05:11:57.622787       1 attacher.go:291] azureDisk - disk:https://ds96201c813c2e11e8935d0.blob.core.windows.net/vhds/dfs-cluster-group-dev-dynamic-pvc-a374abc1-4845-11e8-8aef-000d3a338ebb.vhd was detached from node:k8s-linuxpool1-10932256-1
I0425 05:38:56.179301       1 attacher.go:291] azureDisk - disk:https://ds96201c813c2e11e8935d0.blob.core.windows.net/vhds/dfs-cluster-group-dev-dynamic-pvc-c5928551-4847-11e8-8aef-000d3a338ebb.vhd was detached from node:k8s-linuxpool1-10932256-1
I0425 06:33:21.295764       1 attacher.go:145] azureDisk - VolumesAreAttached: check volume "dfs-cluster-group-dev-dynamic-pvc-285eddc7-484b-11e8-8aef-000d3a338ebb" (specName: "pvc-285eddc7-484b-11e8-8aef-000d3a338ebb") is no longer attached
I0425 06:33:51.864776       1 attacher.go:291] azureDisk - disk:https://ds96201c813c2e11e8935d0.blob.core.windows.net/vhds/dfs-cluster-group-dev-dynamic-pvc-285eddc7-484b-11e8-8aef-000d3a338ebb.vhd was detached from node:k8s-linuxpool1-10932256-2
I0425 07:02:05.935951       1 attacher.go:291] azureDisk - disk:https://ds96201c813c2e11e8935d0.blob.core.windows.net/vhds/dfs-cluster-group-dev-dynamic-pvc-b84a3ce1-4853-11e8-8aef-000d3a338ebb.vhd was detached from node:k8s-linuxpool1-10932256-1
I0425 07:40:55.479650       1 attacher.go:291] azureDisk - disk:https://ds96201c813c2e11e8935d0.blob.core.windows.net/vhds/dfs-cluster-group-dev-dynamic-pvc-c62f0816-4856-11e8-8aef-000d3a338ebb.vhd was detached from node:k8s-linuxpool1-10932256-1
I0425 07:50:16.905015       1 attacher.go:291] azureDisk - disk:https://ds96201c813c2e11e8935d0.blob.core.windows.net/vhds/dfs-cluster-group-dev-dynamic-pvc-3dffe090-485c-11e8-8aef-000d3a338ebb.vhd was detached from node:k8s-linuxpool1-10932256-1
I0425 20:50:58.448888       1 attacher.go:291] azureDisk - disk:https://ds96201c813c2e11e8935d0.blob.core.windows.net/vhds/dfs-cluster-group-dev-dynamic-pvc-d68e7b94-488a-11e8-8aef-000d3a338ebb.vhd was detached from node:k8s-linuxpool1-10932256-2
I0425 22:53:27.428831       1 attacher.go:291] azureDisk - disk:https://ds96201c813c2e11e8935d0.blob.core.windows.net/vhds/dfs-cluster-group-dev-dynamic-pvc-09b743b2-4742-11e8-8aef-000d3a338ebb.vhd was detached from node:k8s-linuxpool1-10932256-4
I0425 22:53:30.230460       1 attacher.go:145] azureDisk - VolumesAreAttached: check volume "dfs-cluster-group-dev-dynamic-pvc-09b7ec6f-4742-11e8-8aef-000d3a338ebb" (specName: "pvc-09b7ec6f-4742-11e8-8aef-000d3a338ebb") is no longer attached
I0425 22:53:36.210190       1 attacher.go:291] azureDisk - disk:https://ds96201c813c2e11e8935d0.blob.core.windows.net/vhds/dfs-cluster-group-dev-dynamic-pvc-09b7ec6f-4742-11e8-8aef-000d3a338ebb.vhd was detached from node:k8s-linuxpool1-10932256-3
I0502 17:37:06.424005       1 attacher.go:291] azureDisk - disk:https://ds96201c813c2e11e8935d0.blob.core.windows.net/vhds/dfs-cluster-group-dev-dynamic-pvc-a88c50a3-4e2e-11e8-8aef-000d3a338ebb.vhd was detached from node:k8s-linuxpool1-10932256-2
I0502 17:53:02.533871       1 attacher.go:291] azureDisk - disk:https://ds96201c813c2e11e8935d0.blob.core.windows.net/vhds/dfs-cluster-group-dev-dynamic-pvc-010ebd96-48ca-11e8-8aef-000d3a338ebb.vhd was detached from node:k8s-linuxpool1-10932256-3
I0508 22:42:59.578060       1 attacher.go:291] azureDisk - disk:https://ds96201c813c2e11e8935d0.blob.core.windows.net/vhds/dfs-cluster-group-dev-dynamic-pvc-feda0e12-5310-11e8-8aef-000d3a338ebb.vhd was detached from node:k8s-linuxpool1-10932256-1
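For context, the parse failure in the title can be reproduced in isolation: strconv.Atoi accepts a bare LUN number such as "0" but rejects the device-path form "/dev/disk/azure/scsi1/lun0". The sketch below uses a hypothetical parseLUN helper to illustrate a tolerant parser that accepts both forms; it is not the actual code from andyzhangx/kubernetes@e287390.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseLUN is a hypothetical helper (not the actual Kubernetes fix) that
// accepts either form of devicePath seen in the errors above.
func parseLUN(devicePath string) (int, error) {
	// Bare LUN number, e.g. "0" — the form WaitForAttach expects.
	if lun, err := strconv.Atoi(devicePath); err == nil {
		return lun, nil
	}
	// Full by-id path, e.g. "/dev/disk/azure/scsi1/lun0" — the form that
	// triggers the "invalid syntax" error in v1.10.0/v1.10.1.
	const marker = "/lun"
	if i := strings.LastIndex(devicePath, marker); i >= 0 {
		return strconv.Atoi(devicePath[i+len(marker):])
	}
	return -1, fmt.Errorf("cannot parse LUN from %q", devicePath)
}

func main() {
	for _, p := range []string{"0", "/dev/disk/azure/scsi1/lun0", "/dev/disk/azure/scsi1/lun3"} {
		lun, err := parseLUN(p)
		fmt.Println(p, "->", lun, err)
	}
}
```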
@andyzhangx
Contributor

@ghost ghost closed this as completed May 11, 2018
@huydinhle

How would I fix this if I have running pods whose volumes need to be remounted? @andyzhangx

@andyzhangx
Contributor

@huydinhle Fix which issue? Users usually don't run pods directly; most use a Deployment or StatefulSet. In that case you can run kubectl delete po POD-NAME, a new pod will be created, and the pod's disk will be remounted.

@huydinhle

I tried that multiple times, and my pods keep giving me the exact same errors. @andyzhangx

Should I just keep deleting the pods until one comes back successfully? I have done this 5 times so far.

@andyzhangx
Contributor

@huydinhle
What's your k8s version? Is the error like the following:

Pod unable to attach PV after being deleted (Wait for attach expect device path as a lun number, instead got: /dev/disk/azure/scsi1/lun0 (strconv.Atoi: parsing "/dev/disk/azure/scsi1/lun0": invalid syntax)

@huydinhle

huydinhle commented Nov 21, 2018

k8s version is 1.10.1, and here are the pod's error messages:

  Type     Reason                 Age               From                        Message
  ----     ------                 ----              ----                        -------
  Normal   Scheduled              8m                default-scheduler           Successfully assigned es-data-elasticsearch-cluster-default-1 to k8s-db-37692245-1
  Normal   SuccessfulMountVolume  8m                kubelet, k8s-db-37692245-1  MountVolume.SetUp succeeded for volume "es-certs-elasticsearch-cluster"
  Normal   SuccessfulMountVolume  8m                kubelet, k8s-db-37692245-1  MountVolume.SetUp succeeded for volume "default-token-sj5cr"
  Warning  FailedMount            1m (x11 over 8m)  kubelet, k8s-db-37692245-1  MountVolume.WaitForAttach failed for volume "pvc-fbb9b8ee-e3b3-11e8-a365-000d3a06b22f" : azureDisk - Wait for attach expect device path as a lun number, instead got:  (strconv.Atoi: parsing "": invalid syntax)
  Warning  FailedMount            1m (x3 over 6m)   kubelet, k8s-db-37692245-1  Unable to mount volumes for pod "es-data-elasticsearch-cluster-default-1_logging(9cdd0df8-ed2f-11e8-a1e3-000d3a06b721)": timeout expired waiting for volumes to attach or mount for pod "logging"/"es-data-elasticsearch-cluster-default-1". list of unmounted volumes=[es-data]. list of unattached volumes=[es-data es-certs-elasticsearch-cluster default-token-sj5cr]


@andyzhangx 

@andyzhangx
Contributor

@huydinhle Unfortunately there is no workaround for this issue. Only v1.10.0 and v1.10.1 are affected; upgrading to any other version fixes it.

@huydinhle

Thank you for your quick reply @andyzhangx. We use acs-engine to bootstrap our cluster. Would this upgrade just be a version change for our apiserver, or do we need to change anything else? We are only considering the upgrade route because it is a patch version change and seems minimal to perform. But this is our production cluster, so we want to be careful when operating on our control plane.

@andyzhangx
Contributor

@huydinhle I think you should upgrade to v1.10.2 or higher, following:
https://github.com/Azure/acs-engine/blob/master/examples/k8s-upgrade/README.md

The upgrade will set up new agent nodes, run kubectl drain ..., and move the original workloads to the new nodes one by one.

@jackfrancis Is this upgrade functionality production ready? I will leave this question to our tech lead.

@jackfrancis
Member

@huydinhle I would advise doing the following before upgrading your cluster for the first time:

  1. Create a new cluster using (1) the same version of acs-engine that you built your current production cluster with, and (2) the same api model (with a different DNS prefix, etc.)
  2. Using the latest released version of acs-engine, follow the upgrade README instructions above and upgrade that new cluster to v1.10.9
  3. Validate the cluster after upgrade

If the cluster upgrade succeeds, and the cluster appears operational for the workloads you use, then you have some assurance that acs-engine upgrade will work for your production cluster.

The short story is: because acs-engine is so user-configurable, we don't test every possible api model configuration for upgrade against all released versions of acs-engine. It's wise to treat this procedure as something that may not succeed, and validate it using a "staging environment" strategy.
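The staging run above boils down to one acs-engine upgrade invocation. A sketch of it follows, built and echoed as a dry run; all values are placeholders, the flag names follow the upgrade README linked earlier but have varied between acs-engine releases (some older releases used a deployment-dir flag instead of an api-model flag), and authentication flags (client id/secret) are omitted, so verify against the README for your version before running.

```shell
# Hypothetical placeholder values — replace with your own.
SUBSCRIPTION_ID="00000000-0000-0000-0000-000000000000"
RESOURCE_GROUP="my-staging-rg"
LOCATION="westus2"
API_MODEL="_output/mystagingcluster/apimodel.json"
TARGET_VERSION="1.10.9"

# Build the upgrade invocation; echoed as a dry run — execute it only after
# checking the flag names against the README for your acs-engine release.
UPGRADE_CMD="acs-engine upgrade --subscription-id ${SUBSCRIPTION_ID} --resource-group ${RESOURCE_GROUP} --location ${LOCATION} --api-model ${API_MODEL} --upgrade-version ${TARGET_VERSION}"

echo "${UPGRADE_CMD}"
```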

This issue was closed.