Skip to content

Commit

Permalink
Merge pull request #60346 from andyzhangx/fix-devname-change
Browse files Browse the repository at this point in the history
Automatic merge from submit-queue (batch tested with PRs 60346, 60135, 60289, 59643, 52640). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

fix device name change issue for azure disk

**What this PR does / why we need it**:
fix device name change issue for azure disk due to default host cache setting changed from None to ReadWrite from v1.7, and default host cache setting in azure portal is `None`

**Which issue(s) this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged)*:
Fixes #60344, #57444
also fixes following issues:
Azure/acs-engine#1918
Azure/AKS#201

**Special notes for your reviewer**:
From v1.7, default host cache setting changed from None to ReadWrite, this would lead to device name change after attach multiple disks on azure vm, finally lead to disk unaccessiable from pod.
For an example:
statefulset with 8 replicas(each with an azure disk) on one node will always fail, according to my observation, add the 6th data disk will always make dev name change, some pod could not access data disk after that.

I have verified this fix on v1.8.4
Without this PR on one node(dev name changes):
```
azureuser@k8s-agentpool2-40588258-0:~$ tree /dev/disk/azure
...
└── scsi1
    ├── lun0 -> ../../../sdk
    ├── lun1 -> ../../../sdj
    ├── lun2 -> ../../../sde
    ├── lun3 -> ../../../sdf
    ├── lun4 -> ../../../sdg
    ├── lun5 -> ../../../sdh
    └── lun6 -> ../../../sdi
```

With this PR on one node(no dev name change):
```
azureuser@k8s-agentpool2-40588258-1:~$ tree /dev/disk/azure
...
└── scsi1
    ├── lun0 -> ../../../sdc
    ├── lun1 -> ../../../sdd
    ├── lun2 -> ../../../sde
    ├── lun3 -> ../../../sdf
    ├── lun5 -> ../../../sdh
    └── lun6 -> ../../../sdi
```

Following `myvm-0`, `myvm-1` is crashing due to dev name change, after controller manager replacement, myvm2-x  pods work well.

```
Every 2.0s: kubectl get po                                                                                                                                                   Sat Feb 24 04:16:26 2018

NAME      READY     STATUS             RESTARTS   AGE
myvm-0    0/1       CrashLoopBackOff   13         41m
myvm-1    0/1       CrashLoopBackOff   11         38m
myvm-2    1/1       Running            0          35m
myvm-3    1/1       Running            0          33m
myvm-4    1/1       Running            0          31m
myvm-5    1/1       Running            0          29m
myvm-6    1/1       Running            0          26m

myvm2-0   1/1       Running            0          17m
myvm2-1   1/1       Running            0          14m
myvm2-2   1/1       Running            0          12m
myvm2-3   1/1       Running            0          10m
myvm2-4   1/1       Running            0          8m
myvm2-5   1/1       Running            0          5m
myvm2-6   1/1       Running            0          3m
```

**Release note**:

```
fix device name change issue for azure disk
```
/assign @karataliu 
/sig azure
@feiskyer  could you mark it as v1.10 milestone?
@brendandburns @khenidak @rootfs @jdumars FYI

Since it's a critical bug, I will cherry pick this fix to v1.7-v1.9, note that v1.6 does not have this issue since default cachingmode is `None`
  • Loading branch information
Kubernetes Submit Queue authored Feb 25, 2018
2 parents be2e702 + c3e8f68 commit 71c2135
Showing 1 changed file with 5 additions and 4 deletions.
9 changes: 5 additions & 4 deletions pkg/volume/azure_dd/azure_common.go
Original file line number Diff line number Diff line change
Expand Up @@ -35,9 +35,10 @@ import (
)

const (
defaultFSType = "ext4"
defaultStorageAccountType = storage.StandardLRS
defaultAzureDiskKind = v1.AzureSharedBlobDisk
defaultFSType = "ext4"
defaultStorageAccountType = storage.StandardLRS
defaultAzureDiskKind = v1.AzureSharedBlobDisk
defaultAzureDataDiskCachingMode = v1.AzureDataDiskCachingNone
)

type dataDisk struct {
Expand Down Expand Up @@ -141,7 +142,7 @@ func normalizeStorageAccountType(storageAccountType string) (storage.SkuName, er

func normalizeCachingMode(cachingMode v1.AzureDataDiskCachingMode) (v1.AzureDataDiskCachingMode, error) {
if cachingMode == "" {
return v1.AzureDataDiskCachingReadWrite, nil
return defaultAzureDataDiskCachingMode, nil
}

if !supportedCachingModes.Has(string(cachingMode)) {
Expand Down

0 comments on commit 71c2135

Please sign in to comment.