
Timeout is_disk_connected = wait_for_connected_disk(600) #16

Open · s4kro opened this issue Aug 13, 2019 · 6 comments

s4kro commented Aug 13, 2019

Greetings!

I have a problem with attach.py and wait_for_connected_disk. It looks like the monitor created by pyudev's Monitor.from_netlink never receives an event before the timeout expires here:

context = pyudev.Context()
monitor = pyudev.Monitor.from_netlink(context)
monitor.filter_by(subsystem='block', device_type='disk')

Could you also show me what a sample output looks like here, e.g. via print(result)?

result = []
for device in iter(partial(monitor.poll, timeout), None):
    if device.action == 'add':
        result = [device.device_node, 'connected']
        break
    elif device.action == 'remove':
        result = [device.device_node, 'disconnected']
        break
return result
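
(For reference, a self-contained sketch of this wait loop, assuming pyudev is installed, that can be run directly on the node while attaching a disk. On an add event it returns something like ['/dev/sdd', 'connected']; if the timeout expires, it returns an empty list:)

from functools import partial

import pyudev

def wait_for_connected_disk(timeout):
    """Wait up to `timeout` seconds for a disk to be added or removed."""
    context = pyudev.Context()
    monitor = pyudev.Monitor.from_netlink(context)
    monitor.filter_by(subsystem='block', device_type='disk')

    result = []
    # monitor.poll() returns None once the timeout expires, which ends the loop.
    for device in iter(partial(monitor.poll, timeout), None):
        if device.action == 'add':
            result = [device.device_node, 'connected']    # e.g. ['/dev/sdd', 'connected']
            break
        elif device.action == 'remove':
            result = [device.device_node, 'disconnected']
            break
    return result  # [] if no add/remove event arrived in time

if __name__ == '__main__':
    print(wait_for_connected_disk(600))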

I tried increasing the timeout to 1000 s, but it still doesn't work; I'm not really sure how pyudev's monitor works.
The rest works as intended: I can see the app create the disk and attach/detach it to the VM, and the disk also shows up in fdisk -l:

Disk /dev/sdd: 1073 MB, 1073741824 bytes, 2097152 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

But the script can't continue, because wait_for_connected_disk keeps reaching its timeout.
Maybe this method needs to be modernized?

dzolnierz self-assigned this Aug 13, 2019

dzolnierz commented Aug 13, 2019

Hi,

are you attaching volumes by invoking vcloud-flexvolume directly, or via Kubernetes?
If the latter, are your kubelets running with --enable-controller-attach-detach=false? If not, the controller attaches the disk to a random node (successfully), and the whole pyudev logic also runs on the controller. That will never succeed, because the kernel emits udev events on the target node where the disk was attached, not on the controller.

The same applies when invoking vcloud-flexvolume directly: it should be invoked on the very same node you want to attach the disk to.

s4kro commented Aug 14, 2019

Thanks for the reply!
I tried to invoke the flexvolume both ways.
Should this option (--enable-controller-attach-detach=false) be set on the kubemaster (that's what I did), or on the kubeworkers too?
If I need to attach a disk to kubeworker A, do I need to execute vcloud-flexvolume attach directly on that node? If so, do I need to install the flexvolume on all kubeworkers? I can't invoke the script on the workers, because etcd runs only on the master and the script tries to connect there; maybe I should create a NodePort service for etcd?
I also tried to mock is_disk_connected with some random variable value, but without result, because I don't know what the output of the code below looks like.

result = []
for device in iter(partial(monitor.poll, timeout), None):
    if device.action == 'add':
        result = [device.device_node, 'connected']
        break
    elif device.action == 'remove':
        result = [device.device_node, 'disconnected']
        break
return result

s4kro commented Aug 14, 2019

I fixed attach.py here:

cmd_create_partition = ("echo -n ',,83;' | sfdisk %s") % (device_name)

It now works without the "-n". Back in k8s, after deploying the nginx example pod, I got:

Unable to mount volumes for pod "nginx-vcloud_default(fa012f2f-be98-11e9-8a60-005056011de7)": timeout expired waiting for volumes to attach or mount for pod "default"/"nginx-vcloud". list of unmounted volumes=[testdisk]. list of unattached volumes=[testdisk default-token-qbft4]

I also got this in the kubelet log:

desiredStateOfWorld. err=failed to get Plugin from volumeSpec for volume "testdisk" err=no volume plugin matched
Manifest:

apiVersion: v1
kind: Pod
metadata:
  name: nginx-vcloud
  namespace: default
spec:
  containers:
  - name: nginx-vcloud
    image: nginx
    volumeMounts:
    - name: testdisk
      mountPath: /data
    ports:
    - containerPort: 80
  volumes:
  - name: testdisk
    flexVolume:
      #driver: "sysoperator.pl/vcloud"
      driver: "answear.com/vcloud"
      fsType: "ext4"
      options:
        volumeName: "testdisk2"
        size: "1Gi"
        storage: "DC1-Kv-VSP-02-High"
        busType: "6"
        busSubType: "VirtualSCSI"
        mountoptions: "relatime,nobarrier"

dzolnierz commented:

> If I need to attach a disk to kubeworker A, do I need to execute vcloud-flexvolume attach directly on that node?

Yes.

> If so, do I need to install the flexvolume on all kubeworkers?

Yes.

> I can't invoke the script on the workers, because etcd runs only on the master and the script tries to connect there; maybe I should create a NodePort service for etcd?

The driver needs access to etcd for locking to work properly. It does not have to be the cluster's own etcd instance.
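
For illustration only, the locking pattern looks roughly like this; the python-etcd3 client and the endpoint name here are assumptions, not necessarily what the driver actually uses:

import etcd3

# Hypothetical endpoint; any reachable etcd works, not just the cluster's own.
client = etcd3.client(host='etcd.example.internal', port=2379)

# Serialize attach/detach operations across nodes with a named lock.
with client.lock('vcloud-flexvolume/attach', ttl=60):
    pass  # attach/detach work happens while the lock is held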

> cmd_create_partition = ("echo -n ',,83;' | sfdisk %s") % (device_name)

What Linux distribution do you use? Does echo come from the coreutils package? Show me the output of echo --help and lsb_release -a.

> I also got this in the kubelet log:
> desiredStateOfWorld. err=failed to get Plugin from volumeSpec for volume "testdisk" err=no volume plugin matched

Has the driver been installed correctly on every node? Has the kubelet process been restarted after installing the driver?
Show me ls -la /usr/libexec/kubernetes/kubelet-plugins/volume/exec/ on the node where the error comes from.

s4kro commented Aug 15, 2019

[root@dev-test-worker-1 examples]# ls -la /usr/libexec/kubernetes/kubelet-plugins/volume/exec/
total 0
drwxr-xr-x 3 root root 32 Aug 12 18:52 .
drwxr-xr-x 3 root root 18 Aug 12 18:52 ..
drwxr-xr-x 2 root root 20 Aug 12 18:59 answear.com~vcloud

It's CentOS 7.4.
I fixed this line:

cmd_create_partition = ("echo -n ',,83;' | sfdisk %s") % (device_name)

like this, and it works:

cmd_create_partition = ("echo ',,83;' | sfdisk %s") % (device_name)

I still have k8s troubles: I can't deploy the pod with the volume. The k8s version is 1.12.

> The driver needs access to etcd for locking to work properly. It does not have to be the cluster's own etcd instance.

Okay, got it. But if it can't reach etcd, the flexvolume won't work correctly. I tried to connect via NodePort, but no result.

dzolnierz commented:

> [root@dev-test-worker-1 examples]# ls -la /usr/libexec/kubernetes/kubelet-plugins/volume/exec/
> total 0
> drwxr-xr-x 3 root root 32 Aug 12 18:52 .
> drwxr-xr-x 3 root root 18 Aug 12 18:52 ..
> drwxr-xr-x 2 root root 20 Aug 12 18:59 answear.com~vcloud

Seems ok.
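
(For context, kubelet discovers FlexVolume drivers by mapping the driver name to a directory under the plugin path, which is why the listing above matches the driver: "answear.com/vcloud" in the manifest. A small sketch of the convention:)

# FlexVolume drivers live at <plugin-dir>/<vendor>~<driver>/<driver>.
driver = "answear.com/vcloud"
vendor, name = driver.split("/")
plugin_path = "/usr/libexec/kubernetes/kubelet-plugins/volume/exec/%s~%s/%s" % (vendor, name, name)
print(plugin_path)  # .../exec/answear.com~vcloud/vcloud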

> It's CentOS 7.4.
> I fixed this line:
> cmd_create_partition = ("echo -n ',,83;' | sfdisk %s") % (device_name)
> like this, and it works:
> cmd_create_partition = ("echo ',,83;' | sfdisk %s") % (device_name)

This was fixed in #18.
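
For illustration, a shell-free way to sidestep the echo -n portability issue entirely is to feed sfdisk through stdin from Python; this is just a sketch, not necessarily what #18 does:

import subprocess

def create_partition(device_name):
    # Write the partition spec straight to sfdisk's stdin, so the
    # builtin-echo vs. coreutils-echo difference never comes into play.
    subprocess.run(['sfdisk', device_name], input=b',,83;', check=True)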

> The driver needs access to etcd for locking to work properly. It does not have to be the cluster's own etcd instance.
>
> Okay, got it. But if it can't reach etcd, the flexvolume won't work correctly. I tried to connect via NodePort, but no result.

Could you post the whole log from the node's kubelet while it tries to attach the volume? On gist.github.com?
