
[BUG] - unix /var/lib/kubelet/plugins/block.csi.vultr.com/csi.sock not accessible (for rook.io) #88

Open
defaultbranch opened this issue Jul 18, 2022 · 3 comments
Labels
bug Something isn't working

Comments

@defaultbranch

Describe the bug
Using rook.io, the rook-ceph-osd-prepare-* pods fail to set up a PersistentVolumeClaim.

"describe pod" finally reports the event (warning) "MapVolume.SetUpDevice failed for volume "pvc-c869a0057b0c4904" : kubernetes.io/csi: blockMapper.stageVolumeForBlock failed to check STAGE_UNSTAGE_VOLUME capability: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial unix /var/lib/kubelet/plugins/block.csi.vultr.com/csi.sock: connect: connection refused"" from the kubelet.

To Reproduce
Steps to reproduce the behavior:
(NOTE: This setup works fine on Azure AKS, with only the storageClassName adjusted.)

  1. Create a fresh Kubernetes cluster (one worker node is probably sufficient for reproduction)
  2. For the basic Rook setup: from the files in https://github.com/rook/rook/tree/master/deploy/examples, run kubectl apply -f for crds.yaml, common.yaml and operator.yaml. This creates the CRDs, Roles and a rook-ceph-operator deployment/pod
  3. Run kubectl apply -f for the following CephCluster yaml:
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    # NOTE: see cluster.yaml in <https://github.com/rook/rook.git> for up-to-date image version
    image: quay.io/ceph/ceph:v17.2.1
    allowUnsupported: false
  dataDirHostPath: /var/lib/rook
  skipUpgradeChecks: false
  continueUpgradeAfterChecksEvenIfNotHealthy: false
  waitTimeoutForHealthyOSDInMinutes: 10
  mon:
    count: 3
    allowMultiplePerNode: false
  mgr:
    count: 2
    allowMultiplePerNode: false
    modules:
      - name: pg_autoscaler
        enabled: true
  dashboard:
    enabled: true
    ssl: true
  storage:
    storageClassDeviceSets:
    - name: set1
      # NOTE: change this to the number of nodes that should host an OSD
      count: 1
      portable: false
      tuneDeviceClass: false
      encrypted: false
      volumeClaimTemplates:
      - metadata:
          name: data
        spec:
          storageClassName: vultr-block-storage-hdd
          accessModes:
            - ReadWriteOnce
          # NOTE: rook seems to expect a raw, unmounted device "volumeMode: Block"
          volumeMode: Block
          resources:
            requests:
              storage: 40Gi
  4. See the events in kubectl -n rook-ceph describe pod rook-ceph-osd-prepare-* after the pod(s) get stuck. (A minimal reproduction without Rook is sketched below.)
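Rook is not strictly needed to trigger the failure; any claim with volumeMode: Block should hit the same code path. A minimal standalone reproduction (a sketch; the object names are made up, and it assumes the same vultr-block-storage-hdd StorageClass as above):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: raw-block-test
spec:
  storageClassName: vultr-block-storage-hdd
  accessModes:
    - ReadWriteOnce
  # the same raw-device request that Rook makes
  volumeMode: Block
  resources:
    requests:
      storage: 40Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: raw-block-test-consumer
spec:
  containers:
    - name: shell
      image: busybox
      command: ["sleep", "3600"]
      # volumeDevices (not volumeMounts) is how a pod consumes a Block-mode PVC
      volumeDevices:
        - name: data
          devicePath: /dev/block-test
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: raw-block-test

If the driver supported volumeMode: Block, this pod would start and /dev/block-test would appear inside the container; as it stands, the pod should stay in ContainerCreating with a similar MapVolume/stage warning.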

Expected behavior
The rook-ceph-osd-prepare-* pods should disappear after a short time, and corresponding rook-ceph-osd-* pods (without -prepare-) should remain in their place.

Additional context
I was using Vultr Kubernetes Engine (VKE) with Kubernetes 1.23.x.

@ddymko
Contributor

ddymko commented Jul 18, 2022

@defaultbranch

# NOTE: rook seems to expect a raw, unmounted device "volumeMode: Block"
volumeMode: Block

IIRC the vultr-csi doesn't support a raw, unmounted device.

https://github.com/vultr/vultr-csi/blob/master/driver/mounter.go#L113

@kaznak

kaznak commented Mar 8, 2024

Hi, I have encountered the same situation.
I have two questions about this matter.

  1. Do you plan to support volumeMode: Block in the future?
  2. Is it possible to support volumeMode: Block by changing this CSI implementation?

Thank you in advance.

@cuppett

cuppett commented Mar 11, 2024

I have two questions about this matter.

  1. Do you plan to support volumeMode: Block in the future?
  2. Is it possible to support volumeMode: Block by changing this CSI implementation?

Unofficially: I'd expect this should be possible (with changes). When attaching block devices to Vultr instances normally, you get the raw device and can do what you like with it (create LVM volumes, basic filesystems, use LUKS, etc.).
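For context on what "with changes" would involve: a CSI node plugin opts into volumeMode: Block by handling the Block access type in its node calls, typically bind-mounting the raw device file onto the target path instead of formatting and mounting a filesystem (which is all the linked mounter.go does today). A minimal sketch of that branch against the upstream CSI spec types; this is not vultr-csi's actual code, and Driver and findDevicePath are hypothetical:

package driver

import (
	"context"
	"os"
	"os/exec"

	"github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// Driver stands in for the plugin's node server type (hypothetical).
type Driver struct{}

// findDevicePath resolves the attached device for a volume ID (stub; a real
// driver would look it up, e.g. under /dev/disk/by-id).
func findDevicePath(volumeID string) (string, error) {
	return "/dev/vdb", nil
}

func (d *Driver) NodePublishVolume(ctx context.Context, req *csi.NodePublishVolumeRequest) (*csi.NodePublishVolumeResponse, error) {
	target := req.GetTargetPath()

	switch req.GetVolumeCapability().GetAccessType().(type) {
	case *csi.VolumeCapability_Block:
		// volumeMode: Block — expose the raw device itself. The kubelet
		// passes a plain file path as the target; create that file and
		// bind-mount the device node onto it. No mkfs, no fs mount.
		device, err := findDevicePath(req.GetVolumeId())
		if err != nil {
			return nil, status.Error(codes.NotFound, err.Error())
		}
		f, err := os.OpenFile(target, os.O_CREATE|os.O_RDWR, 0o660)
		if err != nil {
			return nil, status.Error(codes.Internal, err.Error())
		}
		f.Close()
		if out, err := exec.Command("mount", "--bind", device, target).CombinedOutput(); err != nil {
			return nil, status.Errorf(codes.Internal, "bind mount failed: %s", out)
		}

	case *csi.VolumeCapability_Mount:
		// volumeMode: Filesystem — the only path the current mounter
		// implements: format on first use, then mount at the target.
		// (Omitted here.)

	default:
		return nil, status.Error(codes.InvalidArgument, "unknown access type")
	}
	return &csi.NodePublishVolumeResponse{}, nil
}

NodeUnpublishVolume would need the matching unmount/cleanup, and NodeStageVolume (plus the STAGE_UNSTAGE_VOLUME capability from the original error) would need the same Block-vs-Mount split.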
