
InvalidDiskCapacity, invalid capacity 0 on image filesystem [on clean install] #912

Closed
sjmudd opened this issue Jan 13, 2020 · 16 comments
Labels: kind/support (Question with a workaround)

@sjmudd

sjmudd commented Jan 13, 2020

I had been following the quick start guide at https://microk8s.io/docs/.

This may be related to #893, but I see some differences:

  • This is a clean install of Ubuntu, but running on ZFS (does this matter?)
  • I have installed microk8s via the snap.
  • The node never becomes healthy.

Running on:

  • Ubuntu 19.10 (eoan)
  • microk8s v1.17.0

but I see

Events:
  Type     Reason                   Age   From               Message
  ----     ------                   ----  ----               -------
  Normal   Starting                 12m   kube-proxy, mad19  Starting kube-proxy.
  Normal   Starting                 12m   kubelet, mad19     Starting kubelet.
  Warning  InvalidDiskCapacity      12m   kubelet, mad19     invalid capacity 0 on image filesystem
  Normal   NodeAllocatableEnforced  12m   kubelet, mad19     Updated Node Allocatable limit across pods
  Normal   NodeHasSufficientMemory  12m   kubelet, mad19     Node mad19 status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure    12m   kubelet, mad19     Node mad19 status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientPID     12m   kubelet, mad19     Node mad19 status is now: NodeHasSufficientPID

Note the InvalidDiskCapacity warning.

This sounds harmless, but if I try to do anything, such as enabling one of the microk8s components, the pods get stuck in ContainerCreating.

sjmudd@mad19:~$ microk8s.enable dashboard
Applying manifest
...
sjmudd@mad19:~$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                              READY   STATUS              RESTARTS   AGE
kube-system   dashboard-metrics-scraper-687667bb6c-9z5kd        0/1     ContainerCreating   0          4s
kube-system   heapster-v1.5.2-5c58f64f8b-h5ghg                  0/4     ContainerCreating   0          4s
kube-system   kubernetes-dashboard-5c848cc544-zmhcl             0/1     ContainerCreating   0          4s
kube-system   monitoring-influxdb-grafana-v4-6d599df6bf-74rvq   0/2     ContainerCreating   0          4s
sjmudd@mad19:~$ 

Disk space is fine:

Filesystem                Size  Used Avail Use% Mounted on
rpool/ROOT/ubuntu_pj8tws  890G  2,8G  887G   1% /

inspection-report-20200113_232425.tar.gz

So any thoughts on what I'm doing wrong would be most welcome.

@balchua
Collaborator

balchua commented Jan 13, 2020

@sjmudd some addons take a bit of time to start up. Does the "ContainerCreating" status persist for a long time?

I am also getting that "InvalidDiskCapacity" warning on the node. I think it is benign.

My best guess on why the kubelet reports this is that snap filesystems are mostly read-only (I think they are virtual filesystems).

You will normally see them at 100% capacity in df -h.

Do none of the pods come up?

@sjmudd
Author

sjmudd commented Jan 14, 2020

There is only one node at the moment, and the containers are still stuck in the same ContainerCreating state:

sjmudd@mad19:~$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                              READY   STATUS              RESTARTS   AGE
kube-system   dashboard-metrics-scraper-687667bb6c-9z5kd        0/1     ContainerCreating   0          8h
kube-system   heapster-v1.5.2-5c58f64f8b-h5ghg                  0/4     ContainerCreating   0          8h
kube-system   kubernetes-dashboard-5c848cc544-zmhcl             0/1     ContainerCreating   0          8h
kube-system   monitoring-influxdb-grafana-v4-6d599df6bf-74rvq   0/2     ContainerCreating   0          8h

FWIW here is the full df -h output on the box (it's a real PC I use for testing, not a VM):

Filesystem                                        Size  Used Avail Use% Mounted on
udev                                               16G     0   16G   0% /dev
tmpfs                                             3,2G  2,0M  3,2G   1% /run
rpool/ROOT/ubuntu_pj8tws                          889G  2,8G  887G   1% /
tmpfs                                              16G   80M   16G   1% /dev/shm
tmpfs                                             5,0M  4,0K  5,0M   1% /run/lock
tmpfs                                              16G     0   16G   0% /sys/fs/cgroup
bpool/BOOT/ubuntu_pj8tws                          1,8G  138M  1,7G   8% /boot
/dev/nvme0n1p2                                     45M  8,0M   33M  20% /boot/grub
/dev/nvme0n1p1                                    511M  7,8M  504M   2% /boot/efi
rpool/USERDATA/root_lifufh                        887G   61M  887G   1% /root
rpool/USERDATA/sjmudd_lifufh                      888G  929M  887G   1% /home/sjmudd
rpool/ROOT/ubuntu_pj8tws/var/log                  887G  334M  887G   1% /var/log
rpool/ROOT/ubuntu_pj8tws/var/mail                 887G  128K  887G   1% /var/mail
rpool/ROOT/ubuntu_pj8tws/var/www                  887G  128K  887G   1% /var/www
rpool/ROOT/ubuntu_pj8tws/var/lib                  888G  895M  887G   1% /var/lib
rpool/ROOT/ubuntu_pj8tws/var/spool                887G  256K  887G   1% /var/spool
rpool/ROOT/ubuntu_pj8tws/srv                      887G  128K  887G   1% /srv
rpool/ROOT/ubuntu_pj8tws/usr/local                887G  128K  887G   1% /usr/local
rpool/ROOT/ubuntu_pj8tws/var/snap                 887G   23M  887G   1% /var/snap
rpool/ROOT/ubuntu_pj8tws/var/games                887G  128K  887G   1% /var/games
rpool/ROOT/ubuntu_pj8tws/var/lib/NetworkManager   887G  256K  887G   1% /var/lib/NetworkManager
rpool/ROOT/ubuntu_pj8tws/var/lib/apt              887G   76M  887G   1% /var/lib/apt
rpool/ROOT/ubuntu_pj8tws/var/lib/dpkg             887G   32M  887G   1% /var/lib/dpkg
rpool/ROOT/ubuntu_pj8tws/var/lib/AccountServices  887G  128K  887G   1% /var/lib/AccountServices
/dev/loop0                                         90M   90M     0 100% /snap/core/7917
/dev/loop1                                         55M   55M     0 100% /snap/core18/1223
/dev/loop2                                        150M  150M     0 100% /snap/gnome-3-28-1804/71
/dev/loop3                                        1,0M  1,0M     0 100% /snap/gnome-logs/81
/dev/loop4                                         15M   15M     0 100% /snap/gnome-characters/317
/dev/loop5                                        4,3M  4,3M     0 100% /snap/gnome-calculator/501
/dev/loop6                                         45M   45M     0 100% /snap/gtk-common-themes/1353
/dev/loop8                                         11M   11M     0 100% /snap/kubectl/1373
tmpfs                                             3,2G   44K  3,2G   1% /run/user/1000
/dev/loop10                                        55M   55M     0 100% /snap/core18/1288
/dev/loop11                                        90M   90M     0 100% /snap/core/8268
/dev/loop12                                        15M   15M     0 100% /snap/gnome-characters/375
/dev/loop13                                       4,3M  4,3M     0 100% /snap/gnome-calculator/544
/dev/loop14                                       157M  157M     0 100% /snap/gnome-3-28-1804/110
/dev/loop7                                        172M  172M     0 100% /snap/microk8s/1107
/dev/sdb1                                         2,3G  2,3G     0 100% /media/sjmudd/Ubuntu 19.10 amd64
/dev/sdb3                                         2,0T   85M  1,9T   1% /media/sjmudd/casper-rw

As you say, the snaps all show 100% usage since they are mounted as read-only loopback devices.
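
For example, the loop mounts show up as read-only squashfs (illustrative output, not copied from this box):

mount | grep /snap/microk8s
/var/lib/snapd/snaps/microk8s_1107.snap on /snap/microk8s/1107 type squashfs (ro,nodev,relatime)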

Doing a describe on the first pod I see:

sjmudd@mad19:~$ kubectl describe pod dashboard-metrics-scraper-687667bb6c-9z5kd -n kube-system
Name:           dashboard-metrics-scraper-687667bb6c-9z5kd
Namespace:      kube-system
Priority:       0
Node:           mad19/192.168.10.19
Start Time:     Mon, 13 Jan 2020 23:33:43 +0100
Labels:         k8s-app=dashboard-metrics-scraper
                pod-template-hash=687667bb6c
Annotations:    <none>
Status:         Pending
IP:             
IPs:            <none>
Controlled By:  ReplicaSet/dashboard-metrics-scraper-687667bb6c
Containers:
  dashboard-metrics-scraper:
    Container ID:   
    Image:          kubernetesui/metrics-scraper:v1.0.2
    Image ID:       
    Port:           8000/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Liveness:       http-get http://:8000/ delay=30s timeout=30s period=10s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /tmp from tmp-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kubernetes-dashboard-token-5ts5j (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  tmp-volume:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  kubernetes-dashboard-token-5ts5j:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  kubernetes-dashboard-token-5ts5j
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node-role.kubernetes.io/master:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                  Age                   From            Message
  ----     ------                  ----                  ----            -------
  Warning  FailedCreatePodSandBox  102s (x2274 over 8h)  kubelet, mad19  (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container: failed to create containerd task: failed to mount rootfs component &{overlay overlay [workdir=/var/snap/microk8s/common/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/9248/work upperdir=/var/snap/microk8s/common/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/9248/fs lowerdir=/var/snap/microk8s/common/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/1/fs]}: invalid argument: unknown

So the problem seems to be here. It is not clear to me which argument is invalid or why, but the error appears to be the same on the second failed pod:

...
  Type     Reason                  Age                    From            Message
  ----     ------                  ----                   ----            -------
  Warning  FailedCreatePodSandBox  3m27s (x2266 over 8h)  kubelet, mad19  (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container: failed to create containerd task: failed to mount rootfs component &{overlay overlay [workdir=/var/snap/microk8s/common/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/9249/work upperdir=/var/snap/microk8s/common/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/9249/fs lowerdir=/var/snap/microk8s/common/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/1/fs]}: invalid argument: unknown

so FailedCreatePodSandBox appears to be the cause.
I still know too little about Kubernetes internals to understand why this happens.
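
If I understand the error correctly, the failing step can be reproduced outside Kubernetes with a bare overlay mount on this filesystem (a sketch I have not run yet; the paths are made up):

mkdir -p /tmp/ovl/lower /tmp/ovl/upper /tmp/ovl/work /tmp/ovl/merged
sudo mount -t overlay overlay \
  -o lowerdir=/tmp/ovl/lower,upperdir=/tmp/ovl/upper,workdir=/tmp/ovl/work \
  /tmp/ovl/merged
# with /tmp on the ZFS root, this should fail with "invalid argument" (EINVAL),
# since the kernel's overlayfs cannot use ZFS directories as upper/work layers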

@balchua
Collaborator

balchua commented Jan 14, 2020

You mentioned you are using ZFS. I'm not sure containerd currently supports ZFS out of the box.

There is, though, a project on GitHub that addresses this:
https://github.com/containerd/zfs

@sjmudd
Author

sjmudd commented Jan 15, 2020

Hi. Sorry for the delay in responding. I do not think I am using any ZFS-specific features, so I would expect the containerd setup to be "normal". The README at https://github.com/containerd/containerd seems to suggest the same, so I guess I am hitting some edge case where this goes wrong. To diagnose it I may need to add debug output in various places in the code; I am not 100% sure which components would need patching, but I will see if I can figure that out. Clearly the FailedCreatePodSandBox and InvalidDiskCapacity messages are the starting point. I will see if I can find time to look at this, but in the meantime microk8s is not working, which is a shame.

@balchua
Copy link
Collaborator

balchua commented Jan 15, 2020

Same thing is reported here.
k3s-io/k3s#66

@sjmudd
Author

sjmudd commented Jan 16, 2020

Thanks for the reference. I don't think I'm going to make much progress looking at this from my side, given the way the snap is built.

However, a better, clearer error message earlier in the snap install process, such as "microk8s is currently not supported on ZFS, see some_url for details", would make the issue far easier to recognise than the rather obscure messages I'm seeing now. I am not sure whether that should be an upstream Kubernetes message or a microk8s one, but something of that form would aid visibility.

@sjmudd
Author

sjmudd commented Jan 16, 2020

One point of reference: if I try to install the snap version of docker on this box, it doesn't work (there are comments about it not working with confinement etc.), yet if I install docker directly (via apt) on the server, it works fine, even though it also seems to use containerd and is running on ZFS. So clearly this should be able to work, even if it doesn't right now.

  • so docker with containerd does work, with docker.io 19.03.2-0ubuntu1 on Ubuntu 19.10

I am wondering, with the current installation, whether moving the /snap directory onto an ext3/ext4/xfs-type filesystem would be enough for microk8s to start up. I am not quite sure where the containerd images are actually mounted. I have access to other disks/partitions so I could certainly try this, or maybe even try using NFS. Performance is not really a concern right now, as I just want to practice using Kubernetes, and microk8s seemed an easy way to achieve this.

@balchua
Collaborator

balchua commented Jan 16, 2020

Hi @sjmudd, appreciate you sharing more information on this.
Containerd images are stored in /var/snap/microk8s/common/var/lib/containerd.
Perhaps you just need to mount an ext4 filesystem there.
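
For example, something like this (a rough sketch; /dev/sdc1 is just a placeholder for any spare non-ZFS partition):

sudo microk8s.stop
sudo mkfs.ext4 /dev/sdc1
sudo mount /dev/sdc1 /var/snap/microk8s/common/var/lib/containerd
# add a matching /etc/fstab entry so the mount survives reboots
sudo microk8s.start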

Now that you mention that docker installed via apt-get works while the snap isn't happy, maybe the snap needs to be given access to the ZFS client. Maybe the snap folks can provide more help on this. 😁
@ktsakalozos care to share your thoughts on this?

@balchua
Collaborator

balchua commented Jan 17, 2020

Maybe we just need to change containerd's snapshotter from overlayfs to zfs.

@sjmudd can you try this? There is a file at /var/snap/microk8s/args/containerd-template.toml.

You should find this

    [plugins.cri.containerd]
      snapshotter = "overlayfs"
      no_pivot = false

Try changing the snapshotter from "overlayfs" to "zfs", then restart microk8s.
Thanks.
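
For example, something like this should do it (an untested sketch using the path above; sed keeps a backup copy of the file):

sudo sed -i.bak 's/snapshotter = "overlayfs"/snapshotter = "zfs"/' /var/snap/microk8s/args/containerd-template.toml
sudo microk8s.stop
sudo microk8s.start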

@sjmudd
Author

sjmudd commented Jan 17, 2020

This looks different. So I run microk8s.reset, then microk8s.enable dashboard, and wait a while.

This shows:

sjmudd@mad19:~$ kubectl get all --all-namespaces
NAMESPACE     NAME                                                  READY   STATUS              RESTARTS   AGE
kube-system   pod/dashboard-metrics-scraper-687667bb6c-h2vhn        0/1     ContainerCreating   0          108s
kube-system   pod/heapster-v1.5.2-5c58f64f8b-dngbb                  0/4     ContainerCreating   0          108s
kube-system   pod/kubernetes-dashboard-5c848cc544-62l7c             0/1     ContainerCreating   0          108s
kube-system   pod/monitoring-influxdb-grafana-v4-6d599df6bf-5pzgp   0/2     ContainerCreating   0          108s
...

So I describe the first pod, which shows:

...
Events:
  Type     Reason                  Age        From               Message
  ----     ------                  ----       ----               -------
  Normal   Scheduled               <unknown>  default-scheduler  Successfully assigned kube-system/dashboard-metrics-scraper-687667bb6c-h2vhn to mad19
  Warning  FailedCreatePodSandBox  116s       kubelet, mad19     Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd container: error unpacking image: failed to prepare extraction snapshot "extract-139141208-FLoY sha256:e17133b79956ad6f69ae7f775badd1c11bad2fc64f0529cab863b9d12fbaa5c4": failed to open database file: open /var/snap/microk8s/common/var/lib/containerd/io.containerd.snapshotter.v1.zfs/metadata.db: no such file or directory
  Warning  FailedCreatePodSandBox  103s       kubelet, mad19     Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd container: error unpacking image: failed to prepare extraction snapshot "extract-613496891-PX7t sha256:e17133b79956ad6f69ae7f775badd1c11bad2fc64f0529cab863b9d12fbaa5c4": failed to open database file: open /var/snap/microk8s/common/var/lib/containerd/io.containerd.snapshotter.v1.zfs/metadata.db: no such file or directory
  Warning  FailedCreatePodSandBox  88s        kubelet, mad19     Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd container: error unpacking image: failed to prepare extraction snapshot "extract-668519401-Thla sha256:e17133b79956ad6f69ae7f775badd1c11bad2fc64f0529cab863b9d12fbaa5c4": failed to open database file: open /var/snap/microk8s/common/var/lib/containerd/io.containerd.snapshotter.v1.zfs/metadata.db: no such file or directory
...

Checking the directory I see:

sjmudd@mad19:~$ sudo ls -l /var/snap/microk8s/common/var/lib/containerd/
total 10
drwxr-xr-x 4 root root 4 Jan 13 22:43 io.containerd.content.v1.content
drwxr-xr-x 3 root root 3 Jan 13 22:43 io.containerd.grpc.v1.cri
drwx--x--x 2 root root 3 Jan 13 22:41 io.containerd.metadata.v1.bolt
drwx--x--x 3 root root 3 Jan 13 22:43 io.containerd.runtime.v1.linux
drwx--x--x 2 root root 2 Jan 13 22:41 io.containerd.runtime.v2.task
drwx------ 3 root root 3 Jan 13 22:41 io.containerd.snapshotter.v1.aufs
drwxr-xr-x 2 root root 2 Jan 13 22:41 io.containerd.snapshotter.v1.btrfs
drwx------ 3 root root 3 Jan 13 22:41 io.containerd.snapshotter.v1.native
drwx------ 3 root root 4 Jan 13 22:43 io.containerd.snapshotter.v1.overlayfs
drwx------ 2 root root 2 Jan 13 22:43 tmpmounts

So the required io.containerd.snapshotter.v1.zfs directory appears to be missing, as is the related metadata.db file.
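
From what I can tell from the containerd/zfs README, the zfs snapshotter also expects a dedicated ZFS dataset mounted at exactly that path, so something like this may be needed first (a sketch; the dataset name is just an example):

sudo zfs create -o mountpoint=/var/snap/microk8s/common/var/lib/containerd/io.containerd.snapshotter.v1.zfs rpool/microk8s-containerd
sudo microk8s.stop
sudo microk8s.start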

@balchua
Collaborator

balchua commented Jan 19, 2020

There seems to be a way to get this working.

#401 (comment)

Steps are also indicated on the microk8s site:
https://microk8s.io/docs/install-alternatives

@balchua balchua added the kind/support Question with a workaround label Jan 19, 2020
@sjmudd
Author

sjmudd commented Jan 20, 2020

Hi, FWIW yes, the workaround in that comment worked for me.
Thanks for pointing it out; it's good to see that this is possible. I still think the original error message is rather confusing, but at least there is now a good explanation of what's required to get things working.

@sjmudd sjmudd closed this as completed Jan 20, 2020
@balchua
Collaborator

balchua commented Jan 20, 2020

Great that it worked. Thanks for confirming. 👍

@ruffst

ruffst commented May 22, 2021

sudo nano /boot/firmware/cmdline.txt

Add the following at the beginning of the line:

cgroup_enable=memory cgroup_memory=1

This fixed the error for me.
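
To check that the change took effect after a reboot (a quick sketch):

grep memory /proc/cgroups
# the last column ("enabled") should now be 1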

@sadath-12

Hi @balchua, I'm facing the exact problem @sjmudd did: I get that warning for my nodes and the CNI pods are not scheduling. In my case, though, it's not microk8s but plain containerd, which I installed like this:

sudo apt-get update
sudo apt-get install -y containerd


sudo mkdir -p /etc/containerd
sudo containerd config default | sudo tee /etc/containerd/config.toml

sudo vi /etc/containerd/config.toml


# find the [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options] section
# and change SystemdCgroup to true
SystemdCgroup = true

sudo systemctl restart containerd
sudo systemctl enable containerd

Following the solution above, I changed the snapshotter to zfs in /etc/containerd/config.toml, but that doesn't seem to work around it.

Any helpful suggestions?

@raja-png

Try loading the zfs kernel module with modprobe when switching from overlay to zfs:

$ modprobe zfs
$ systemctl restart containerd
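
If that alone is not enough: as noted earlier in this thread, containerd's zfs snapshotter also needs a dedicated ZFS dataset mounted at its state directory, and you can confirm the plugin actually loaded (a sketch; the dataset name is just an example):

$ sudo zfs create -o mountpoint=/var/lib/containerd/io.containerd.snapshotter.v1.zfs rpool/containerd
$ sudo systemctl restart containerd
$ sudo ctr plugins ls | grep zfs   # STATUS should read "ok" rather than "error"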
