
Plans for ZFS support? #66

Closed
fuzzykiller opened this issue Feb 27, 2019 · 16 comments
Labels
kind/enhancement An improvement to existing functionality

Comments

@fuzzykiller

Deployment on ZFS is currently not possible, because OverlayFS does not work with ZFS:

overlayfs: filesystem on '/var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/273/fs' not supported as upperdir

From the containerd logfile, it appears the ZFS snapshotter is not included:

time="2019-02-27T14:55:43.605823860+01:00" level=info msg="starting containerd" revision= version=1.2.3+unknown
time="2019-02-27T14:55:43.606278371+01:00" level=info msg="loading plugin \"io.containerd.content.v1.content\"..." type=io.containerd.content.v1
time="2019-02-27T14:55:43.606418919+01:00" level=info msg="loading plugin \"io.containerd.snapshotter.v1.native\"..." type=io.containerd.snapshotter.v1
time="2019-02-27T14:55:43.606671517+01:00" level=info msg="loading plugin \"io.containerd.snapshotter.v1.overlayfs\"..." type=io.containerd.snapshotter.v1
time="2019-02-27T14:55:43.607001436+01:00" level=info msg="loading plugin \"io.containerd.metadata.v1.bolt\"..." type=io.containerd.metadata.v1
time="2019-02-27T14:55:43.624241298+01:00" level=info msg="loading plugin \"io.containerd.differ.v1.walking\"..." type=io.containerd.differ.v1
...

ZFS support would be awesome! Are there any plans to include that?
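
For reference, one way to confirm which snapshotters the bundled containerd actually registered is to query its plugin list (a quick sketch; it assumes the embedded ctr is exposed as k3s ctr, and the exact output varies by k3s version):

# list the snapshotter plugins loaded by k3s's embedded containerd
k3s ctr plugins ls | grep snapshotter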

@ibuildthecloud
Contributor

We specifically removed the ZFS snapshotter. The reason is that we don't intend to include the ZFS user-space tools, as I believe they are not portable across kernel versions. So we could include the ZFS snapshotter, but you would be required to first install the ZFS tools, which is commonplace already.

@fuzzykiller
Author

Thanks for the quick feedback!

The ZFS userspace tools are generally present when ZFS is used, because without them ZFS cannot be administered at all. So that's really a non-issue as far as I can see, except perhaps when running k3s in a Docker container on top of ZFS, but I'm not sure whether that would work anyway.

@erikwilson erikwilson added the kind/enhancement An improvement to existing functionality label Mar 25, 2019
@fire

fire commented May 11, 2019

Is there a timeline for including the ZFS snapshotter?

@jirkadanek

ZFS is sadly quite painful with Docker-in-Docker and similar scenarios. It might be best to avoid the problem by creating a volume in your ZFS pool, formatting that volume as ext4, and having Docker use "overlay2" on top of it instead of "zfs".

zfs create -s -V 20GB zroot/docker
mkfs.ext4 /dev/zvol/zroot/docker
# add the mount to /etc/fstab
mount /dev/zvol/zroot/docker /var/lib/docker

The zfs create -s is for sparse volumes. Analogous to thin provisioning on LVM.
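
For reference, the matching /etc/fstab entry might look like this (a sketch, assuming the zvol created above; on Linux the device node appears under /dev/zvol/<pool>/<volume>):

# /etc/fstab entry for the Docker zvol
/dev/zvol/zroot/docker  /var/lib/docker  ext4  defaults  0  0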

I just finished setting this up and it nicely solves my problems with k3s and also kind. I use these for testing and development, and there the volume should be just fine.

@terinjokes

terinjokes commented Nov 8, 2019

I was able to point k3s at the host's containerd (which had the zfs snapshotter and was configured to use it by default) and successfully run pods.

$ k3s server --container-runtime-endpoint /run/containerd/containerd.sock

$ sudo k3s kubectl -n kube-system get po -owide
NAME                                      READY   STATUS    RESTARTS   AGE     IP            NODE     NOMINATED NODE   READINESS GATES
local-path-provisioner-58fb86bdfd-tx9lt   1/1     Running   0          4m52s   10.88.0.137   rincon   <none>           <none>
coredns-57d8bbb86-7t6v6                   1/1     Running   0          4m52s   10.88.0.136   rincon   <none>           <none>

$ zfs list -r vault/storage/containerd
NAME                          USED  AVAIL     REFER  MOUNTPOINT
vault/storage/containerd     37.2M   466G       26K  /var/lib/containerd/io.containerd.snapshotter.v1.zfs
vault/storage/containerd/1    450K   466G      450K  legacy
vault/storage/containerd/10    25K   466G     20.0M  legacy
vault/storage/containerd/3     15K   466G      451K  legacy
vault/storage/containerd/4     15K   466G      451K  legacy
vault/storage/containerd/5    238K   466G      238K  legacy
vault/storage/containerd/6   3.67M   466G     3.67M  legacy
vault/storage/containerd/7   19.8M   466G     20.0M  legacy
vault/storage/containerd/8   13.0M   466G     16.6M  legacy
vault/storage/containerd/9     35K   466G     16.6M  legacy
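
If anyone wants to make this persistent with the stock systemd unit, a drop-in override is one option (a sketch only; the unit name and binary path assume the default get.k3s.io install):

# /etc/systemd/system/k3s.service.d/external-containerd.conf
[Service]
# clear the packaged ExecStart, then point k3s at the host containerd socket
ExecStart=
ExecStart=/usr/local/bin/k3s server --container-runtime-endpoint /run/containerd/containerd.sock

A systemctl daemon-reload followed by systemctl restart k3s picks up the override.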

@haohaolee

@terinjokes Hi, could you please share more details about containerd with ZFS support? How can I install containerd with the ZFS snapshotter?

@jirkadanek

jirkadanek commented Jan 10, 2020

Another possible workaround: it again involves creating an ext4 filesystem, but this time using Docker itself through a Docker loopback volume plugin rather than asking ZFS to do it. It is described in the rancher/k3d docs: https://github.com/rancher/k3d/blob/dc4c29361fd433d3592b62d3d8d23256a5ed5728/docs/examples.md#running-on-filesystems-k3s-doesnt-like-btrfs-tmpfs-

@stevefan1999-personal

Well, you can apparently leverage the host's Docker for k3s, which can spawn containers on a ZFS-backed graph driver.
You just need to add the --docker argument to k3s server/agent. Make sure that docker info reports the ZFS storage driver, though.

ExecStart=/usr/local/bin/k3s \
    server --docker
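
For completeness, docker info reports Storage Driver: zfs only when /var/lib/docker lives on a ZFS dataset and the daemon uses the zfs graph driver; a minimal daemon.json sketch to pin that explicitly (this is an assumption about the host setup, not something k3s requires):

# cat /etc/docker/daemon.json
{
  "storage-driver": "zfs"
}

Restart the Docker daemon after changing it.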

This however completely defeats the purpose of packaging CRI in k3s.

@stevefan1999-personal

stevefan1999-personal commented Mar 9, 2020

We specifically removed the ZFS snapshotter. The reason being that we don't intend to include the ZFS user space as I believe that is not portable across kernel versions. So we can include the ZFS snapshotter and you would be required to first install the ZFS tools, which is common place already.

This is not necessarily the case now for Proxmox-based distributions from version 5 onward, and also for Ubuntu 20.04 (albeit deemed "experimental"), both of which cherry-pick and integrate the OpenZFS Linux module codebase into their kernel builds.

The OpenZFS ABI is very stable, but it has been kept out of the mainline kernel by license politics. And even if it were unstable, you can always use DKMS to keep the module in step with new kernels.

Despite this, I think the k3s team still won't consider adding ZFS support anyway, but what about re-adding AUFS support? AUFS generally works fine on top of ZFS, despite some reported edge cases where file mounts get out of sync, and AUFS is more general than ZFS and, I think, easier to merge into k3s.

EDIT:

And by the way, the current workaround is to create an ext4 volume as the containerd backend, as @jirkadanek did, but at a different location:

# echo "[+] making sure everything in containerd is empty to let it mount"
# rm -rf /var/lib/rancher/k3s/agent/containerd/*
# echo "[+] create the k3s hierarchy"
# zfs create rpool/k3s
# zfs create -s -V 128GB rpool/k3s/containerd
# echo "[+] wait for the newly created volume to format"
# mkfs.ext4 /dev/rpool/k3s/containerd
# echo "[+] adding the mount to fstab"
# echo "/dev/rpool/k3s/containerd /var/lib/rancher/k3s/agent/containerd ext4 auto 0 0" > /etc/fstab
# echo "[+] manually mounting the volume to the specific mount point"
# mount /dev/rpool/k3s/containerd /var/lib/rancher/k3s/agent/containerd
# echo "[!] if you saw this message without any error it means you're good to go"

Keep in mind that this is still not perfect -- you have basically recreated snapshotting on top of ZFS, so the atomic COW goodies are probably gone and you have no fine-grained control over the data size. Maybe in the future we could get installable k3s plugins to accommodate situations like these.

ZFS is a very popular filesystem among Proxmox users after all, and it's a shame such a beautiful piece of work is being insulted like this.

@mcerveny

Again, again, again: when will the ZFS snapshotter return to the bundled containerd (v1.19.5+k3s2)?
"--snapshotter zfs" does not work:

# grep zfs /var/lib/rancher/k3s/agent/containerd/containerd.log 
time="2020-12-30T19:46:27.633898921+01:00" level=warning msg="failed to load plugin io.containerd.grpc.v1.cri" error="failed to create CRI service: failed to find snapshotter \"zfs\""
# ctr plugin list | grep snapshotter
io.containerd.snapshotter.v1    native                   linux/amd64    ok        
io.containerd.snapshotter.v1    overlayfs                linux/amd64    ok        

For now I must use an external containerd from Debian testing (Bullseye) with the ZFS snapshotter (and the Calico CNI):

# tail -1 /etc/apt/sources.list
deb http://http.us.debian.org/debian/ testing non-free contrib main
# apt-get install containerd
# systemctl stop containerd
# cat /etc/containerd/config.toml 
version = 2
[plugins."io.containerd.grpc.v1.cri".containerd]
  snapshotter = "zfs"
[plugins."io.containerd.internal.v1.opt"]
  path = "/var/lib/containerd/opt"
# zfs create -o mountpoint=/var/lib/containerd/io.containerd.snapshotter.v1.zfs rpool/containerd
# systemctl start containerd
# curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--flannel-backend=none --disable-network-policy --cluster-init --container-runtime-endpoint unix:///run/containerd/containerd.sock" sh -
[INFO]  Finding release for channel stable
[INFO]  Using v1.19.5+k3s2 as release
...
# kubectl apply -f calico.yaml 
# ctr plugins list | grep snapshotter
io.containerd.snapshotter.v1    aufs                     linux/amd64    error     
io.containerd.snapshotter.v1    btrfs                    linux/amd64    error     
io.containerd.snapshotter.v1    devmapper                linux/amd64    error     
io.containerd.snapshotter.v1    native                   linux/amd64    ok        
io.containerd.snapshotter.v1    overlayfs                linux/amd64    ok        
io.containerd.snapshotter.v1    zfs                      linux/amd64    ok        
# kubectl get all -A
NAMESPACE     NAME                                           READY   STATUS      RESTARTS   AGE
kube-system   pod/calico-kube-controllers-744cfdf676-jk6fh   1/1     Running     0          2m4s
kube-system   pod/calico-node-rm96d                          1/1     Running     0          2m4s
kube-system   pod/coredns-66c464876b-jlpd8                   1/1     Running     0          2m14s
kube-system   pod/helm-install-traefik-wsv8b                 0/1     Completed   0          2m14s
kube-system   pod/local-path-provisioner-7ff9579c6-w2s7h     1/1     Running     0          2m14s
kube-system   pod/metrics-server-7b4f8b595-72fzr             1/1     Running     0          2m14s
kube-system   pod/svclb-traefik-h56pz                        2/2     Running     0          45s
kube-system   pod/traefik-5dd496474-p25st                    1/1     Running     0          45s

NAMESPACE     NAME                         TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)                      AGE
default       service/kubernetes           ClusterIP      10.43.0.1      <none>        443/TCP                      2m28s
kube-system   service/kube-dns             ClusterIP      10.43.0.10     <none>        53/UDP,53/TCP,9153/TCP       2m26s
kube-system   service/metrics-server       ClusterIP      10.43.57.92    <none>        443/TCP                      2m26s
kube-system   service/traefik              LoadBalancer   10.43.179.56   10.199.1.24   80:30978/TCP,443:31367/TCP   45s
kube-system   service/traefik-prometheus   ClusterIP      10.43.99.37    <none>        9100/TCP                     45s

NAMESPACE     NAME                           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
kube-system   daemonset.apps/calico-node     1         1         1       1            1           kubernetes.io/os=linux   2m5s
kube-system   daemonset.apps/svclb-traefik   1         1         1       1            1           <none>                   45s

NAMESPACE     NAME                                      READY   UP-TO-DATE   AVAILABLE   AGE
kube-system   deployment.apps/calico-kube-controllers   1/1     1            1           2m5s
kube-system   deployment.apps/coredns                   1/1     1            1           2m27s
kube-system   deployment.apps/local-path-provisioner    1/1     1            1           2m26s
kube-system   deployment.apps/metrics-server            1/1     1            1           2m26s
kube-system   deployment.apps/traefik                   1/1     1            1           45s

NAMESPACE     NAME                                                 DESIRED   CURRENT   READY   AGE
kube-system   replicaset.apps/calico-kube-controllers-744cfdf676   1         1         1       2m5s
kube-system   replicaset.apps/coredns-66c464876b                   1         1         1       2m14s
kube-system   replicaset.apps/local-path-provisioner-7ff9579c6     1         1         1       2m14s
kube-system   replicaset.apps/metrics-server-7b4f8b595             1         1         1       2m14s
kube-system   replicaset.apps/traefik-5dd496474                    1         1         1       45s

NAMESPACE     NAME                             COMPLETIONS   DURATION   AGE
kube-system   job.batch/helm-install-traefik   1/1           90s        2m26s
# zfs list -t all
NAME                           USED  AVAIL     REFER  MOUNTPOINT
bpool                         69.4M   762M       96K  /boot
bpool/BOOT                    68.7M   762M       96K  none
bpool/BOOT/debian             68.6M   762M     41.1M  /boot
bpool/BOOT/debian@install     27.4M      -     41.1M  -
data                          3.46M  10.2T      140K  /data
rpool                         1.73G   152G       96K  /
rpool/ROOT                    1.34G   152G       96K  none
rpool/ROOT/debian             1.34G   152G     1.28G  /
rpool/ROOT/debian@install     61.9M      -      698M  -
rpool/containerd               388M   152G      120K  /var/lib/containerd/io.containerd.snapshotter.v1.zfs
rpool/containerd/1             500K   152G      500K  legacy
rpool/containerd/1@snapshot      0B      -      500K  -
rpool/containerd/10           2.98M   152G     4.66M  legacy
rpool/containerd/10@snapshot     0B      -     4.66M  -
rpool/containerd/11           2.98M   152G     7.57M  legacy
rpool/containerd/11@snapshot     0B      -     7.57M  -
rpool/containerd/12           1.72M   152G     9.23M  legacy
rpool/containerd/12@snapshot     0B      -     9.23M  -
rpool/containerd/13             68K   152G     9.24M  legacy
rpool/containerd/13@snapshot     0B      -     9.24M  -
rpool/containerd/14           3.46M   152G     12.6M  legacy
rpool/containerd/14@snapshot     0B      -     12.6M  -
rpool/containerd/15            120K   152G     12.6M  legacy
rpool/containerd/16           97.2M   152G     97.2M  legacy
rpool/containerd/16@snapshot     0B      -     97.2M  -
rpool/containerd/17            112K   152G     97.2M  legacy
rpool/containerd/17@snapshot     0B      -     97.2M  -
rpool/containerd/18            632K   152G     97.3M  legacy
rpool/containerd/19             72K   152G      508K  legacy
rpool/containerd/2              72K   152G      508K  legacy
rpool/containerd/20             72K   152G      508K  legacy
rpool/containerd/21           4.95M   152G     4.95M  legacy
rpool/containerd/21@snapshot     0B      -     4.95M  -
rpool/containerd/22             72K   152G      508K  legacy
rpool/containerd/23           14.8M   152G     19.7M  legacy
rpool/containerd/23@snapshot     0B      -     19.7M  -
rpool/containerd/24             72K   152G      508K  legacy
rpool/containerd/25           3.61M   152G     3.61M  legacy
rpool/containerd/25@snapshot     0B      -     3.61M  -
rpool/containerd/26            128K   152G     19.7M  legacy
rpool/containerd/27           21.0M   152G     24.4M  legacy
rpool/containerd/27@snapshot     0B      -     24.4M  -
rpool/containerd/28            104K   152G      104K  legacy
rpool/containerd/28@snapshot     0B      -      104K  -
rpool/containerd/29             64K   152G      104K  legacy
rpool/containerd/29@snapshot     0B      -      104K  -
rpool/containerd/3             148K   152G      148K  legacy
rpool/containerd/3@snapshot      0B      -      148K  -
rpool/containerd/30            432K   152G      432K  legacy
rpool/containerd/30@snapshot     0B      -      432K  -
rpool/containerd/31           25.6M   152G     25.6M  legacy
rpool/containerd/31@snapshot     0B      -     25.6M  -
rpool/containerd/32           19.3M   152G     19.7M  legacy
rpool/containerd/32@snapshot     0B      -     19.7M  -
rpool/containerd/33             72K   152G      508K  legacy
rpool/containerd/34           55.4M   152G     79.7M  legacy
rpool/containerd/34@snapshot     0B      -     79.7M  -
rpool/containerd/35            104K   152G     19.7M  legacy
rpool/containerd/36           2.08M   152G     27.7M  legacy
rpool/containerd/36@snapshot     0B      -     27.7M  -
rpool/containerd/37             72K   152G     27.7M  legacy
rpool/containerd/38           4.41M   152G     4.41M  legacy
rpool/containerd/38@snapshot     0B      -     4.41M  -
rpool/containerd/39           15.5M   152G     19.8M  legacy
rpool/containerd/39@snapshot     0B      -     19.8M  -
rpool/containerd/4              72K   152G      156K  legacy
rpool/containerd/4@snapshot      0B      -      156K  -
rpool/containerd/40            100K   152G     79.8M  legacy
rpool/containerd/40@snapshot     0B      -     79.8M  -
rpool/containerd/41           1.83M   152G     80.8M  legacy
rpool/containerd/42            104K   152G     19.8M  legacy
rpool/containerd/43             72K   152G      508K  legacy
rpool/containerd/44             72K   152G      508K  legacy
rpool/containerd/45           3.62M   152G     3.62M  legacy
rpool/containerd/45@snapshot     0B      -     3.62M  -
rpool/containerd/46           1.26M   152G     4.68M  legacy
rpool/containerd/46@snapshot     0B      -     4.68M  -
rpool/containerd/47             76K   152G     4.69M  legacy
rpool/containerd/47@snapshot     0B      -     4.69M  -
rpool/containerd/48            200K   152G     4.70M  legacy
rpool/containerd/49            200K   152G     4.70M  legacy
rpool/containerd/5            61.6M   152G     61.7M  legacy
rpool/containerd/5@snapshot      0B      -     61.7M  -
rpool/containerd/50            280K   152G      280K  legacy
rpool/containerd/50@snapshot     0B      -      280K  -
rpool/containerd/51           3.52M   152G     3.73M  legacy
rpool/containerd/51@snapshot     0B      -     3.73M  -
rpool/containerd/52           34.6M   152G     38.3M  legacy
rpool/containerd/52@snapshot     0B      -     38.3M  -
rpool/containerd/53             88K   152G     38.3M  legacy
rpool/containerd/6              88K   152G     61.7M  legacy
rpool/containerd/7             104K   152G     29.7M  legacy
rpool/containerd/8             104K   152G      104K  legacy
rpool/containerd/8@snapshot      0B      -      104K  -
rpool/containerd/9            1.74M   152G     1.76M  legacy
rpool/containerd/9@snapshot      0B      -     1.76M  -

@brandond
Member

brandond commented Jan 4, 2021

I don't believe we've changed any of our plans around ZFS support, for the reasons @ibuildthecloud described in #66 (comment)

@AstraLuma

For anyone else that's found this, https://blog.nobugware.com/post/2019/k3s-containterd-zfs/ gives some indication about using an external containerd with k3s.

Personally, I plan on just setting up a zfs volume for /var/lib/rancher and using OpenEBS's zfs-localpv for persistent volumes (since this is a single-node cluster anyway).
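
In case it helps anyone, a minimal StorageClass sketch for zfs-localpv (the pool name is a placeholder and the parameters are from memory; check the OpenEBS zfs-localpv docs for the authoritative list):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-zfspv
provisioner: zfs.csi.openebs.io   # CSI driver installed by zfs-localpv
parameters:
  poolname: "zfspv-pool"          # existing ZFS pool to carve volumes from
  fstype: "zfs"

PVCs that reference this StorageClass then get a dedicated ZFS dataset on the node.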

@mhumeSF

mhumeSF commented Jan 24, 2021

Personally, I plan on just setting up a zfs volume for /var/lib/rancher and using OpenEBS's zfs-localpv for persistent volumes (since this is a single-node cluster anyway).

👆 I use this as well for my single node k3s cluster and find it works great.

@lord-kyron

@AstraLuma @mhumeSF Can you explain how you exposed the zpool to the Docker container? I deployed k3d, but when I tried to deploy zfs-localpv I got errors indicating that the Docker containers serving as nodes for k3d cannot work with ZFS, cannot see the zpool, and so on. I would be very grateful if one of you could describe how you exposed the zpool to k3d so that it can be used by zfs-localpv!

@mhumeSF

mhumeSF commented May 31, 2021

@lord-kyron I don't run k3d, and I think k3s in Docker with ZFS support would be difficult; see #66 (comment). I run k3s on a root volume that is ext4. zfs-localpv will auto-create ZFS volumes for any PVs that use that StorageClass.

@AstraLuma

@AstraLuma @mhumeSF Can you guys exolain how you did exposed the zpool to the docker container?

I'm not at all. I'm running k3s directly on the host.

@k3s-io k3s-io locked and limited conversation to collaborators Sep 9, 2021
@dweomer dweomer closed this as completed Sep 9, 2021

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →
