
Enable Simulation of automatically provisioned ReadWriteMany PVs #1487

Open

joshatcaper opened this issue Apr 17, 2020 · 33 comments
Labels
  • help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
  • kind/documentation: Categorizes issue or PR as related to documentation.
  • kind/feature: Categorizes issue or PR as related to a new feature.
  • lifecycle/frozen: Indicates that an issue or PR should not be auto-closed due to staleness.
  • priority/backlog: Higher priority than priority/awaiting-more-evidence.

Comments

@joshatcaper

What would you like to be added: A method to provide automatically provisioned ReadWriteMany PVs that are available on all workers.

Why is this needed: The current volume provisioner only supports creating ReadWriteOnce volumes. This is because kind uses the rancher local-path-provisioner, which hard-codes its provisioner to disallow any PVCs with an access mode other than ReadWriteOnce. Many managed Kubernetes providers supply some type of distributed file system. I'm currently using Azure Storage File (which is SMB/CIFS under the hood) for this use case in production. Google's Kubernetes Engine offers ReadOnlyMany out of the box.

Possible solutions: Could we have the control plane node start up an NFS container backed by a ReadWriteOnce?

Thanks for your time!

@joshatcaper joshatcaper added the kind/feature label Apr 17, 2020
@BenTheElder
Member

NFS from an overlayfs requires a 4.15+ kernel IIRC.
Currently kind imposes no additional requirements on kernel version beyond what kubernetes does upstream.

I don't think we want to start imposing any kernel requirement yet, or the overhead of running & managing NFS by default.

kind of course supports installing additional drivers, preferably with CSI.

IMHO it makes more sense to run this as an addon. cc @msau42 @pohly.

We can discuss other Read* modes upstream in the rancher project.

@BenTheElder
Member

BenTheElder commented Apr 17, 2020

Ah, I hadn't had a need for RWM. https://kubernetes.io/docs/concepts/storage/persistent-volumes/#access-modes

even ReadOnlyMany is going to require some kind of network attached storage or something, since the "many" is nodes not pods (my mistake)

I don't think rancher / local storage is going to do read across nodes 😅

probably the best solution here is to document some yaml to apply for getting an NFS provisioner installed on top of a standard kind cluster.
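
Roughly, the documented yaml would boil down to a StorageClass for whatever NFS provisioner gets installed plus RWX claims against it. A minimal sketch (the provisioner string is a placeholder; the real value depends on which NFS provisioner you deploy):

# Hypothetical StorageClass for an NFS provisioner installed on top of kind.
# "example.com/nfs" is a placeholder; use the name your provisioner registers.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs
provisioner: example.com/nfs
---
# A ReadWriteMany claim against that class.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  storageClassName: nfs
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi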

@joshatcaper
Author

@BenTheElder ah, didn't know NFS required a newer kernel in this instance. Would it be possible to do something similar with docker volumes instead? The following docker-compose example should back the containers with a shared volume that is consistent-ish:

version: "2.3"
services:
  control-plane0:
    image: k8s.gcr.io/pause
    volumes:
      - rwmpvc:/rwmpvc
  worker0:
    image: k8s.gcr.io/pause
    volumes:
      - rwmpvc:/rwmpvc
  worker1:
    image: k8s.gcr.io/pause
    volumes:
      - rwmpvc:/rwmpvc

volumes:
  rwmpvc:

The output of docker-compose up && docker container inspect <container> will show:

        ...
        "Mounts": [
            {
                "Type": "volume",
                "Name": "test_rwmpvc",
                "Source": "/var/lib/docker/volumes/test_rwmpvc/_data",
                "Destination": "/rwmpvc",
                "Driver": "local",
                "Mode": "rw",
                "RW": true,
                "Propagation": ""
            }
        ],
        ...

Using an approach like this would not require any NFS server to be run internally in the containers. The PV provisioner just needs to consistently derive the host path in the same way, e.g. /rwmpvc/<uuid>, on each host.
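
For illustration, the "same directory on every node" part can already be expressed with kind's extraMounts by pointing each node at one host directory. This is only a sketch of the mount wiring (the host path is arbitrary, and a provisioner that hands out subdirectories of /rwmpvc would still be needed):

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraMounts:
  - hostPath: /tmp/kind-rwm    # assumption: any shared directory on the host
    containerPath: /rwmpvc
- role: worker
  extraMounts:
  - hostPath: /tmp/kind-rwm
    containerPath: /rwmpvc
- role: worker
  extraMounts:
  - hostPath: /tmp/kind-rwm
    containerPath: /rwmpvc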

@BenTheElder
Member

This will work in backends where the nodes are all on a single machine (which we may not guarantee in the future) IF we write a custom provisioner.

IMHO it's better to just provide an opt-in NFS solution you can deploy and document it.

It should just be a kubectl apply away from installing an NFS provisioner as long as you have an updated kernel.

@msau42

msau42 commented Apr 21, 2020

Agree, I think an opt-in NFS tutorial would be the best option here for users that need it.

We don't have any great options from the sig-storage perspective; most solutions already assume you have an NFS server set up somewhere.

  • nfs external provisioner: this repo is deprecated and in the process of being migrated to its own repo. This uses ganesha to provision nfs servers, but still requires some sort of stable disk to back it.
  • nfs-client external provisioner: this repo is deprecated and in the process of being migrated to its own repo. This takes an existing nfs share and carves out subdirectories from it as PVs.
  • nfs csi driver: currently does not support dynamic provisioning, but there are plans to add in an nfs-client-like provisioner in the near future. can potentially add snapshots support in the future too.
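
For reference, once an NFS server exists somewhere, consuming an export needs nothing kind-specific. A minimal sketch using the in-tree nfs volume type, assuming the nodes can mount NFS; the server address and path below are placeholders:

apiVersion: v1
kind: Pod
metadata:
  name: nfs-example
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "sleep 3600"]
    volumeMounts:
    - name: shared
      mountPath: /data
  volumes:
  - name: shared
    nfs:
      server: 10.0.0.10   # placeholder: address of your NFS server
      path: /exports      # placeholder: exported path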

@joshatcaper
Author

I don't know if this is possible but would there be some way to abstract this from the end user using some method of packaging and enabling "addons" similar to minikube? I don't know about the long term goals of kind but from an outsider's perspective it seems like a wonderful way to deploy an ephemeral copy of software in a CI stage. I was investigating it as a method to run some end-to-end integration testing on my company's software. I'd really like it if the configurations I end up applying to the created cluster very closely match what I'd push to a real cluster; otherwise I'd be worried about running into the same issues you hit when you build a "dev" and "production" version of a binary and only test against your "dev" builds, never your production build.

I don't know if addons are a clean way of accomplishing this goal but I think the utility of kind for the in-CI-deployment workflow would greatly be helped by something that completely hides that this isn't a real managed kube cluster from the end user. Obviously, though, having some way to do this is better than having no way of doing this.

Interested in your thoughts.

@BenTheElder
Member

I don't know if this is possible but would there be some way to abstract this from the end user using some method of packaging and enabling "addons" similar to minikube?

Hi, regarding addons: we're not bundling addons at this time.

That approach tends to be problematic for users as it couples the lifecycle of the addons to the version of the cluster tool.

SIG Cluster Lifecycle seems to agree, and the future of addon work there seems to be the cluster addons project, which involves a generic system on top of any cluster. We're tracking that work and are happy to integrate when it's ready (#253).

In the meantime addons tend to not be any different from any other cluster workload, they can be managed with kubectl, helm, kustomize, kpt, etc.

For an example of a more involved "addon" that isn't actually bundled with kind config dependencies see https://kind.sigs.k8s.io/docs/user/ingress/

I don't know about the long term goals of kind but from an outsider's perspective it seems like a wonderful way to deploy an ephemeral copy of software in a CI stage.

This gives a rough idea where our priorities are at, which do include supporting this more or less
https://kind.sigs.k8s.io/docs/contributing/project-scope/

I was investigating it as a method to run some end-to-end integration testing on my company's software. I'd really like it if the configurations I end up applying to the created cluster very closely match what I'd push to a real cluster; otherwise I'd be worried about running into the same issues you hit when you build a "dev" and "production" version of a binary and only test against your "dev" builds, never your production build.

We have a KubeCon talk about this: https://kind.sigs.k8s.io/docs/user/resources/#testing-your-k8s-apps-with-kind--benjamin-elder--james-munnelly

I don't know if addons are a clean way of accomplishing this goal but I think the utility of kind for the in-CI-deployment workflow would greatly be helped by something that completely hides that this isn't a real managed kube cluster from the end user. Obviously, though, having some way to do this is better than having no way of doing this.

Clusters have a standard API in KUBECONFIG and the API endpoint.
Unfortunately, for portability reasons we can't quite hide that this isn't the same as your real cluster; a lot of extension points break down here, including but not limited to:

  • ingress
  • loadbalancer
  • storage classes (nonstandard ones in your prod environment, k8s really only has default as something of a standard)

For these you'll want to provide your own wrapper of some sort to ensure that the kind cluster matches your prod more closely (e.g. mimicking the custom storage classes from your prod cluster, trying to run a similar or the same ingress..)
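
For example, one low-effort way to mimic a prod storage class in kind is to create a class with your prod class name backed by kind's built-in local-path provisioner, so the same PVC manifests apply unchanged. This is a sketch (it still only gives ReadWriteOnce semantics, and "managed-premium" is just an assumed example name):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-premium   # assumption: whatever class name your prod cluster uses
provisioner: rancher.io/local-path
volumeBindingMode: WaitForFirstConsumer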

@BenTheElder BenTheElder added the priority/backlog label Apr 25, 2020
@BenTheElder
Member

nfs-common will be installed on the nodes going forward, which should enable NFS volumes. You still need to run an NFS server somehow.

@BenTheElder
Member

(also confirmed that it works, the kubernetes NFS e2e tests pass)


@danquah

danquah commented Jun 16, 2020

Just did a verification of this feature.

I first made sure kubernetes was cloned to ${GOPATH}/src/k8s.io/kubernetes as described in https://kind.sigs.k8s.io/docs/user/working-offline/#prepare-kubernetes-source-code

I then built my own node-image using the latest base-image with nfs-common via the following (takes a while!)

kind build node-image --image kindest/node:master --base-image kindest/base:v20200610-99eb0617 --kube-root "${GOPATH}/src/k8s.io/kubernetes"

Next I created a cluster using the new node image via

kind create cluster --config kind-config.yaml

Using the following kind-config.yaml

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  image: kindest/node:master

I then pulled and loaded the nfs-provisioner image to prepare for installation

docker pull quay.io/kubernetes_incubator/nfs-provisioner
kind load docker-image quay.io/kubernetes_incubator/nfs-provisioner

The provisioner could then be installed via Helm (Helm was installed separately).

helm repo add stable https://kubernetes-charts.storage.googleapis.com/
helm install nfs-provisioner stable/nfs-server-provisioner 

And I was then finally able to provision an NFS volume via the following PVC

---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-dynamic-volume-claim
spec:
  storageClassName: "nfs"
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Mi

Everything worked like a charm - looking forward to the next Kind release :)
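
A quick way to sanity-check the claim is to mount it from a pod and write to it; with multiple workers, two such pods on different nodes should see the same files. A minimal sketch against the PVC above:

apiVersion: v1
kind: Pod
metadata:
  name: rwx-check
spec:
  containers:
  - name: writer
    image: busybox
    command: ["sh", "-c", "echo hello from $(hostname) >> /mnt/shared/hello.txt && sleep 3600"]
    volumeMounts:
    - name: shared
      mountPath: /mnt/shared
  volumes:
  - name: shared
    persistentVolumeClaim:
      claimName: test-dynamic-volume-claim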

@ayoubfaouzi

Nice! I am currently looking for this. When will this be released?

@BenTheElder
Member

BenTheElder commented Jul 25, 2020 via email

@danquah

danquah commented Aug 5, 2020

@BenTheElder any updates on the new target date? Trying to determine whether to base some internal setup on our own build of kind or whether there will be a release in the near future we can use instead.

@BenTheElder
Member

Sorry I missed this comment (sweeping issues now). v0.9.0 was re-scheduled to match k8s v1.19, but some last-minute fixes are still pending, so we didn't cut the release today (k8s did). I expect to have those merged by tomorrow.

@koxu1996

This is a side note, but it might be useful for someone. When I updated the node image from 1.18.8 to 1.19.1, the NFS Helm chart stopped working properly: memory fills up in a few seconds. I investigated the problem and it seems rpc.statd from the nfs-utils package is outdated and is leaking memory.

@BenTheElder
Member

that's unfortunate. we're shipping the latest available in the distro at the moment (ubuntu 20.10); if it's fixed in ubuntu we'll pick it up in a future kind image.

@koxu1996

koxu1996 commented Sep 18, 2020

@BenTheElder Now I think it might be something different. That's how I reproduce the issue:

$ kind create cluster --image [NODE_IMAGE]
$ helm install stable/nfs-server-provisioner --generate-name
# wait 30s until 100% memory is filled up

The issue is present when I use the most recent node images:

  • kindest/node:v1.19.1 (98cf52888646) (originally posted as v1.19.0, corrected below)
  • kindest/node:v1.18.8 (f4bcc97a0ad6)

List of node images that work without problems:

  • kindest/node:v1.18.8 (I don't know the digest, but it was a version more than 4 days old)
  • kindest/node:v1.18.6

Note: I tried building the latest node image from kind v0.9.0 sources and it works fine 😕

@BenTheElder
Member

1.19.0 isn't the latest image (please see the kind release notes as usual), and all of the images that are current were built with the same version; there were no changes to the base image or node image build process between those builds and tagging the release.

@koxu1996

@BenTheElder Sorry, I pasted the correct digest 98cf52888646 but the wrong (lower) version; it should be the latest, v1.19.1:

$ docker pull kindest/node:v1.19.1
v1.19.1: Pulling from kindest/node
Digest: sha256:98cf5288864662e37115e362b23e4369c8c4a408f99cbc06e58ac30ddc721600
Status: Image is up to date for kindest/node:v1.19.1
docker.io/kindest/node:v1.19.1

So the issue is present for the latest node image. I am trying to track down what changed in the latest node image update.

@aojea
Contributor

aojea commented Sep 19, 2020

I'm almost sure it is because of this
#1799

but I keep thinking that it is an NFS bug 😄
#760 (comment)

@koxu1996 you should limit the file descriptors at the OS level

@koxu1996

koxu1996 commented Sep 19, 2020

@aojea Indeed, I bisected KinD commits and this is the culprit: 2f17d25.

I use Arch BTW 😆 and the kernel limit on file descriptors is really high:

$ sudo sysctl -a | grep "fs.nr_open"
fs.nr_open = 1073841816

To work around the NFS issue you can change kernel-level limits, e.g.

sudo sysctl -w fs.nr_open=1048576

or you could use a custom node image.

Edit:

I asked the nfs-utils maintainer about this bug and got the following reply:

This was fixed by the following libtirpc commit:

commit e7c34df8f57331063b9d795812c62cec3ddfbc17 (tag: libtirpc-1-2-7-rc3)
Author: Jaime Caamano Ruiz [email protected]
Date: Tue Jun 16 13:00:52 2020 -0400

libtirpc: replace array with list for per-fd locks

Which is in the latest RC release libtirpc-1-2-7-rc4

@BenTheElder
Member

looks like the fixed libtirpc is not packaged yet. I'm not sure how we want to proceed here

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale label Dec 24, 2020
@k8s-ci-robot k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Jan 23, 2021
@BenTheElder BenTheElder removed the lifecycle/rotten label Feb 6, 2021
@kubernetes-sigs kubernetes-sigs deleted a comment from fejta-bot Feb 6, 2021
@kubernetes-sigs kubernetes-sigs deleted a comment from fejta-bot Feb 6, 2021
@BenTheElder
Member

I think we should try to make sure libtirpc is updated and document how to set up an NFS provisioner. I'm not sure if this is in scope to have in the default setup, but it's certainly in scope to put a guide on the site.

@BenTheElder BenTheElder added the lifecycle/frozen label Feb 6, 2021
@aojea aojea added the kind/documentation and help wanted labels Feb 6, 2021
@backtrackshubham

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-dynamic-volume-claim
spec:
  storageClassName: "nfs"
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Mi

This fails with an error saying:

Finished building Kubernetes
Building node image ...
Building in kind-build-1623416466-881282865
Image build Failed! Failed to pull Images: command "docker exec --privileged kind-build-1623416466-881282865 cat /etc/containerd/config.toml" failed with error: exit status 1
ERROR: error building node image: command "docker exec --privileged kind-build-1623416466-881282865 cat /etc/containerd/config.toml" failed with error: exit status 1
Command Output: cat: /etc/containerd/config.toml: No such file or directory

@BenTheElder
Member

well, first of all you should not need to build new node images; we've had multiple releases since #1487 (comment), and they already contain all of the changes.

... and the reason that's failing is that the base image specified in the command in that comment is very outdated versus current kind. You can skip all the image building steps; NFS should just work now, and we run NFS tests in CI. There are no changes to kind needed, just the cluster objects installed at runtime for your NFS service / PVs.

@backtrackshubham

Hey @BenTheElder, thanks for the comment. When I tried using the storage class nfs, the PVC went into the Pending state; describing the PVC showed that it doesn't have the storage class "nfs". I understand that you have suggested running an NFS server somewhere, but my question is: in the current version of kind, can we create (after setting up my NFS server) PVCs with access mode ReadWriteMany? I went through the issues to find something on this but was not able to. Any help or suggestions would be wonderful.

@BenTheElder
Member

When I tried using the storage class nfs, the PVC went into the Pending state; describing the PVC showed that it doesn't have the storage class "nfs"

yes, we don't have the storage class because that has to refer to a specific NFS setup, and that's something you can choose and install at runtime

I understand that you have suggested running an NFS server somewhere,

yes, #1487 (comment) starting from "I then pulled and loaded the nfs-provisioner image to prepare for installation" is still relevant as one approach. The part before that with the custom image is not.

but my question is: in the current version of kind, can we create (after setting up my NFS server) PVCs with access mode ReadWriteMany?

Yes, in any version NFS has ReadWriteMany; it's just that NFS could not work in a nested container environment when the project was started (issues in the linux kernel actually, not in kind itself). It can now. (see also: #1806)

I don't specifically work with this, but NFS in kind is not special (versus another cluster tool) anymore.

We just need someone to document doing this.

@backtrackshubham

Thanks, I am still in a phase of understanding and learning Kubernetes. Many thanks to the devs and contributors of kind; I will see how to do it. Thanks! 😃

@backtrackshubham

Thanks, I am still in a phase of understanding and learning Kubernetes. Many thanks to the devs and contributors of kind; I will see how to do it. Thanks! 😃

Hi @BenTheElder, thanks for all the guidance and ideas. I was successfully able to deploy an NFS server with mode RWM using the steps that you and other devs indicated on a Linux system, but now that I am trying to move the same setup to a Mac (Docker Desktop), I can see that the pod for the nfs-provisioner is failing with (upon describing):

Warning  FailedScheduling  33m (x2 over 33m)   default-scheduler  0/3 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 2 node(s) had taint {node.kubernetes.io/not-ready: }, that the pod didn't tolerate.

And then it eventually gets into a crash loop. I found this answer suggesting some change, but I would like to understand what exactly has changed between the two systems. Could it be because of the resources? On the Linux system the kind cluster was flying with 24 GB RAM, but here on the Mac it's 6 CPUs, 4 GB memory, 2 GB swap and a 200 GB HDD.

Thanks

@BenTheElder
Member

BenTheElder commented Jul 7, 2021

You should also consider running fewer nodes; kind tries to be as light as possible, but kubeadm recommends something like 2GB per node for a more typical cluster IIRC 😅

Kubernetes does not yet use swap effectively, and actually officially requires it to be off, though we set an option to make it run anyhow.

node.kubernetes.io/not-ready is not a taint you should have to remove, and kind in general should never require you to manually remove taints; this means the nodes are not healthy (which is a very general symptom).

EDIT: If you need more help with that please file a different issue for your case since it's not related to RWM PVs, so folks monitoring this can avoid being notified spuriously, and so we can track your issue directly. We can cross link them for reference. The new issue template also requests useful information for debugging.

@meln5674

On the off chance anyone is still watching this: local-path-provisioner has supported RWX volumes for a few releases now, and as of v0.0.27 it supports multiple storage classes with a single deployment.

Unless I've overlooked something, I think it should be reasonable to automatically create a RWX storage class for single-node clusters. To support multi-node clusters, that could be accomplished by mounting the same host volume to the same location in each node container, and that could be provided by a new field in the configuration. This would even support future multi-host setups if the user is made responsible for mounting network storage at that location on each host out-of-band.

I would be happy to start work on a PR for this if the idea isn't rejected out of hand.

@mosesdd

mosesdd commented Jul 5, 2024

@meln5674 I created a workaround in my environment for this:

kubectl -n local-path-storage patch configmap local-path-config -p '{"data": {"config.json": "{\n\"sharedFileSystemPath\": \"/var/local-path-provisioner\"\n}"}}'

As long as you use a single-node configuration, this works totally fine.

See https://github.com/rancher/local-path-provisioner?tab=readme-ov-file#definition for details
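
With that patch applied (and once the provisioner has picked up the new config, which may require waiting a bit or restarting its pod), a ReadWriteMany claim against kind's default standard class along these lines should bind on a single-node cluster. This is only a sketch, not something verified in this thread:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rwx-local-path
spec:
  storageClassName: standard   # kind's default local-path-backed class
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Mi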
