loss of data in k8s persistent volumes when used in k8sm setup #798
First thoughts about reasonable defaults (because executors really [...]): rootDir={sandbox}/root. And then if you want to override rootDir to point to some location on the [...]. We're actually thinking through a related problem right now with respect to [...]. A better solution might come in the form of a custom k8sm runtime [...]. Another solution might be to write a custom mesos isolator module that adds [...].

On Tue, Mar 29, 2016 at 11:02 PM, ravilr wrote:
@jdef
The k8s kubelet sets up volume mounts in a directory configured by the --root-dir flag and bind-mounts them into docker containers. The kubelet also runs Kubelet.cleanupOrphanedVolumes() in its sync loop to clean up/unmount any volume mounts left behind on the kubelet host by killed/finished pods.
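To make that concrete, here is a rough sketch of the layout involved (the pod UID, volume name, and container image are made up; /var/lib/kubelet is simply the kubelet's default root dir):

```sh
# Illustrative only: where the kubelet keeps a pod's NFS persistent-volume mount
# under its root dir, and roughly how it ends up inside the container.
ROOT_DIR=/var/lib/kubelet
POD_UID=11111111-2222-3333-4444-555555555555   # hypothetical pod UID

# The kubelet mounts the NFS export here before starting the pod's containers:
ls "$ROOT_DIR/pods/$POD_UID/volumes/kubernetes.io~nfs/my-nfs-pv"

# The container then sees it via a docker bind mount, roughly equivalent to:
docker run -v "$ROOT_DIR/pods/$POD_UID/volumes/kubernetes.io~nfs/my-nfs-pv:/data" my-app
```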
In the k8sm case, the kubelet's RootDirectory is set to the mesos executor's sandbox dir. This can be overridden using the --kubelet-root-dir flag on the scheduler, but that doesn't work, because the executor uses the same dir to set up its static-pod-config dir and fails to come up with an error if it finds an already existing static-pods dir.
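For illustration, here is roughly what that override looks like and where it conflicts (the km scheduler invocation is trimmed to the relevant flag, and the exact static-pods subdirectory name is an assumption):

```sh
# Point the kubelet root dir at a stable host path instead of the per-executor
# sandbox. Other required scheduler flags are omitted.
km scheduler --kubelet-root-dir=/var/lib/kubelet

# The conflict: the executor also drops its static-pod config under that same path,
# so after a restart the new executor finds the leftover directory and errors out.
ls /var/lib/kubelet/static-pods
```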
The issue we are seeing with the kubelet executor using the sandbox dir itself as the kubelet root-dir is this: whenever the executor id (or slave id) changes due to an executor restart, slave restart, or framework upgrade, the kubelet doesn't get a chance to properly clean up the orphaned volumes mounted on the host. In the case of persistent volumes, the kubelet's pod volume dirs are still pointing at mounted filesystems. Then the mesos slave's gc_delay setting kicks in and tries to clean up the old executors' sandbox dirs, which leads to rm'ing of the persistent volume dirs. The end result: all data backed by the persistent volumes is gone.
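Roughly, the sequence looks like this (the sandbox path follows the usual mesos work-dir layout; the ids and NFS export are made up):

```sh
# 1. The old executor's sandbox still contains a live NFS mount from a pod volume.
SLAVE_ID=S0; FW_ID=F0; EXEC_ID=E0; CONTAINER_ID=C0     # hypothetical ids
SANDBOX=/var/lib/mesos/slaves/$SLAVE_ID/frameworks/$FW_ID/executors/$EXEC_ID/runs/$CONTAINER_ID
mount | grep "$SANDBOX"
# something like:
#   nfs-server:/export on $SANDBOX/pods/<pod-uid>/volumes/kubernetes.io~nfs/<pv> type nfs (...)

# 2. Once --gc_delay expires, the mesos slave garbage-collects the old sandbox.
#    Recursively removing the tree descends into the still-mounted NFS directory,
#    which deletes the data on the NFS server itself -- roughly:
rm -rf "$SANDBOX"
```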
I think the static-pods dir should use the mesos sandbox dir instead of kubelet.RootDirectory; then one could set --kubelet-root-dir to a static path on the slave host. There is still no guarantee that a slave gets assigned a kubelet executor task again, which means the kubelet volume dirs might be left mounted forever, but at least they won't be deleted inadvertently by the mesos-slave gc.
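With a static root dir, those leftover mounts could at least be cleaned up out of band without losing data; a minimal sketch, assuming the default /var/lib/kubelet layout:

```sh
KUBELET_ROOT=/var/lib/kubelet

# Volume mounts left behind by pods whose kubelet executor never came back:
mount | grep "$KUBELET_ROOT/pods/"

# Unmounting detaches the NFS export without touching the data on the server,
# unlike the sandbox gc's recursive delete:
mount | awk -v root="$KUBELET_ROOT/pods/" '$3 ~ root {print $3}' | xargs -r umount
```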
We are experiencing this in our k8sm cluster, which uses NFS-backed persistent volumes.