Pod failed to mount SparseLoopDevice Volume after node reboot #2726
Comments
The "Update pillar after volume provisioning" step should be outside of the for loop because: - it's ID is not customized per-volume (duplicate ID) - we only need to refresh once after deploying all the volumes Closes: #2726 Closes: #2729 Signed-off-by: Sylvain Laperche <[email protected]>
The "Update pillar after volume provisioning" step should be outside of the for loop because: - it's ID is not customized per-volume (duplicate ID) - we only need to refresh once after deploying all the volumes Refs: #2726 Closes: #2729 Signed-off-by: Sylvain Laperche <[email protected]>
In our snapshot upgrade state we start from an already-installed machine that has the issue with sparse loop devices, so add a workaround for the #2726 issue.
This can be worked around by re-applying the Salt state responsible for creating the loop devices. This is supposed to happen at node reboot (via the salt-minion startup state), but it may not run successfully, e.g. when kube-apiserver is not yet available; this is a nuisance, at least. As discussed on Slack, I see two ways to 'fix' this: either make the Salt-based provisioning at boot more reliable, or delegate the management of loop devices to systemd units.
Given the second approach, here's an outline of how this can be deployed:
(Note: this is slightly different from the current implementation in the `metalk8s_volumes` module.) After creating the sparse file, the unit attaches it to a loop device. Similarly, removing the volume implies detaching the loop device and removing the backing file.
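To illustrate the outline above, here is a hypothetical sketch of such a unit template (the unit name and helper script paths are made up, not those of the actual implementation):

```ini
# metalk8s-sparse-volume@.service  (hypothetical name)
# One instance per volume, with the Volume UUID passed as the instance name (%i).
[Unit]
Description=Provision sparse loop device for volume %i
After=local-fs.target

[Service]
Type=oneshot
RemainAfterExit=true
# Hypothetical helper scripts: create/attach the device on start, detach on stop.
ExecStart=/usr/local/libexec/sparse-volume-attach %i
ExecStop=/usr/local/libexec/sparse-volume-detach %i

[Install]
WantedBy=multi-user.target
```

Enabling `metalk8s-sparse-volume@<volume-uuid>.service` would then make systemd re-attach the device at boot, without depending on the Salt minion or on kube-apiserver being reachable.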
FWIW, I did not test what happens when you …
Thanks. Indeed, this could imply we need two unit templates: one for 'file' volumes and one for 'block' volumes. Alternatively, we could have some script which can handle both 'file' and 'block' volumes somehow, drop it in place, and call it from a single unit template.
Yup, I can confirm that …
Confirmed.
Script design TBD of course: likely first figuring out the device ID 'read-only', then finally performing the attach.
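A minimal sketch of what such a script could look like for a sparse-file volume, assuming a hypothetical backing-file location and lock path (this is not the MetalK8s implementation):

```bash
#!/bin/bash
# Hypothetical sketch: attach the sparse file backing volume "$1" to a loop
# device, idempotently.
set -euo pipefail

uuid="$1"
backing_file="/var/lib/metalk8s/storage/sparse/${uuid}"   # hypothetical location

# 1. Read-only check first: is the file already attached to a loop device?
device=$(losetup --associated "${backing_file}" | cut -d: -f1)
if [ -n "${device}" ]; then
    echo "${backing_file} already attached on ${device}"
    exit 0
fi

# 2. Otherwise attach it, serializing `losetup --find` calls with an exclusive
#    lock to work around the non-atomic allocation in older util-linux.
(
    flock --exclusive 9
    losetup --find --partscan "${backing_file}"
) 9> /run/sparse-volume-losetup.lock
```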
The previous approach was relying entirely on Salt (through our custom `metalk8s_volumes` execution/state module) to manage provisioning and cleanup of loop devices (on sparse files). The issues arise when (re)booting a node:
- the Salt minion has to execute a startup state, which may come in when kube-apiserver is not (yet) available
- `losetup --find` calls, with the version available for CentOS 7, are not atomic, so we may lose some devices if we have too many (re-running the state manually is the current workaround)

To fix these, we introduce two main changes:
- management of loop devices is delegated to systemd; using a unit template file, we define a service per volume (passing the Volume's UUID as a parameter), which will provision and attach the device on start, and detach it on stop
- `losetup --find` invocations from these units are made sequential by using an exclusive lockfile (this is not perfect, but avoids the need to re-implement the command ourselves to include the fix from losetup 2.25 and higher: see util-linux/util-linux@f7e21185)

Note that we cannot simply use `losetup --detach` in the unit template, because sparse volumes may not always be discovered from the same path:
- either the volume is formatted, and we can find it in /dev/disk/by-uuid/
- or it's kept raw, so we only have the UUID in the partition table, and we can discover the device through /dev/disk/by-partuuid/

Fixes: #2726
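For illustration of that last point (a hedged sketch with hypothetical variable names, not the code from the change itself), the detach side has to resolve the loop device through whichever symlink exists for the volume UUID:

```bash
# Hypothetical sketch: find the loop device backing a volume UUID, whether the
# sparse file was formatted (by-uuid) or kept raw with only a partition UUID.
uuid="$1"
if [ -e "/dev/disk/by-uuid/${uuid}" ]; then
    device=$(readlink -f "/dev/disk/by-uuid/${uuid}")
elif [ -e "/dev/disk/by-partuuid/${uuid}" ]; then
    # by-partuuid points at the partition (e.g. /dev/loop0p1);
    # detach must target the parent loop device (e.g. /dev/loop0).
    partition=$(readlink -f "/dev/disk/by-partuuid/${uuid}")
    device="/dev/$(lsblk --noheadings --output PKNAME "${partition}")"
else
    echo "no device found for volume ${uuid}" >&2
    exit 1
fi
losetup --detach "${device}"
```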
We adjust the `lint-python` tox environment to:
- handle Python 2 and Python 3 files (we only run mypy on the latter)
- always run both pylint and mypy, and report exit codes for both in case of failure

See: #2726
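A hedged sketch of the behaviour described above (not the actual tox configuration or repository script; `PY2_FILES` and `PY3_FILES` are placeholder file lists): run both linters unconditionally, then report both exit codes and fail if either failed.

```bash
#!/bin/bash
# Hypothetical wrapper: always run pylint (on Python 2 and 3 files) and mypy
# (on Python 3 files only), then report both exit codes.
pylint ${PY2_FILES} ${PY3_FILES}
pylint_rc=$?

mypy ${PY3_FILES}
mypy_rc=$?

echo "pylint exit code: ${pylint_rc}"
echo "mypy exit code:   ${mypy_rc}"

# Fail the lint environment if either tool reported errors.
[ "${pylint_rc}" -eq 0 ] && [ "${mypy_rc}" -eq 0 ]
```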
Component:
'containers', 'kubernetes', 'volumes'
What happened:
When provisioning a Volume using the SparseLoopDevice mode, a pod consuming such a volume may fail to start after a node reboot, because it is unable to mount the volume: its path on disk may no longer exist:
Warning FailedMount 90s (x23 over 32m) kubelet, master-1 MountVolume.NewMounter initialization failed for volume "master-1-prometheus" : path "/dev/disk/by-uuid/183dd142-9c46-411f-b160-d99bbd03f6c9" does not exist
Workaround: restart salt-minion on all nodes where Volumes were created using the SparseLoopDevice mode.
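For reference, on a node where salt-minion is managed by systemd (as on MetalK8s nodes), the workaround amounts to:

```bash
# Restarting the minion re-triggers the Salt startup state that re-creates
# the missing loop devices.
sudo systemctl restart salt-minion
```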
What was expected:
Since this is a persistent Volume, we expect its path on disk to persist, so that the pod can mount it after a reboot and retrieve the data stored under it.
Steps to reproduce:
Resolution proposal (optional):