Adjust update strategy for NNF DaemonSets #118
We need to consider the same for lustre-csi-driver. There might be other areas for this as well.
@behlendorf, I'm making changes to set this to a sane default for our 3 daemonsets:
This value will be adjustable for each system.
@bdevcich as an initial swag, how about 25%? That seems like a reasonable compromise between propagating the changes rapidly and potentially overwhelming the system, the container repository, or other infrastructure. Then we can tune on a per-system basis and revisit the default as needed.
Perfect, that's the percentage that I've been playing around with in my testing.
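On the per-system side, an override like this could be applied with `kubectl patch` — a minimal sketch, assuming a DaemonSet that is managed directly (the namespace and DaemonSet name are placeholders; as noted in a later comment, nnf-dm's DaemonSet is adjusted through its NnfDataMovementManager resource instead):

```sh
# Sketch only: override the rolling-update budget on one system.
# <nnf-namespace> and <nnf-daemonset> are placeholders, not real names.
kubectl -n <nnf-namespace> patch daemonset <nnf-daemonset> \
  --type merge \
  -p '{"spec":{"updateStrategy":{"rollingUpdate":{"maxUnavailable":"25%"}}}}'
```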
PRs here to default these all to 25%:
For nnf-dm, the NnfDataMovementManager resource is edited rather than the DaemonSet directly. The manager is responsible for managing the DaemonSet.
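For reference, the change those PRs introduce presumably amounts to a stanza along these lines in each DaemonSet spec (a sketch based on this thread, not the actual manifests; for nnf-dm the equivalent setting flows through the NnfDataMovementManager spec):

```yaml
# Illustrative DaemonSet update strategy with the 25% default discussed above.
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      # Let up to a quarter of the worker pods restart at once instead of
      # the one-at-a-time default.
      maxUnavailable: 25%
```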
@behlendorf I'm comfortable closing this issue now that we've implemented this manually on El Cap today. Do you agree?
Yup, things are looking much better after these changes.
From @behlendorf:

When you add/remove a `lustrefilesystem` resource, the `nnf-dm-manager-controller` sees that and then adds/removes a `Volume` and `VolumeMount` to the `nnf-dm-worker` `DaemonSet`. Kubernetes then handles it from there and restarts the `nnf-dm-worker` pods on each rabbit to mount/umount that filesystem change. The `DaemonSet` defines this for the `updateStrategy`:

That `maxUnavailable: 1` is what is causing the sequential behavior. We'll need to tweak this.
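The updateStrategy snippet referenced above was lost from this copy of the issue; a stock DaemonSet with the behavior described would carry roughly the following (a reconstruction — only `maxUnavailable: 1` is confirmed by the comment):

```yaml
# Reconstructed sketch of the original nnf-dm-worker update strategy;
# only maxUnavailable: 1 is stated in the comment above.
updateStrategy:
  type: RollingUpdate
  rollingUpdate:
    # One pod at a time: each rabbit's nnf-dm-worker pod restarts sequentially.
    maxUnavailable: 1
```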