
Adjust update strategy for NNF DaemonSets #118



bdevcich commented Jan 11, 2024

From @behlendorf:

It took a significant amount of time for it to kill and restart all the pods when I added in the merced filesystem since it ran through it sequentially. Thankfully that's a one time thing, but it seems it will make redeploying slow.

When you add/remove a lustrefilesystem resource, the nnf-dm-manager-controller sees that and then adds/removes a Volume and VolumeMount to the nnf-dm-worker DaemonSet. Kubernetes then handles it from there and restarts the nnf-dm-worker pods on each rabbit to mount/umount that filesystem change. The DaemonSet defines this for the updateStrategy:

    updateStrategy:
      rollingUpdate:
        maxSurge: 0
        maxUnavailable: 1
      type: RollingUpdate

That maxUnavailable: 1 setting is what causes the sequential behavior. We'll need to tweak this.
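A minimal sketch of the tweak under discussion, assuming a percentage is used (the exact value is settled later in this thread): with a percentage, Kubernetes can take several pods down per rollout round instead of one.

```yaml
# Sketch only: the same DaemonSet updateStrategy, with maxUnavailable
# expressed as a percentage so multiple rabbit nodes restart in parallel.
updateStrategy:
  rollingUpdate:
    maxSurge: 0
    maxUnavailable: 25%   # illustrative value, tunable per system
  type: RollingUpdate
```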

@bdevcich (Contributor Author)

We need to consider the same for lustre-csi-driver. There may be other areas where this applies as well.

@bdevcich (Contributor Author)

@behlendorf, I'm making changes to set this to a sane default for our 3 daemonsets:

  • nnf-dm-worker
  • lustre-csi-node
  • nnf-node-manager

maxUnavailable can be set to a number of nodes/pods or to a percentage. Setting it to 100% would attempt a restart on all the nodes at the same time; 50% would do half, 25% a quarter, and so on. Do you have a preference on what the percentage (or hard number) should be?

This value will be adjustable for each system.
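For intuition on how a percentage maps to pods, here is a small sketch of the rounding involved. For DaemonSet rolling updates, the absolute number is computed from the percentage by rounding up; the node counts below are assumptions for illustration only.

```python
import math

def parallel_restarts(node_count: int, max_unavailable: str) -> int:
    """Pods a DaemonSet rolling update may restart at once.

    Sketch of the percentage math only: a percentage is converted to an
    absolute pod count by rounding up, and at least one pod is restarted.
    """
    if max_unavailable.endswith("%"):
        pct = int(max_unavailable.rstrip("%"))
        return max(1, math.ceil(node_count * pct / 100))
    return int(max_unavailable)

# With 16 rabbit nodes (an assumed count):
print(parallel_restarts(16, "1"))     # 1  -> sequential, one pod at a time
print(parallel_restarts(16, "25%"))   # 4  -> a quarter of the pods in parallel
```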

@behlendorf (Collaborator)

@bdevcich as an initial swag how about 25%. This seems like it may be a reasonable compromise between propagating the changes rapidly and potentially overwhelming the system / container repository / other. Then we can tune on a per system basis, and revisit the default as needed.

@bdevcich (Contributor Author)

> @bdevcich as an initial swag how about 25%. This seems like it may be a reasonable compromise between propagating the changes rapidly and potentially overwhelming the system / container repository / other. Then we can tune on a per system basis, and revisit the default as needed.

Perfect, that's the percentage that I've been playing around with in my testing.

@bdevcich bdevcich moved this from 📋 Open to 👀 In review in Issues Dashboard Feb 13, 2024
@bdevcich bdevcich changed the title Adjust update strategy for nnf-dm worker pods Adjust update strategy for NNF DaemonSets Feb 13, 2024
@bdevcich (Contributor Author)

bdevcich commented Feb 13, 2024

PRs here to default these all to 25%:

For nnf-dm, the NnfDataMovementManager resource is edited rather than the DaemonSet directly; the manager is responsible for managing the DaemonSet.

@bdevcich (Contributor Author)

@behlendorf I am comfortable closing this issue after we implemented this manually today on El Cap. Do you agree?

@behlendorf (Collaborator)

Yup, things are looking much better after these changes.
