Multiple SriovNetworkNodePolicy creation fails #230
Comments
Hi @e0ne, can you please check the sriov-network-config-daemon logs?
I believe we are essentially hitting a deadlock when worker1, after reboot, needs to drain the node while worker 2 is holding the drain lock (it is the leader and is waiting for the draining annotation to be removed from node1). Or, generally speaking, any case where
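To make the suspected deadlock concrete, here is a minimal toy model in Go. It is only a sketch and not the daemon's code: the `cluster` type, the `Idle` value, and the helper names are hypothetical, and it assumes that a node may proceed only when it holds the drain lock and no other node carries the `Draining` annotation.

```go
package main

// Annotation values for the toy model. Only "Draining" appears in this
// thread; "Idle" is a hypothetical placeholder for "not draining".
const (
	idle     = "Idle"
	draining = "Draining"
)

// cluster is a toy model of the state discussed above: the drain
// annotation of each node and the current holder of the drain lock
// ("" when nobody holds it).
type cluster struct {
	annotations map[string]string
	lockHolder  string
}

// drainable reports whether no node other than self carries the
// Draining annotation (assumed condition, see the lead-in).
func (c cluster) drainable(self string) bool {
	for node, state := range c.annotations {
		if node != self && state == draining {
			return false
		}
	}
	return true
}

// canProgress reports whether node can move forward in this model:
// it must hold the drain lock and the cluster must be drainable for it.
func (c cluster) canProgress(node string) bool {
	if c.lockHolder != node {
		return false // still waiting for the lock
	}
	return c.drainable(node)
}

// deadlocked reports whether none of the waiting nodes can progress.
func (c cluster) deadlocked(waiting []string) bool {
	for _, node := range waiting {
		if c.canProgress(node) {
			return false
		}
	}
	return len(waiting) > 0
}
```

With worker1 annotated `Draining` and worker2 holding the lock, `deadlocked` returns true for both workers in this model: worker1 waits for the lock, and worker2 waits for worker1's annotation to clear.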
Once the drain operation has started, we don't need to acquire the drain lock for this node because the node already has the required annotation. It's safe to continue the node drain procedure without the lock. Closes: k8snetworkplumbingwg#230 Signed-off-by: Ivan Kolodyazhny <[email protected]>
Wondering why worker-2 is able to get the drain lock while worker-1 still holds
As far as I saw, there is no place in the daemon code that prevents it. As soon as
It gets the drainLock when dn.drainable is equal to true, but dn.drainable requires other nodes to not have
I see in daemon.go-L#803 that it's called within it will loop until it becomes drainable or the context is canceled, which never happens in this case. Am I missing something?
dn.drainable will be set once other nodes complete the drain, right? So it will eventually be able to proceed.
Yes, but other nodes will not be able to complete the drain if they also attempt to get the drain lock (get leadership). PR #232 addresses that by skipping the lock in the daemon
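The idea of skipping the lock can be sketched roughly as follows. This is an illustration under stated assumptions, not the change made in PR #232: `drainNode`, `acquireLock`, and `drain` are hypothetical names, and the node's own annotation is passed in as a plain string.

```go
package main

import "fmt"

// drainingAnnotation is the annotation value mentioned in this thread.
const drainingAnnotation = "Draining"

// drainNode is a hypothetical helper: it acquires the drain lock before
// draining, unless the node already carries the Draining annotation, in
// which case the drain was started earlier and the lock step is skipped.
func drainNode(ownAnnotation string, acquireLock, drain func() error) error {
	if ownAnnotation != drainingAnnotation {
		if err := acquireLock(); err != nil {
			return fmt.Errorf("acquire drain lock: %w", err)
		}
	}
	return drain()
}
```

In this sketch a node that rebooted mid-drain goes straight to `drain`, so it can finish and clear its annotation even while another node holds the lock.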
I've reproduced this issue on OCP.
Environment details:
Steps to reproduce:
Expected results:
All policies are applied
Actual results:
Only some of the policies are applied. Worker 2 is elected leader. Worker1 still has the 'Draining' annotation, so the config daemons do not proceed with any configuration.