-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Continue node drain after reboot #232
Continue node drain after reboot #232
Conversation
Since drain operation started we don't need to requires drain lock for this node because node already has required annotation. It's safe to continue node drain procedure without lock. Closes: k8snetworkplumbingwg#230 Signed-off-by: Ivan Kolodyazhny <[email protected]>
Thanks for your PR,
To skip the vendors CIs use one of:
|
Let's keep this PR in a WIP state because it requires additional testing, unit-tests implementation and most probably additional fixes |
Thanks for your PR,
To skip the vendors CIs use one of:
|
fdd02ab
to
8f52b7c
Compare
Thanks for your PR,
To skip the vendors CIs use one of:
|
I tested this patch on OCP. It works. |
if err := dn.completeDrain(); err != nil { | ||
glog.Errorf("nodeStateSyncHandler(): failed to complete draining: %v", err) | ||
return err | ||
} | ||
} else if !ok { | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we are now calling dn.annotateNode()
in the else clause even if annotation is already set. this will invoke an extra GET to k8s API.
this IMO should be called like before only if annotation is not set WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is only two places where we add 'Idle' annotation: the line above and 'completeDrain' function, so IMO it doesn't make sense to check for it.
Probably we need to refactor the part of code where we do annotations to make it more clear
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it just to save a call to k8s API in annotateNode()
Think of a case where nodeStateSyncHandler runs but (for some reason) it does not require drain and the sriov-node-state annotation is already set to idle. in this case we dont really need to call annotateNode.
not sure how often we might hit this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it, will fix it into 'annotateNode' function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will there be a corner case where dn.node is not the latest if not get from annotateNode call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will there be a corner case where dn.node is not the latest if not get from annotateNode call?
dn.node will get updated via informer. it will eventually be consistent (thats how config daemon is designed).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got it, will fix it into 'annotateNode' function
I think this else clause aims to add idle annotation if annotation does not exist in dn.node
. i prefer to keep this logic
(add a check : hasSriovNodeStateAnnot()
) or similar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will there be a corner case where dn.node is not the latest if not get from annotateNode call?
dn.node will get updated via informer.
I was thinking if there is delay between informer receive the node update event and dn.node be update
or a delay between node being update in api server and informer receive the node update event.
In the above case, dn.node is not the latest, which may result in inaccurate return from nodeHasAnnotation
? (just thinking the possibility, I didn't test myself or it may be easy to test)
it will eventually be consistent (thats how config daemon is designed).
If the nodeHasAnnotation
returns incorrectly, I don't see how it can resolve by itself, guess the node will still maintain its old annotation forever unless new policy or state event?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the nodeHasAnnotation returns incorrectly, I don't see how it can resolve by itself, guess the node will still maintain its old annotation forever unless new policy or state event?
i dont see how its different from what we had before, the change just moved the logic to a separate function.
your concern is valid theoretically, we have not encountered this. and its general for config daemon design.
/cc @SchSeba |
Thanks for your PR,
To skip the vendors CIs use one of:
|
Signed-off-by: Ivan Kolodyazhny <[email protected]>
687cbcb
to
031dcbc
Compare
Thanks for your PR,
To skip the vendors CIs use one of:
|
/test-all |
/lgtm |
Since drain operation started we don't need to requires
drain lock for this node because node already has required
annotation.
It's safe to continue node drain procedure without lock.
Closes: #230
Signed-off-by: Ivan Kolodyazhny [email protected]