internal/contour: only write status updates if we're the leader #1745
Merged
Conversation
Fixes projectcontour#1425
Fixes projectcontour#1385
Updates projectcontour#499

This PR threads the leader-elected signal through to contour.EventHandler, allowing it to skip writing status back to the API unless it is currently the leader.

This should fix projectcontour#1425 by removing the condition where several Contours would fight to update status. It updates projectcontour#499 by continuing to reduce the number of updates that Contour generates, and therefore must process.

This PR does create a condition where, during startup, no Contour may yet be the leader and the xDS tables reach steady state before anyone is elected. The status of an object would then be stale until the next update from the API server after leadership was established. To address this, a mechanism to force a rebuild of the DAG is added to the EventHandler and wired to election success.

Signed-off-by: Dave Cheney <[email protected]>
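For readers skimming the diff, here is a minimal Go sketch of the gating described above. The identifiers (IsLeader, isLeader, setStatus, forceRebuild) are assumptions for illustration, not the exact names in internal/contour; the real change threads the election signal through the existing plumbing.

```go
package contour

// EventHandler sketch: only the parts relevant to leader-gated status
// writes. IsLeader, isLeader, setStatus, and forceRebuild are assumed
// names for illustration.
type EventHandler struct {
	// IsLeader is closed when this Contour instance wins the election.
	IsLeader <-chan struct{}

	leader bool
}

// isLeader latches the election result: once the IsLeader channel is
// closed, this instance considers itself the leader from then on.
func (e *EventHandler) isLeader() bool {
	if !e.leader {
		select {
		case <-e.IsLeader:
			e.leader = true
			// Election success may arrive after the xDS tables have
			// reached steady state, so force a rebuild to flush status
			// that would otherwise stay stale.
			e.forceRebuild()
		default:
		}
	}
	return e.leader
}

// setStatus writes status for obj back to the API server, but only when
// this instance is the leader; followers skip the write so multiple
// Contours no longer fight over status updates.
func (e *EventHandler) setStatus(obj interface{}) {
	if !e.isLeader() {
		return
	}
	// ... issue the status update to the API server ...
}

// forceRebuild marks the DAG as dirty so the next pass rebuilds it and
// rewrites status (placeholder body for the sketch).
func (e *EventHandler) forceRebuild() {}
```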
youngnick approved these changes Oct 21, 2019
LGTM. Nice.
davecheney added a commit that referenced this pull request Oct 21, 2019
This PR addresses incorrect comments that I noticed during #1745. Something I noticed while doing this is that the holdoff logic is as follows:

1. When the EventHandler is created, e.last is set to time.Now. This causes the stream of updates from the informers to be held off for up to e.HoldoffDelay + e.HoldoffMaxDelay. This process may happen several times until all updates from the API server are processed.
2. Once the cluster settles down, the _next_ update from the API server will likely arrive more than e.HoldoffMaxDelay after e.last and will be processed _immediately_.
3. If another update arrives after 2. but before another e.HoldoffMaxDelay has passed, a pending timer will be started and the update processed after e.HoldoffDelay.

The long and the short of the current logic is that infrequent, O(seconds), updates will be processed immediately, not after e.HoldoffDelay. This was surprising to me, as it was not the logic that I thought I wrote when I last went through this function. That is not to say it is wrong; however, it does open the possibility that a large stream of updates (k apply ...) arriving after the cluster is stable will cause at least 2 updates, with the second being at least e.HoldoffDelay after the first.

Signed-off-by: Dave Cheney <[email protected]>
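To make points 1-3 concrete, here is a minimal Go sketch of the holdoff decision. The field names mirror the ones mentioned above (last, HoldoffDelay, HoldoffMaxDelay); the timer plumbing is an assumption for illustration, and the sketch omits the synchronization the real handler needs.

```go
package contour

import "time"

// EventHandler sketch: only the fields relevant to the holdoff decision.
type EventHandler struct {
	HoldoffDelay    time.Duration // coalescing window for bursts of updates
	HoldoffMaxDelay time.Duration // elapsed time beyond which updates go through immediately

	last    time.Time   // when the DAG was last rebuilt
	pending *time.Timer // non-nil while a deferred rebuild is queued
}

// holdoffExceeded reports whether more than HoldoffMaxDelay has elapsed
// since the last rebuild, meaning the next update should not be held off.
func (e *EventHandler) holdoffExceeded() bool {
	return time.Since(e.last) > e.HoldoffMaxDelay
}

// onUpdate mirrors points 1-3 above: an infrequent update rebuilds
// immediately; anything arriving inside the holdoff window is deferred
// behind a single HoldoffDelay timer.
func (e *EventHandler) onUpdate() {
	if e.holdoffExceeded() {
		e.rebuild()
		return
	}
	if e.pending == nil {
		e.pending = time.AfterFunc(e.HoldoffDelay, e.rebuild)
	}
}

func (e *EventHandler) rebuild() {
	e.last = time.Now()
	e.pending = nil
	// ... rebuild the DAG and notify the xDS caches ...
}
```

This also shows why a burst arriving after the cluster is stable produces at least two rebuilds: the first update exceeds the holdoff and fires immediately, and the rest of the burst coalesces behind the HoldoffDelay timer.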