Ignore excluded processes from minimum uptime calculation when doing a rolling bounce #1333
5181f16 to c02da4a
LGTM 👍
Sorry for the early approval; I had an outdated view of the code.
  addressMap[process.Locality[fdbv1beta2.FDBLocalityInstanceIDKey]] = append(addressMap[process.Locality[fdbv1beta2.FDBLocalityInstanceIDKey]], process.Address)

- if process.UptimeSeconds < minimumUptime {
+ if process.UptimeSeconds < minimumUptime && !process.Excluded {
What is the benefit of checking here that the process is not excluded? If the process is excluded we will continue the loop anyway?
I think we need the process addresses in the addressMap; otherwise we'll hit https://github.com/FoundationDB/fdb-kubernetes-operator/blob/c02da4a05c0808c17604ebab2b13ebcdc47dbf2c/controllers/bounce_processes.go#L110
My comment was actually referring to the statement in lines 62-64, where we check whether the process is excluded and, if so, continue the for loop (which is correct). In line 68 we then check again that the process is not excluded, even though an excluded process can never reach that line with the current code.
I think there is still one case we have to solve: if a process was manually excluded (but never included again), we can hit the case where len(missingAddress) > 0 is true, and we requeue forever until the process is included again or marked for removal. I believe the correct solution is to remove the if statement in lines 62-64; the condition process.UptimeSeconds < minimumUptime && !process.Excluded should then be enough for our use case. Could you add a unit test to ensure we handle this case properly? (I hope I explained it well enough.)
It was supposed to be removed. Amended.
Result of fdb-kubernetes-operator-pr on Linux CentOS 7
Apologies for that, I read the code more after submitting and realised the other approach is better.
JFYI: You can trigger the e2e pipeline by closing and reopening the PR.
LGTM 👍 Please remove the comment before merging.
controllers/bounce_processes.go
for _, process := range status.Cluster.Processes {
	addressMap[process.Locality[fdbv1beta2.FDBLocalityInstanceIDKey]] = append(addressMap[process.Locality[fdbv1beta2.FDBLocalityInstanceIDKey]], process.Address)

	if process.UptimeSeconds < minimumUptime {
		// comment to trigger PR builder
Can we remove this comment again?
c605fcd to fbd6fee
Description
Ignore processes that are excluded when calculating the minimum uptime.
Type of change
Discussion
Testing
Unit tests
Documentation
Follow-up