locksmith: set max semaphore to 3 #543
With the recent upgrade from Go 1.21 to Go 1.22, we noticed that this test has never run as designed. Previously this test was rebooting only one machine while it's supposed to coordinate the reboot of 3 machines.

It was an issue with the "for" loop:

> Previously, the variables declared by a “for” loop were created once and updated by each iteration. In Go 1.22, each iteration of the loop creates new variables, to avoid accidental sharing bugs.

Now we set the maximum number of tokens to 3. This is not recommended in production as it might cause downtime, but here it's OK for testing purposes. A race condition should not occur as there is a 5-minute delay between the command and the actual reboot.

Signed-off-by: Mathieu Tortuyaux <[email protected]>
@tormath1 I think the goal was to get the machines to reboot sequentially; otherwise we're not really testing locksmith. Is the test failing because of the 5-minute delay? Why is there a 5-minute delay for the reboot? If it's because the user is logged in on the console, is it possible to use Ignition to disable
Then if the goal is to reboot sequentially, the test will take at least 25 minutes:
The reboot delay is here (even if no one is connected, it waits 5 minutes): https://github.com/flatcar/locksmith/blob/5b2275ec726a7f70902d2da0c782bbad405e3ef5/locksmithctl/daemon.go#L169-L174

The delay between etcd checks: https://github.com/flatcar/locksmith/blob/5b2275ec726a7f70902d2da0c782bbad405e3ef5/locksmithctl/daemon.go#L193-L195
Hmm, i see your point - would |
I read the locksmith code too quickly. Indeed, if no one is connected it should reboot directly. We can try with a semaphore of 2.
@jepio |
Testing done
Locally on QEMU.
3 semaphore: http://jenkins.infra.kinvolk.io:8080/job/container/job/test/25061/