workflows: fix in-cluster job kubectl wait #451
Links to test runs of workflow changes:
Nice find
`kubectl wait --for=condition=complete --timeout=X` behaviour is a bit counterintuitive: it waits until either the job succeeds or the timeout is hit. When the job fails, it does not stop waiting: it keeps waiting until the timeout is hit.

For watching for failures, `--for=condition=failed` should be used. However, this will likewise wait until either the job fails or the timeout is hit, and will not stop waiting if the job succeeds.

`kubectl wait` unfortunately does not allow waiting for multiple conditions. To work around this, we set up two concurrent background waits, one for each condition, and actively wait for the first one to end.

This ensures we do not wait for the whole allocated timeout every time there is an error during the in-cluster script execution.

Signed-off-by: Nicolas Busseneau <[email protected]>
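For illustration, the two-background-waits workaround described above could look roughly like the sketch below. It is not the exact change made to the workflows in this PR: the helper name `wait_for_job` and its arguments are hypothetical, and it assumes bash 4.3 or newer for `wait -n` and no other background jobs in the calling shell.

```shell
#!/usr/bin/env bash
# Sketch only: wait_for_job is a hypothetical helper, not code from this PR.
# Returns 0 if the Job completes, non-zero if it fails or the timeout is hit.
# Assumes bash >= 4.3 (for `wait -n`) and no other background jobs running.
wait_for_job() {
  local job="$1" timeout="$2"

  # Background wait that exits 0 once the Job reports condition=complete.
  kubectl wait --for=condition=complete --timeout="$timeout" "job/$job" &
  local complete_pid=$!

  # Background wait for condition=failed; a detected failure is turned
  # into exit status 1 so the caller sees an error.
  ( kubectl wait --for=condition=failed --timeout="$timeout" "job/$job" && exit 1 ) &
  local failed_pid=$!

  # Block until whichever background wait ends first and keep its status.
  local rc=0
  wait -n || rc=$?

  # Best-effort cleanup of the wait that is still running.
  kill "$complete_pid" "$failed_pid" 2>/dev/null || true
  wait "$complete_pid" "$failed_pid" 2>/dev/null || true
  return "$rc"
}
```

A caller would use it as `wait_for_job my-job 20m || exit 1`, where `my-job` and `20m` are placeholder values. A namespace flag would be passed to both `kubectl wait` calls if the Job does not live in the current namespace.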
All test runs passed except EKS (tunnel), but that is because of the consistent failure discussed on Slack here. Since:
=> We are in one of the exception cases to the zero-flakes strategy and do not need to wait to rebase on top of the future (hopefully soon) fix for the EKS (tunnel) workflow. This PR can be marked as |
Needs similar changes in the *-v1.10 GH workflows
Sir, this is |