fix: add retries to find running web pod #1787
Merged
SUMMARY
Closes #1784
This PR adds `retries` to the task that finds the running web pod. I'm not sure which values are best for `retries` and `delay`, but I believe 2 minutes is enough time for the web pod to start running.
ISSUE TYPE
ADDITIONAL INFORMATION
Tested locally by deploying the following minimal AWX:
Without this PR (as in 2.13.1), finding the running web pod fails because there is no `wait` after the web pod is deployed. In this case the web pod is still in `ContainerCreating`, for example because images are still being pulled.
Deploy a custom Operator that includes this PR:
IMG=registry.example.com/ansible/awx-operator:wait_web BUILD_ARGS="--build-arg DEFAULT_AWX_VERSION=24.0.0" make docker-build docker-push deploy
The deployment of the minimal AWX above completes in the first loop.
Also tested with a CR that contains `web_*ness_period`:
With an Operator that includes both this PR and #1786, the CR can be deployed without any failure.
Note that in the implementation in this PR (and before `wait` was removed), the task succeeds and moves on to the next task as soon as any one of the three containers in the web pod is running. If we want to make sure that all three containers are running, we will need to implement additional logic.
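To make the difference between the two conditions concrete, here is a minimal Python sketch. It is not the operator's Ansible code: `find_pod_with_retries`, `any_container_running`, and `all_containers_running` are hypothetical names, the pod dictionaries only mimic the shape of `status.containerStatuses[].state`, and `retries=60` with `delay=2` are assumed values picked to add up to roughly 2 minutes.

```python
import time
from typing import Callable, Optional


def any_container_running(pod: dict) -> bool:
    """True if at least one container reports a running state."""
    statuses = pod.get("status", {}).get("containerStatuses", [])
    return any("running" in s.get("state", {}) for s in statuses)


def all_containers_running(pod: dict, expected: int = 3) -> bool:
    """True only if all `expected` containers report a running state."""
    statuses = pod.get("status", {}).get("containerStatuses", [])
    return len(statuses) == expected and all(
        "running" in s.get("state", {}) for s in statuses
    )


def find_pod_with_retries(
    fetch: Callable[[], Optional[dict]],
    is_ready: Callable[[dict], bool],
    retries: int = 60,   # assumed value, not taken from the PR
    delay: float = 2.0,  # assumed value: 60 attempts x 2 s is about 2 minutes
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Poll fetch() up to `retries` times, waiting `delay` seconds between tries.

    Returns the first pod that satisfies is_ready, or None if none does.
    """
    for attempt in range(1, retries + 1):
        pod = fetch()
        if pod is not None and is_ready(pod):
            return pod
        if attempt < retries:
            sleep(delay)  # do not sleep after the final attempt
    return None
```

Passing `any_container_running` as `is_ready` models the condition described in the note, while passing `all_containers_running` models the stricter check that would need the additional logic. In both cases the caller supplies `fetch`, a function that returns the current pod (or `None` if it does not exist yet).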