I understand that the AWX Operator is open source software provided for free and that I might not receive a timely response.
Bug Summary
On fresh deployments of an AWX CR, the first reconciliation loop (and sometimes the next few) fails on the task "Verify the resource pod name is populated".
If we do nothing and wait for the next reconciliation loop, this task will succeed and the deployment will complete.
The first reconciliation:
...
--------------------------- Ansible Task StdOut -------------------------------
TASK [installer : Verify the resource pod name is populated.] ******************
task path: /opt/ansible/roles/installer/tasks/resources_configuration.yml:284
-------------------------------------------------------------------------------
--------------------------- Ansible Task StdOut -------------------------------
TASK [Verify the resource pod name is populated.] ********************************
fatal: [localhost]: FAILED! => {
"assertion": "awx_web_pod_name != ''",
"changed": false,
"evaluated_to": false,
"msg": "Could not find the tower pod's name."
}
-------------------------------------------------------------------------------
{"level":"error","ts":"2024-03-21T00:38:10Z","logger":"logging_event_handler","msg":"","name":"awx-demo","namespace":"awx","gvk":"awx.ansible.com/v1beta1, Kind=AWX","event_type":"runner_on_failed","job":"3324091116751141590","EventData.Task":"Verify the resource pod name is populated.","EventData.TaskArgs":"","EventData.FailedTaskPath":"/opt/ansible/roles/installer/tasks/resources_configuration.yml:284","error":"[playbook task failed]","stacktrace":"github.com/operator-framework/ansible-operator-plugins/internal/ansible/events.loggingEventHandler.Handle\n\tansible-operator-plugins/internal/ansible/events/log_events.go:111"}
...
A later reconciliation, after simply waiting without any intervention:
...
--------------------------- Ansible Task StdOut -------------------------------
TASK [installer : Verify the resource pod name is populated.] ******************
task path: /opt/ansible/roles/installer/tasks/resources_configuration.yml:284
-------------------------------------------------------------------------------
{"level":"info","ts":"2024-03-21T00:39:09Z","logger":"logging_event_handler","msg":"[playbook task start]","name":"awx-demo","namespace":"awx","gvk":"awx.ansible.com/v1beta1, Kind=AWX","event_type":"playbook_on_task_start","job":"7413164896366316510","EventData.Name":"installer : Migrate database to the latest schema"}
--------------------------- Ansible Task StdOut -------------------------------
TASK [installer : Migrate database to the latest schema] ***********************
task path: /opt/ansible/roles/installer/tasks/install.yml:97
-------------------------------------------------------------------------------
{"level":"info","ts":"2024-03-21T00:39:09Z","logger":"logging_event_handler","msg":"[playbook task start]","name":"awx-demo","namespace":"awx","gvk":"awx.ansible.com/v1beta1, Kind=AWX","event_type":"playbook_on_task_start","job":"7413164896366316510","EventData.Name":"installer : Check for pending migrations"}
...
Please confirm the following
AWX Operator version
2.13.1
AWX version
24.0.0
Kubernetes platform
kubernetes
Kubernetes/Platform version
k3s version v1.28.7+k3s1
Modifications
no
Steps to reproduce
Deploy AWX Operator 2.13.1 and the following minimal AWX CR:
Expected results
The first reconciliation loop completes successfully without any failed tasks.
Actual results
The first reconciliation loop fails, and we have to wait for the next loop (or several more) for the deployment to complete.
Additional information
Related to #1777 (comment)
As commented by @LukWe99, the `wait` for the web pod was removed by #1674, but we should wait for the web pod to be up and running before proceeding.
Of course, I understand that adding the `wait` back can't work anymore, since the init container that waits for migrations to complete can't finish at this point. So we should keep the `wait` removed; alternatively, adding `retries` to the task that finds the running web pod would be the ideal solution:
awx-operator/roles/installer/tasks/resources_configuration.yml
Lines 258 to 268 in c6fe038
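To illustrate the idea, a retry on the pod lookup could look roughly like the sketch below. This is not the actual task from `resources_configuration.yml`: the task name, the label selector, the registered variable name, and the `retries`/`delay` values are all assumptions chosen for illustration. Only `kubernetes.core.k8s_info` and the `until`/`retries`/`delay` task keywords are standard Ansible.

```yaml
# Hypothetical sketch: poll for a running web pod instead of failing on the first miss.
- name: Get the resource pod information (with retries)
  kubernetes.core.k8s_info:
    kind: Pod
    namespace: "{{ ansible_operator_meta.namespace }}"
    # Assumed label; the real selector used by the installer role may differ.
    label_selectors:
      - "app.kubernetes.io/name={{ ansible_operator_meta.name }}-web"
    field_selectors:
      - status.phase=Running
  register: awx_web_pod          # hypothetical variable name
  # Repeat the lookup until at least one matching pod is returned.
  until: awx_web_pod['resources'] | length > 0
  retries: 60                    # assumed value
  delay: 5                       # seconds between attempts, assumed value
```

With something like this, the later assertion on `awx_web_pod_name` would only run once a pod has actually been found, or after the retries are exhausted, rather than failing the whole reconciliation on the first empty result.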
Operator Logs
No response