The conditional check ''OK' in get_url_result.msg or 'file already exists' in get_url_result.msg or get_url_result.status_code == 304' failed. #10494

Closed
vyom-soft opened this issue Oct 3, 2023 · 10 comments · Fixed by #10613
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@vyom-soft

Environment:

  • Cloud provider or hardware configuration:

  • OS (printf "$(uname -srm)\n$(cat /etc/os-release)\n"):
    Linux 4.18.0-372.19.1.el8_6.x86_64 x86_64

NAME="Red Hat Enterprise Linux"
VERSION="8.8 (Ootpa)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="8.8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Red Hat Enterprise Linux 8.8 (Ootpa)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:8::baseos"
  • Version of Ansible (ansible --version):
ansible [core 2.15.4]
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/home/kvib/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/local/lib/python3.11/site-packages/ansible
  ansible collection location = /home/kvib/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/bin/ansible
  python version = 3.11.2 (main, Jun  6 2023, 07:39:01) [GCC 8.5.0 20210514 (Red Hat 8.5.0-18)] (/usr/bin/python3.11)
  jinja version = 3.1.2
  libyaml = True
  • Version of Python (python --version):
    Python 3.11.5

Kubespray version (commit) (git rev-parse --short HEAD):
v2.23.0

Network plugin used:

Full inventory with variables (ansible -i inventory/sample/inventory.ini all -m debug -a "var=hostvars[inventory_hostname]"):

Command used to invoke ansible:
ansible-playbook -i inventory/k8s-kvib/hosts.yaml --check --become --become-user=root cluster.yml -Kk -vvv

Output of ansible run:

Anything else do we need to know:

TASK [container-engine/containerd : Download_file | Download item] ***********************************************************************************************************************************************************
task path: /home/kvib/kubespray-2.23.0/kubespray/roles/download/tasks/download_file.yml:88
fatal: [node1]: FAILED! => {
    "msg": "The conditional check ''OK' in get_url_result.msg or 'file already exists' in get_url_result.msg or get_url_result.status_code == 304' failed. The error was: error while evaluating conditional ('OK' in get_url_result.msg or 'file already exists' in get_url_result.msg or get_url_result.status_code == 304): 'dict object' has no attribute 'status_code'. 'dict object' has no attribute 'status_code'"
}

NO MORE HOSTS LEFT ***********************************************************************************************************************************************************************************************************

PLAY RECAP *******************************************************************************************************************************************************************************************************************
localhost                  : ok=3    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
node1                      : ok=130  changed=7    unreachable=0    failed=1    skipped=232  rescued=0    ignored=0
node2                      : ok=111  changed=4    unreachable=0    failed=0    skipped=219  rescued=0    ignored=0
node3                      : ok=110  changed=4    unreachable=0    failed=0    skipped=220  rescued=0    ignored=0
node4                      : ok=107  changed=4    unreachable=0    failed=0    skipped=223  rescued=0    ignored=0
node5                      : ok=107  changed=3    unreachable=0    failed=0    skipped=223  rescued=0    ignored=0

Monday 02 October 2023  15:35:07 +0200 (0:00:13.878)       0:02:44.429 ********

@vyom-soft vyom-soft added the kind/bug Categorizes issue or PR as related to a bug. label Oct 3, 2023
@VannTen
Contributor

VannTen commented Nov 3, 2023

This looks like ansible/ansible#65263, but that should have been fixed since Ansible 2.10, so this is a bit weird.

@Saigut

Saigut commented Nov 7, 2023

Same problem here (kubespray 2.23.1, run in a Docker container):

TASK [container-engine/runc : Download_file | Download item] ******************************************************************************************************************************************************************************
fatal: [node2]: FAILED! => {"msg": "The conditional check ''OK' in get_url_result.msg or 'file already exists' in get_url_result.msg or get_url_result.status_code == 304' failed. The error was: error while evaluating conditional ('OK' in get_url_result.msg or 'file already exists' in get_url_result.msg or get_url_result.status_code == 304): 'dict object' has no attribute 'status_code'. 'dict object' has no attribute 'status_code'"}
changed: [node1]

NO MORE HOSTS LEFT ************************************************************************************************************************************************************************************************************************

PLAY RECAP ********************************************************************************************************************************************************************************************************************************
localhost                  : ok=3    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
node1                      : ok=132  changed=2    unreachable=0    failed=0    skipped=268  rescued=0    ignored=0   
node2                      : ok=108  changed=1    unreachable=0    failed=1    skipped=240  rescued=0    ignored=0

@VannTen
Contributor

VannTen commented Nov 7, 2023

I have a simple reproducer (run with ansible-playbook -i localhost playbook-file.yml):

- hosts: localhost
  vars:
    containerd_archive_checksums:
      arm64:
        1.7.8: 3fc551e8f51150804d80cc1958a271bd2252b6334f0355244d0faa5da7fa55d1
      amd64:
        1.7.8: 5f1d017a5a7359514d6187d6656e88fb2a592d107e6298db7963dbddb9a111d9
        1.7.8.wrong: 4f1d017a5a7359514d6187d6656e88fb2a592d107e6298db7963dbddb9a111d9 # altered checksum
  tasks:
  - name: Download_file | Download item
    get_url:
      url: "https://github.com/containerd/containerd/releases/download/v1.7.8/containerd-1.7.8-linux-amd64.tar.gz"
      dest: /tmp/container.tar.gz
      checksum: "{{ 'sha256:' + containerd_archive_checksums.amd64[item] }}"
    register: get_url_result
    until: "'OK' in get_url_result.msg or
      'file already exists' in get_url_result.msg or
      get_url_result.status_code == 304"
    loop:
      - 1.7.8
      - 1.7.8.wrong

It looks like Ansible blows up when the file already exists and the checksum is different. The fix was apparently not robust enough.

@Saigut

Saigut commented Nov 7, 2023

I tried kubespray-2.22.1, which has no such problem.

@VannTen
Contributor

VannTen commented Nov 7, 2023

Yeah, there was a change in #10452, but I'm not sure whether the problem is in kubespray or in Ansible.
Is the file already present when this appears (if you can check)? Is this happening during a cluster upgrade?
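
One possible way to check whether the file is already present, and which checksum it has, is a small playbook built on the ansible.builtin.stat module. This is only a sketch: the path below is the placeholder destination from the reproducer above, not necessarily where kubespray stores its downloads, and the variable name archive_stat is made up for illustration.

```yaml
- hosts: all
  gather_facts: false
  tasks:
    - name: Inspect a previously downloaded archive
      ansible.builtin.stat:
        # Assumption: placeholder path borrowed from the reproducer;
        # adjust it to the real download location on the node.
        path: /tmp/container.tar.gz
        checksum_algorithm: sha256
        get_checksum: true
      register: archive_stat  # hypothetical variable name

    - name: Report presence and checksum
      ansible.builtin.debug:
        msg: >-
          exists={{ archive_stat.stat.exists }}
          sha256={{ archive_stat.stat.checksum | default('n/a') }}
```

If the reported sha256 differs from the expected checksum, the node would be in the same state as the second loop item of the reproducer.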

@Saigut

Saigut commented Nov 8, 2023

Yeah, there was a change in #10452, but I'm not sure whether the problem is in kubespray or in Ansible. Is the file already present when this appears (if you can check)? Is this happening during a cluster upgrade?

In my environment, the cluster has since been successfully configured with kubespray-2.22.1, so I can't debug this problem now.

When I hit this problem, I was trying to create a cluster with kubespray-2.23.1 and had run ansible-playbook xxx cluster.yml several times because of other problems (mainly network problems).

@fabianstern1

fabianstern1 commented Nov 9, 2023

I can confirm this problem too. I am using kubespray at tag v2.23.1 on a bare setup in Hyper-V: Ubuntu 22.04 LTS machines with just a default installation and openssh-server enabled. After 45 minutes, one or more of my 5 nodes fail at random with this message:

[Screenshot: error]

A re-run on a clean system always gives similar output. A re-run straight after the error shown above seems to work (see image below). This is frustrating, especially since the latest commit message addressed a 304 Not Modified issue in the Download File task. Please help if possible!

[Screenshot: error2]

@fabianstern1

I followed the Ansible source code and verified that in some cases status_code is indeed not an integer, contrary to what the documentation suggests (https://docs.ansible.com/ansible/latest/collections/ansible/builtin/get_url_module.html#return-status_code). Instead it can be NULL, as can be seen here: bjolivot/ansible@8a55c91#r132162667

I think we need to find out what message is returned instead, and we should not use status_code at all. What do you think?
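
To see what the module actually returns in the failing case, something like the following sketch could help. It reuses the URL and the deliberately altered checksum from the reproducer above; ignore_errors is used here only so that the play continues to the debug tasks, and it assumes a file with a different checksum may already exist at the destination.

```yaml
- hosts: localhost
  gather_facts: false
  tasks:
    - name: Attempt the download without aborting the play
      ansible.builtin.get_url:
        url: "https://github.com/containerd/containerd/releases/download/v1.7.8/containerd-1.7.8-linux-amd64.tar.gz"
        dest: /tmp/container.tar.gz
        # Altered checksum from the reproducer, expected to fail verification.
        checksum: "sha256:4f1d017a5a7359514d6187d6656e88fb2a592d107e6298db7963dbddb9a111d9"
      register: get_url_result
      ignore_errors: true

    - name: Show every key the module returned
      ansible.builtin.debug:
        var: get_url_result

    - name: Report whether status_code is present at all
      ansible.builtin.debug:
        msg: "status_code defined: {{ get_url_result.status_code is defined }}"
```

The first debug task prints the full registered result, which should show the msg text available for matching when status_code is missing.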

@RomainMou
Contributor

RomainMou commented Nov 10, 2023

I have a simple reproducer (run with ansible-playbook -i localhost playbook-file.yml):

[...]

It looks like Ansible blows up when the file already exists and the checksum is different. The fix was apparently not robust enough.

It should fail in this case: the expected checksum is different from the checksum of the downloaded file. But we should have retries and a better error message.

It looks like status_code can be undefined when there is an issue; this breaks the until condition, so the task can't retry.
If I'm not mistaken, we could simply add something like:

      until: "'OK' in get_url_result.msg or
        'file already exists' in get_url_result.msg or
        (get_url_result.status_code is defined and get_url_result.status_code == 304)"
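
An equivalent way to guard the condition is Jinja's default filter, sketched below as a standalone playbook based on the reproducer. This is not the change that was merged, only an illustration: the retries and delay values are arbitrary, and guarding msg as well is an extra precaution that assumes msg could also be absent.

```yaml
- hosts: localhost
  gather_facts: false
  tasks:
    - name: Download_file | Download item (sketch with a guarded until)
      ansible.builtin.get_url:
        url: "https://github.com/containerd/containerd/releases/download/v1.7.8/containerd-1.7.8-linux-amd64.tar.gz"
        dest: /tmp/container.tar.gz
        checksum: "sha256:5f1d017a5a7359514d6187d6656e88fb2a592d107e6298db7963dbddb9a111d9"
      register: get_url_result
      # default() supplies a fallback when a key is missing, so the expression
      # evaluates to false instead of raising an undefined-attribute error.
      until: >-
        'OK' in (get_url_result.msg | default(''))
        or 'file already exists' in (get_url_result.msg | default(''))
        or (get_url_result.status_code | default(0)) == 304
      retries: 4   # illustrative value
      delay: 5     # illustrative value, in seconds
```

With either guard, a result that lacks status_code makes the condition false, so the task is retried and eventually reports the module's own failure message instead of the conditional error.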

@mdbudnick

mdbudnick commented Dec 15, 2023

This looks related to #10592, which indicates some kind of DNS problem.

Edit: There is no checksum issue; I was looking at the wrong SHA.

I am still having issues after trying the suggestions from @RomainMou here and in #10613.
