Nightly tests did not succeed on fedora-38/podman-next: testHealthcheckUser timeout #1443

cockpituous · 2023-10-11T01:58:18Z

Tests failed on 008ac87


martinpitt · 2023-10-11T04:10:36Z

I've seen the same failure in containers/podman#20322 (comment) (earlier version, not current). This could be a new race condition, and needs investigation.

martinpitt · 2023-10-11T14:33:34Z

Also just happened in containers/podman#20161 in https://artifacts.dev.testing-farm.io/7f9b5087-dcac-4ee1-a53c-86e4d1339f91/

martinpitt · 2023-10-11T15:45:49Z

I tried to reproduce this locally by running

for i in `seq 5`; do test/check-application TestApplication.testHealthcheckUser $RUNC || break; done

(and ...System in $RUNC2 in parallel). The loop passes reliably against current F38.

Then

dnf -y copr enable rhcontainerbot/podman-next
dnf update --repo='copr*' -y

This updates podman-5:4.7.0-1.fc38.x86_64 to podman-102:4.8.0~dev-1.20231011135052610412.main.2121.d437ca8fd.fc38.x86_64, along with a lot of related packages (netavark, crun, etc.), but I believe healthchecks are handled by podman itself. After the update, both *User and *System fail reliably, but later than where they failed on CI:

  File "/var/home/martin/upstream/cockpit-podman/test/check-application", line 2419, in testHealthcheckUser
    self._testHealthcheck(False)
  File "/var/home/martin/upstream/cockpit-podman/test/check-application", line 2347, in _testHealthcheck
    b.wait_visible(".ct-listing-panel-body tbody tr:nth-child(2)")
[...]
wait_js_cond(ph_is_present(".ct-listing-panel-body tbody tr:nth-child(2)")): Uncaught (in promise) Error: condition did not become true

Indeed, clicking "Run health check" no longer updates the page, and neither does running podman healthcheck run healthy. I have to reload the page to make the results appear.

One important difference is that on current F38, I get an additional exec_died event when a healthcheck finishes:

2023-10-11 15:40:26.5563557 +0000 UTC container exec_died e781541fc1204729e2b36d2c5fabc21beb6d00e05d8f89ef953b81cead9fc8db (image=localhost/test-busybox:latest, name=healthy)
2023-10-11 15:40:26.568280289 +0000 UTC container health_status e781541fc1204729e2b36d2c5fabc21beb6d00e05d8f89ef953b81cead9fc8db (image=localhost/test-busybox:latest, name=healthy, health_status=healthy)

while with podman-next, I just get:

2023-10-11 15:42:08.858067536 +0000 UTC container health_status 1507d14a25b3c9d424d6dba58f9c27072ebf48f4682ed4ab0f382ba23341f3ff (image=localhost/test-busybox:latest, name=healthy, health_status=healthy)
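To make the difference between the two logs easy to check, here is a minimal Python sketch that pulls the event status out of each line. It assumes the human-readable layout shown above (date, time, UTC offset, "UTC", event type, status, container ID, attributes). The names `event_status`, `statuses`, `F38_LOG` and `PODMAN_NEXT_LOG` are hypothetical and not part of podman or cockpit-podman.

```python
# Event lines copied from the two logs above.
F38_LOG = """\
2023-10-11 15:40:26.5563557 +0000 UTC container exec_died e781541fc1204729e2b36d2c5fabc21beb6d00e05d8f89ef953b81cead9fc8db (image=localhost/test-busybox:latest, name=healthy)
2023-10-11 15:40:26.568280289 +0000 UTC container health_status e781541fc1204729e2b36d2c5fabc21beb6d00e05d8f89ef953b81cead9fc8db (image=localhost/test-busybox:latest, name=healthy, health_status=healthy)
"""

PODMAN_NEXT_LOG = """\
2023-10-11 15:42:08.858067536 +0000 UTC container health_status 1507d14a25b3c9d424d6dba58f9c27072ebf48f4682ed4ab0f382ba23341f3ff (image=localhost/test-busybox:latest, name=healthy, health_status=healthy)
"""


def event_status(line: str) -> str:
    """Return the status field (e.g. 'health_status') of one event line."""
    fields = line.split()
    # Assumed layout: date, time, offset, "UTC", type, status, container ID, attributes...
    if len(fields) < 7 or fields[4] != "container":
        raise ValueError(f"unexpected event line: {line!r}")
    return fields[5]


def statuses(log: str) -> list[str]:
    """Return the status of every non-empty line in a log, in order."""
    return [event_status(line) for line in log.splitlines() if line.strip()]
```

Calling `statuses(F38_LOG)` and `statuses(PODMAN_NEXT_LOG)` shows that `exec_died` only appears in the first log, which matches the observation above.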

Indeed, we ignore the health_status event; that was a workaround for containers/podman#19237, which was recently fixed in podman.

When I fix/relax the workaround, it seems to work fine.

We previously didn't react to `health_status` events as they were broken, and only reacted to `exec_died` instead. With the upcoming podman release, `health_status` events are reliable, and they will also not be accompanied by an `exec_died` event any more. So start updating the container status on them. Still keep listening to `exec_died` to support older podman releases. Fixes cockpit-project#1443
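The change described in that commit message amounts to treating both event kinds as a trigger for refreshing the container status. The following Python sketch only illustrates that decision; the actual handler lives in cockpit-podman's JavaScript code and may look different. The function name `should_refresh_container`, the set `STATUS_UPDATE_EVENTS`, and the `Type`/`Action` keys of the event dictionary are assumptions made for this example.

```python
# Hypothetical set of event actions that should trigger a status refresh.
# "exec_died" stays in the set so that older podman releases, which emit it
# when a healthcheck finishes, keep working.
STATUS_UPDATE_EVENTS = {"health_status", "exec_died"}


def should_refresh_container(event: dict) -> bool:
    """Decide whether an event should update the displayed container status.

    Assumes the event is a dict with "Type" and "Action" keys; this shape is
    an assumption for illustration, not a documented contract.
    """
    return (
        event.get("Type") == "container"
        and event.get("Action") in STATUS_UPDATE_EVENTS
    )
```

With this approach an older podman that emits both events for one healthcheck triggers two refreshes. That is redundant but harmless as long as the refresh simply re-reads the container state.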


cockpituous added the nightly label Oct 11, 2023

martinpitt changed the title ~~Nightly tests did not succeed on fedora-38/podman-next~~ Nightly tests did not succeed on fedora-38/podman-next: testHealthcheckUser timeout Oct 11, 2023

martinpitt mentioned this issue Oct 11, 2023

CI: test overlay and vfs containers/podman#20161

Merged

martinpitt mentioned this issue Oct 11, 2023

React to health_status events #1447

Merged

jelly closed this as completed in #1447 Oct 12, 2023
