
healthcheck run: error: ctr does not exist in database: no such container #16075

Closed · edsantiago opened this issue Oct 6, 2022 · 6 comments · Fixed by #16129
Labels: flakes (Flakes from Continuous Integration), locked - please file new issue/PR

Comments

@edsantiago (Member) commented Oct 6, 2022

Seen just now in an in-flight PR:

[+1345s] not ok 292 podman create --health-on-failure=kill
...
# podman healthcheck run APUcS4ihNr
Error: container 43712cb6940abc3a35e883961fa79d4632674e40ca72394fec96535842fdf06c does not exist in database: no such container
[ rc=125 (** EXPECTED 0 **) ]

f36 local root aarch64. Also seen in my flake logs, but only once, so I hadn't filed it:

[sys] 288 podman create --health-on-failure=kill
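
For context, the failing test exercises roughly this flow. A minimal sketch, with an illustrative container name (hc-demo), health command, and image; this is not the actual test code:

```bash
# Sketch of the scenario under test; names and image are illustrative.
# A health command that always fails, combined with --health-on-failure=kill,
# makes podman kill the container when the healthcheck fails.
podman create --name hc-demo \
    --health-cmd /bin/false \
    --health-on-failure=kill \
    quay.io/libpod/alpine:latest top
podman start hc-demo

# This is the step that flaked in CI: in the test the container runs under
# a systemd service, so after the kill systemd recreates it. If that restart
# is still in flight, the old container ID is gone from the database and
# this command fails with "no such container".
podman healthcheck run hc-demo
```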

edsantiago added the flakes label on Oct 6, 2022
@edsantiago (Member, Author)

Another one on f36 aarch64 root.

@edsantiago (Member, Author)

And again.

@edsantiago (Member, Author)

This one is really blowing up: it has failed seven reruns (eight total) in nightly cron.

@rhatdan (Member) commented Oct 10, 2022

@vrothberg PTAL

@vrothberg (Member)

I'll take a look.

vrothberg added a commit to vrothberg/libpod that referenced this issue Oct 11, 2022
The on-failure=kill system tests turned out to be flaky.
Once the container has been killed, the test waits for
systemd to restart the service by running `container inspect`
for 10 seconds.  The subsequent `healthcheck run` was the
flake point, which suggests that the 10-second timeout is
not high enough, presumably when the CI nodes are under
pressure.

Fixes: containers#16075
Signed-off-by: Valentin Rothberg <[email protected]>
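
The commit message implies a test-side wait-and-retry before running the healthcheck. A minimal sketch of that kind of loop, assuming plain bash with an illustrative container name and timeout budget (the actual change lives in the referenced PR):

```bash
# Poll until the restarted container is back in the database, then run
# the healthcheck. The name and the 30-second budget are illustrative.
ctr=hc-demo
timeout=30
while [ "$timeout" -gt 0 ]; do
    if podman container inspect --format '{{.State.Status}}' "$ctr" >/dev/null 2>&1; then
        break
    fi
    sleep 1
    timeout=$((timeout - 1))
done
if [ "$timeout" -eq 0 ]; then
    echo "timed out waiting for systemd to restart the service" >&2
    exit 1
fi

# Only once inspect succeeds is it safe to run the flaking step:
podman healthcheck run "$ctr"
```

A longer budget trades test runtime for robustness on loaded CI nodes; polling inspect directly avoids a fixed sleep that would always pay the worst-case cost.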
@vrothberg (Member)

Opened #16112. Looks like a test-side fix to me.

github-actions bot added the locked - please file new issue/PR label and locked the conversation as resolved on Sep 13, 2023