podman-remote: there's a hang somewhere #7241
Comments
Another wait https://cirrus-ci.com/task/6571684087988224 |
This script will reproduce it, although not quickly: you have to run it in a loop and wait, possibly for hours:
$ cat >7241.sh <<'EOF'
set -e
./bin/podman-remote run -d --name foo alpine sh -c 'touch /foo;while test -e /foo; do sleep 1;done'
cid=foo
./bin/podman-remote logs $cid
./bin/podman-remote exec $cid /etc || true
./bin/podman-remote exec $cid no-such-command || true
./bin/podman-remote exec $cid rm /foo
timeout -v 15 ./bin/podman-remote wait $cid
./bin/podman-remote rm $cid
EOF
$ while bash -x 7241.sh;do echo;echo;done
...copious output... eventually dies
+ timeout -v 15 ./bin/podman-remote wait foo
timeout: sending signal TERM to command ‘./bin/podman-remote’
$ podman ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
c87ff4b0836f docker.io/library/alpine:latest sh -c touch /foo;... 15 minutes ago Exited (0) 10 minutes ago foo
I find it interesting that CREATED was 15 minutes ago, but it exited 10 minutes ago, implying that it took 5 minutes between the [...]
I'm heading out for the day, but am leaving the container as it is in case anyone has suggestions for me to [...] |
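The `while bash -x 7241.sh` loop above runs until the first failure without saying how long that took. A minimal sketch of the same idea that also reports the failing iteration; `loop_until_failure` is a hypothetical helper name, not part of podman or its test suite:

```shell
# loop_until_failure MAX CMD...: run CMD up to MAX times and stop at the
# first non-zero exit status, reporting which iteration failed.
# (Hypothetical helper for illustration only.)
loop_until_failure() {
    luf_max=$1
    shift
    luf_i=1
    while [ "$luf_i" -le "$luf_max" ]; do
        if ! "$@"; then
            echo "failed on iteration $luf_i"
            return 1
        fi
        luf_i=$((luf_i + 1))
    done
    echo "no failure in $luf_max iterations"
}
```

Something like `loop_until_failure 1000 bash -x 7241.sh` would then bound the run and show roughly how many passes it took to hit the hang.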
I see the same thing as you with "podman-remote wait $cid", in https://storage.googleapis.com/cirrus-ci-6707778565701632-fcae48/artifacts/containers/podman/5815345191583744/html/system_test.log.html.
The exit code is 124, so it seems to have been killed by something, but when I test this on my local machine it works fine. |
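For context, 124 is the status GNU `timeout` returns when it has to kill the command, which is consistent with the `timeout -v 15` wrapper in the reproducer. A minimal sketch of telling a hang apart from an ordinary failure by exit status; `classify_exit` and `run_with_timeout` are hypothetical helper names, and the sketch assumes GNU coreutils `timeout` without `--preserve-status`:

```shell
# classify_exit STATUS: interpret the exit status of a command that was
# run under GNU timeout(1). (Hypothetical helper for illustration only.)
classify_exit() {
    case "$1" in
        0)   echo "ok" ;;
        124) echo "timed out" ;;      # timeout(1) killed the command
        *)   echo "failed ($1)" ;;    # the command itself failed
    esac
}

# run_with_timeout SECONDS CMD...: run CMD under timeout and print how it ended.
run_with_timeout() {
    rwt_secs=$1
    shift
    rwt_rc=0
    timeout "$rwt_secs" "$@" || rwt_rc=$?
    classify_exit "$rwt_rc"
}
```

In the reproducer this would look like `run_with_timeout 15 ./bin/podman-remote wait foo`, where "timed out" corresponds to the hang reported here.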
- new sanity checks for podman-remote:
  - first, confirm that when PODMAN is "-remote", we actually talk to a server (validated by presence of "Server:" string in "podman version").
  - second, add test for containers#7212, in which we run "podman --remote" (podman with --remote flag, not podman-remote command) and make sure --remote is allowed both as the first option and also with other flag options preceding.
- new test for "podman image tree" (piggybacking on top of a "podman build" test, because that gives us lots of layers).
- skip "podman exec - basic test" when remote. It is consistently causing CI failures, breaking all of CI, due to containers#7241.
Signed-off-by: Ed Santiago <[email protected]>
A friendly reminder that this issue had no activity for 30 days. |
@edsantiago Still seeing this one? |
Sorry, yes. Issue still present in master @ 54a61e3. It took 35 minutes of looping on my f32 laptop, but: ...
+ timeout -v 15 ./bin/podman-remote wait foo
timeout: sending signal TERM to command ‘./bin/podman-remote’
$ ./bin/podman ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
316df2501ed6 docker.io/library/alpine:latest sh -c touch /foo;... 6 minutes ago Exited (0) About a minute ago foo |
A friendly reminder that this issue had no activity for 30 days. |
@edsantiago the original report said that there were hangs all over the place, but I guess this is very rare now? |
I tried running the [...] |
It was 'skip'ped due to frequent flakes (containers#7241). I just tried running the 7241 reproducer on my laptop for one hour, and saw no failures, so let's reenable this in CI and see if it comes back. I really hate problems that "go away" on their own without being explicitly acknowledged and fixed. Signed-off-by: Ed Santiago <[email protected]>
I believe this is fixed now. @edsantiago, reopen if I am mistaken. |
Test was disabled August 2020 due to containers#7241, a hang. That issue has been closed, so let's see if it's really fixed. Signed-off-by: Ed Santiago <[email protected]>
Another semi-useless report with no reproducer nor actual details.
There's a hang somewhere in podman-remote. It's causing flakes in CI all over the place. It does not seem to be deterministic; it's happening in lots of different situations.
Examples:
Sorry for sparse details, I'm trying very hard to be done for the week and am failing dismally. Please add examples here as you see them, I think this is going to be an ugly one.