Parricide #8066

Closed
joeyhub opened this issue Oct 19, 2020 · 9 comments · Fixed by containers/buildah#2708
Labels: kind/bug, locked - please file new issue/PR

joeyhub commented Oct 19, 2020

I have my own make system which generates the entire dependency tree of commands needed to fully build and upload a set of images.

It may produce something like this:

set -euo pipefail
docker build some_docker_file many_args_0
docker tag a b
docker build some_docker_file many_args_1

I would then pipe that into bash, for example...

dump_make_commands | bash

With podman it seems to just stop after one of the build commands inexplicably.

This never happens with docker.

At first the obvious cause might be a different exit code on success but I see none of the normal signs of that. The exit code of the parent is 0 as though it finished properly.

It's as though podman is somehow either signalling the parent to gracefully die with a success code or closing the parent's pipe gracefully by mistake.

It's almost home time for now but I may later try...

Running it under strace and looking at the end of the trace (though it creates massive noise before reaching the right point) to see why it's mysteriously exiting.

Running it in double bash.

I don't think this was happening until one of the more recent updates though I'm not entirely sure.

joeyhub commented Oct 19, 2020

strace causes it to lock up completely.

joeyhub commented Oct 19, 2020

This happens even without bash exiting on a bad exit code.

joeyhub commented Oct 19, 2020

I haven't tested using podman directly, without the docker compatibility wrapper.

mheon added the Buildah and kind/bug labels Oct 19, 2020
mheon commented Oct 19, 2020

@TomSweeneyRedHat PTAL

joeyhub commented Oct 19, 2020

strace with -f locks up, but without -f it runs fine, so I used that to try to find what stops it.

podman-2.1.1-4.el8.x86_64
podman-docker-2.1.1-4.el8.noarch
podman-plugins-2.1.1-4.el8.x86_64

Because it dies silently, it's not obvious this is happening until you notice that things have only been updated halfway.

I think it also happened once with run as well (I use that to build the software in a separate build container).

The last thing strace sees is this:

rt_sigprocmask(SIG_BLOCK, [INT CHLD], [], 8) = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f09f4a29a10) = 969648
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGINT, {sa_handler=0x563810011b90, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f09f4051790}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f09f4051790}, 8) = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 969648
// RESUMES HERE
rt_sigaction(SIGINT, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f09f4051790}, {sa_handler=0x563810011b90, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f09f4051790}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=969648, si_uid=0, si_status=0, si_utime=3268, si_stime=1153} ---
wait4(-1, 0x7ffd8d342f90, WNOHANG, NULL) = -1 ECHILD (No child processes)
rt_sigreturn({mask=[]})                 = 0
// Not sure if red herring but there should definitely be more left in the buffer.
read(0, "", 1)                          = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
exit_group(0)                           = ?
+++ exited with 0 +++

joeyhub commented Oct 19, 2020

It doesn't seem to happen with...

dump_make_commands > x.sh;bash x.sh;rm -f x.sh

The bash history is filled with this version from another dev who seems to have stumbled upon the workaround and not mentioned anything.

It looks like it probably is draining the pipe from the process that's running it when it shouldn't be.
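That theory can be checked without podman at all. The following is a minimal sketch, assuming plain `cat` as a stand-in for a command that reads stdin it does not own; it contrasts the piped invocation with the run-from-a-file workaround. The script contents and variable names are made up for illustration.

```shell
#!/usr/bin/env bash
# Stand-in reproduction: `cat` plays the command that reads stdin it does not
# own. No podman or buildah is involved.
script='echo first
cat >/dev/null
echo second
'

# Piped: bash reads the script from stdin, and `cat` inherits that same pipe,
# so it consumes the remaining line before bash can read it.
piped_output=$(printf '%s' "$script" | bash)
piped_status=$?

# From a file: the script is no longer on stdin, so nothing can eat it.
# stdin is /dev/null here only so the stand-in `cat` does not wait on a terminal.
tmp=$(mktemp)
printf '%s' "$script" > "$tmp"
file_output=$(bash "$tmp" </dev/null)
rm -f "$tmp"

printf 'piped (exit %s):\n%s\n' "$piped_status" "$piped_output"
printf 'from file:\n%s\n' "$file_output"
```

The piped run prints only `first` and still exits with status 0, which matches the silent, successful-looking exit described above; the file run prints both lines.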

rhatdan commented Oct 20, 2020

@nalind @edsantiago PTAL
I am not familiar with this syntax, i.e. piping to bash.

edsantiago commented

It's a buildah bug: it is gobbling stdin that does not belong to it. Trivial reproducer (Containerfile is left as an exercise for the reader):

$ printf "echo hi\nbuildah bud -t foo .\necho bye\n" | bash
hi
STEP 1: ...
...
<sha>

(the 'echo bye' is gobbled up by buildah). If you redirect stdin on the buildah command, it works as expected:

$ printf "echo hi\nbuildah bud -t foo . </dev/null\necho bye\n" | bash
hi
...
bye
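For a generated command stream, the same redirection could be applied once at the top rather than on every line. A minimal sketch, assuming a hypothetical helper named `no_stdin` (not part of podman or buildah) and using `cat` as a stand-in for the stdin-gobbling build command:

```shell
#!/usr/bin/env bash
# `no_stdin` is a hypothetical helper, not part of podman or buildah.
# `cat` stands in for a command that reads stdin it does not own.
script='no_stdin() { "$@" </dev/null; }
echo first
no_stdin cat
echo second
'

# The helper gives the wrapped command /dev/null as stdin, so the rest of the
# piped script stays in the pipe for bash to read.
output=$(printf '%s' "$script" | bash)
printf '%s\n' "$output"
```

With the wrapper in place, both `first` and `second` are printed, because the stand-in command never sees the pipe that carries the script.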

Someone should file a buildah issue (I can, on request). podman has a long history of stdin-gobbling bugs; I'm pretty sure @mheon and @baude have experience dealing with them, so I can CC them in case the fix is non-obvious.

TomSweeneyRedHat commented

@edsantiago Please do file a Buildah bug.

rhatdan added a commit to rhatdan/buildah that referenced this issue Oct 20, 2020
Fixes: containers/podman#8066
Is reporting that buildah is eating stdin.  I don't believe
we should be using stdin when doing a buildah bud command
unless `buildah bud -` is specified.  After this PR, the
`-` Dockerfile is still handled.

Signed-off-by: Daniel J Walsh <[email protected]>
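The convention described in that commit message (read stdin only when `-` is given explicitly) can be illustrated with a small shell sketch. This is not buildah's actual implementation; `fake_build` and the file name are hypothetical, and `cat` stands in for a nested build step that would otherwise read stdin.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a build command; not buildah's real code.
fake_build() {
    local file=$1
    if [ "$file" = "-" ]; then
        # Explicit request: the build file itself arrives on stdin.
        echo "building from stdin: $(wc -l | tr -d ' ') line(s)"
    else
        # Otherwise detach from the inherited stdin for the whole build,
        # so no nested step can consume the caller's input.
        {
            cat >/dev/null    # a step that would otherwise drain stdin
            echo "building from $file"
        } </dev/null
    fi
}

# With `-`, stdin is read on purpose.
from_stdin=$(printf 'FROM scratch\n' | fake_build -)

# With a file name, the caller's stdin is left for whatever runs next.
remaining=$(printf 'leftover\n' | { fake_build Containerfile >/dev/null; cat; })

printf '%s\n' "$from_stdin"
printf 'still available after the build: %s\n' "$remaining"
```

The design choice is that stdin is opt-in: unless the caller names `-`, whatever is waiting on stdin belongs to the caller, which is what a script piped into bash relies on.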
github-actions bot added the locked - please file new issue/PR label Sep 22, 2023
github-actions bot locked as resolved and limited conversation to collaborators Sep 22, 2023