podman deadlocks with multiple concurrent running containers #11589
Comments
Can you check if you are able to use …
Yes, while …
if this can help: …
@mheon PTAL
Please get me a …
Sorry, …
About the Python wrapper: it creates a multiprocessing pool with set_start_method("spawn"), i.e. each subprocess runs a new Python interpreter. Each process starts a new podman container with the help of subprocess.run(command, capture_output=True, shell=True, cwd=cwd, timeout=timeout, check=False, errors='ignore', text=True, env=env), where command is a shell script that calls … Please let me know if you need more details.
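The wrapper described above can be sketched roughly as follows. This is a minimal reconstruction under stated assumptions (the function and variable names are invented for illustration), with a harmless echo standing in for the real shell script that invokes podman:

```python
import multiprocessing
import subprocess

def run_container(command, cwd=None, env=None, timeout=600):
    # Launch one shell command; in the real wrapper this is a shell
    # script that invokes podman. Output is captured, errors ignored.
    result = subprocess.run(command, capture_output=True, shell=True,
                            cwd=cwd, timeout=timeout, check=False,
                            errors='ignore', text=True, env=env)
    return result.returncode, result.stdout

def run_all(commands, workers=8):
    # 'spawn' start method: every worker is a fresh Python interpreter,
    # matching the setup described in this report.
    ctx = multiprocessing.get_context('spawn')
    with ctx.Pool(processes=workers) as pool:
        return pool.map(run_container, commands)

if __name__ == '__main__':
    # Stand-in commands; the real wrapper would run podman scripts here.
    print(run_all([f'echo job-{i}' for i in range(4)], workers=2))
```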
I can confirm that on my 16C/32T host, up to 8 concurrent processes work fine, while deadlocks arise as soon as 9 are used. No idea why, unfortunately.
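If the ceiling really is 8 concurrent podman invocations, one possible workaround (an assumption on my part, not a confirmed fix) is to throttle launches with a semaphore so no more than 8 run at once. The names below are invented for illustration, and harmless echo commands stand in for the real podman invocations:

```python
import subprocess
import threading

# Assumed cap: 8 was the highest concurrency reported to work reliably.
MAX_CONCURRENT = 8
_gate = threading.BoundedSemaphore(MAX_CONCURRENT)

def run_throttled(command):
    # Block until a slot is free, then run the command; at most
    # MAX_CONCURRENT subprocesses (e.g. podman runs) execute at once.
    with _gate:
        return subprocess.run(command, capture_output=True, shell=True,
                              text=True, check=False)

if __name__ == '__main__':
    # 16 harmless stand-in jobs; only 8 ever run simultaneously.
    threads = [threading.Thread(target=run_throttled,
                                args=(f'echo job-{i}',))
               for i in range(16)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```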
A friendly reminder that this issue had no activity for 30 days. |
Let's concentrate the discussion in #11940
Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)
/kind bug
Description
When spawning many podman processes in a short period of time, podman deadlocks.
The processes never exit and do not seem to consume any CPU time.
When this state is reached, invoking podman ps in another terminal deadlocks as well (no output, no exit).
I'm sharing the same volumes with read-only access, and a local host directory, across all the started podman containers.
Maybe I'm exhausting some limited resource (?), but I do not seem to receive any error message, so I'm not sure how to address this issue. In any case it leaves podman in a deadlocked state, which I believe is not supposed to happen.
Steps to reproduce the issue:
/usr/bin/podman run --rm --volumes-from binutils:z,ro --volumes-from gcc:z,ro --mount type=bind,source=$HOME/tmp,target=$HOME/tmp ... image build.sh
FWIW, the podman containers are started from a Python multiprocess-based script.
It seems that podman is able to cope with up to 8 concurrent processes; more processes trigger the error.
Describe the results you received:
Some processes complete (only a couple of them), most do not.
Most podman commands (such as podman ps) are no longer able to run until I force-kill all podman processes.
Describe the results you expected:
podman ps and podman info are able to report the current status without deadlocking.
Additional information you deem important (e.g. issue happens only occasionally):
Always happens.
Works with no trouble when docker is used instead of podman.
podman seems to get confused: podman ps, when not deadlocked, may report no containers while many podman processes are still running.
I also got some error messages I'm not able to interpret, such as:
ERRO[0000] error joining network namespace for container 7b00d1e329ba23d2878e8c8694fdcb28f1603e561e4aefd9051597d1846e3799: error retrieving network namespace at /run/user/502/netns/cni-1339be70-57d2-6b39-9f7b-0c7241ca14f6: unknown FS magic on "/run/user/502/netns/cni-1339be70-57d2-6b39-9f7b-0c7241ca14f6": 1021994
While trying to remove a container that had been left in the "Created" state, I got an error, but the container was nevertheless removed, and podman stats reported:
Error: container state improper
Output of podman version:
Output of podman info --debug:
Package info (e.g. output of rpm -q podman or apt list podman):
):Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?
Yes
Additional environment details (AWS, VirtualBox, physical, etc.):