Deadlock when container exits while killing by podman #15492

Closed
tyler92 opened this issue Aug 26, 2022 · 21 comments · Fixed by #15494
Labels
kind/bug · locked - please file new issue/PR

Comments

@tyler92
Contributor

tyler92 commented Aug 26, 2022

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

There is a situation where podman tries to lock the same mutex twice and deadlocks. After that, all podman commands stop working because they wait forever for that mutex to be released. There are two ways to reproduce this issue - a realistic one and an artificial one. The following steps describe the artificial but simple way:

Steps to reproduce the issue:

  1. Create a bash script with the following content (be careful - it contains a kill command):
#!/bin/bash

set -o errexit

podman rm -f -i label-for-kill 

while true; do
	podman info > /dev/null
	podman run --rm --init --name label-for-kill nginx &
	sleep 3
	echo send kill for `pgrep -f label-for-kill`
	kill `pgrep -f label-for-kill`
	sleep 3
done
  2. Execute the script.

Describe the results you received:

After some time the script will hang.

Describe the results you expected:

The script never stops and keeps doing its work.

Additional information you deem important (e.g. issue happens only occasionally):

The issue happens only occasionally. In my case it is not reproducible on an amd64 machine, but it reproduces very often on my slow ARMv7 device. When the issue happens I get the following output:

# ./test.sh 
send kill for 946 1015
/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
/docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
send kill for 1270 1337
/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
/docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
2022-08-26T07:53:55.000449985Z: kill container: No such process

and the process list looks like:

 1270 S  `- podman run --rm --init --name label-for-kill nginx
 1356 Z      `- [3] <defunct>
 1337 S  `- /usr/bin/conmon <...>
 1355 S      `- /usr/bin/podman <...> container cleanup --rm <...>

Output of podman version:

Client:       Podman Engine
Version:      4.3.0-dev
API Version:  4.3.0-dev
Go Version:   go1.17.8
Built:        Thu Jan  1 00:00:00 1970
OS/Arch:      linux/arm

Output of podman info:

host:
  arch: arm
  buildahVersion: 1.27.0
  cgroupControllers:
  - cpuset
  - cpu
  - cpuacct
  - blkio
  - memory
  - devices
  - freezer
  - net_cls
  - perf_event
  - net_prio
  - pids
  cgroupManager: systemd
  cgroupVersion: v1
  conmon:
    package: Unknown
    path: /usr/bin/conmon
    version: 'conmon version 2.0.29, commit: unknown'
  cpuUtilization:
    idlePercent: 87.12
    systemPercent: 10.63
    userPercent: 2.24
  cpus: 4
  distribution:
    distribution: buildroot
    version: "2022.02"
  eventLogger: journald
  hostname: comm99-dev
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 4.14.98
  linkmode: dynamic
  logDriver: k8s-file
  memFree: 180031488
  memTotal: 511803392
  networkBackend: cni
  ociRuntime:
    name: crun
    package: Unknown
    path: /usr/bin/crun
    version: |-
      crun version 1.4.3
      commit: 61c9600d1335127eba65632731e2d72bc3f0b9e8
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: ""
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: ""
    package: ""
    version: ""
  swapFree: 0
  swapTotal: 0
  uptime: 0h 17m 39.00s
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - docker.io
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - registry.centos.org
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 6
    paused: 0
    running: 0
    stopped: 6
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev
  graphRoot: /opt/containers/storage
  graphRootAllocated: 29639618560
  graphRootUsed: 3874271232
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 35
  runRoot: /run/containers/storage
  volumePath: /opt/containers/storage/volumes
version:
  APIVersion: 4.3.0-dev
  Built: 0
  BuiltTime: Thu Jan  1 00:00:00 1970
  GitCommit: ""
  GoVersion: go1.17.8
  Os: linux
  OsArch: linux/arm
  Version: 4.3.0-dev

Package info (e.g. output of rpm -q podman or apt list podman):

-

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)

Yes

Additional environment details (AWS, VirtualBox, physical, etc.):

@openshift-ci openshift-ci bot added the kind/bug label Aug 26, 2022
@vrothberg
Member

Thanks for reaching out and for providing the reproducer, @tyler92!

@mheon, that smells like it is caused by the exit-code changes. I will take a look.

@tyler92
Contributor Author

tyler92 commented Aug 26, 2022

libpod.Kill locks the container's mutex.
ociRuntime.KillContainer in some cases calls libpod.Wait, which wants to lock the same mutex again.

I tried to create a PR with a fix but, to be honest, it looks a little bit difficult. The simplest way is to unlock the mutex at the beginning of the libpod.Kill function (as the Attach function does), but a race condition can occur in that case.
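
To make the double lock concrete, here is a minimal, self-contained Go sketch of the pattern described above. It is not libpod code: the container type, its methods, and the timings are invented for illustration, and a plain sync.Mutex stands in for libpod's process-shared SHM lock, which is equally non-reentrant.

package main

import (
	"fmt"
	"sync"
	"time"
)

// container stands in for libpod's Container: a single non-reentrant
// mutex guards its state.
type container struct {
	mu sync.Mutex
}

// wait mimics a Wait-style call that needs the container lock.
func (c *container) wait() {
	c.mu.Lock() // second Lock from the same call chain: blocks forever
	defer c.mu.Unlock()
}

// kill mimics a Kill-style call: it takes the lock, signals the
// container, and - in the problematic code path - ends up waiting for
// the exit code while still holding the lock.
func (c *container) kill() {
	c.mu.Lock()
	defer c.mu.Unlock()
	// ... deliver the signal via the OCI runtime; if the container has
	// already exited, the status-update path falls through to wait():
	c.wait()
}

func main() {
	c := &container{}
	done := make(chan struct{})
	go func() { c.kill(); close(done) }()

	select {
	case <-done:
		fmt.Println("kill returned")
	case <-time.After(2 * time.Second):
		fmt.Println("kill is stuck: the same mutex was locked twice")
	}
}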

@vrothberg
Member

@tyler92 could you send SIGHUP to the deadlocked process and paste the stacktrace here?

@tyler92
Contributor Author

tyler92 commented Aug 26, 2022

In real life this can happen when we launch podman run --rm ... via SSH and the TCP connection is closed. In this scenario (as I understand it) sshd sends SIGTERM to every process within the SSH session. This is the original scenario where I found this problem.

@vrothberg
Member

In real life this can happen when we launch podman run --rm ... via SSH and the TCP connection is closed. In this scenario (as I understand it) sshd sends SIGTERM to every process within the SSH session. This is the original scenario where I found this problem.

Can you elaborate on what's deadlocking after the session is closed?

@vrothberg
Member

ociRuntime.KillContainer in some cases calls libpod.Wait, which wants to lock the same mutex again.

Can you point out that code path? I don't think that's possible.

@vrothberg
Member

ociRuntime.KillContainer in some cases calls libpod.Wait, which wants to lock the same mutex again.

Can you point out that code path? I don't think that's possible.

OK, found it: https://github.com/containers/podman/blob/main/libpod/oci_conmon_common.go#L283

@tyler92
Contributor Author

tyler92 commented Aug 26, 2022

For some reason I can't get the stack trace through SIGHUP, but I reproduced it in the IDE:

lock.(*SHMLock).Lock (shm_lock_manager_linux.go:113) github.com/containers/podman/v4/libpod/lock
libpod.(*Container).WaitForExit.func1 (container_api.go:512) github.com/containers/podman/v4/libpod
libpod.(*Container).WaitForExit (container_api.go:565) github.com/containers/podman/v4/libpod
libpod.(*Container).Wait (container_api.go:495) github.com/containers/podman/v4/libpod
libpod.(*ConmonOCIRuntime).UpdateContainerStatus (oci_conmon_common.go:283) github.com/containers/podman/v4/libpod
libpod.(*ConmonOCIRuntime).KillContainer (oci_conmon_common.go:339) github.com/containers/podman/v4/libpod
libpod.(*Container).Kill (container_api.go:229) github.com/containers/podman/v4/libpod
terminal.ProxySignals.func1 (sigproxy_linux.go:41) github.com/containers/podman/v4/pkg/domain/infra/abi/terminal
runtime.goexit (asm_amd64.s:1594) runtime
 - Async Stack Trace
terminal.ProxySignals (sigproxy_linux.go:30) github.com/containers/podman/v4/pkg/domain/infra/abi/terminal
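
The top frames also show where the signal comes from: podman run proxies signals it receives (the SIGTERM from the reproducer script, or from sshd) to the container, and that proxy goroutine is what ends up calling Kill. Below is a rough Go sketch of such a proxy, for orientation only - it is not podman's actual ProxySignals implementation, and the kill callback is a made-up stand-in.

package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// proxySignals forwards SIGTERM/SIGINT received by this process to the
// container via the supplied kill callback, loosely mirroring the
// ProxySignals goroutine in the stack trace above.
func proxySignals(kill func(sig syscall.Signal) error) {
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGTERM, syscall.SIGINT)
	go func() {
		for s := range sigs {
			// This is the moment the deadlock was triggered in podman:
			// Kill runs while the container may be exiting concurrently.
			if err := kill(s.(syscall.Signal)); err != nil {
				fmt.Fprintln(os.Stderr, "error forwarding signal:", err)
			}
		}
	}()
}

func main() {
	proxySignals(func(sig syscall.Signal) error {
		fmt.Println("would forward", sig, "to the container")
		return nil
	})
	// Keep the process alive for a while; send SIGTERM or SIGINT to it
	// to watch the forwarding happen.
	time.Sleep(time.Minute)
}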

@vrothberg
Member

Excellent, thanks @tyler92. I am sure we'll find a solution.

@tyler92
Contributor Author

tyler92 commented Aug 26, 2022

Can you elaborate on what's deadlocking after the session is closed?

The same thing as in the description, but in the SSH case SIGTERM is sent by the sshd daemon. Steps to reproduce:

  1. Connect by ssh to some host with podman
  2. Launch podman run --rm ...
  3. Close terminal window (yes, by clicking 'close' button) - this will lead to TCP session close
  4. Observe podman state on the remote host.

But FYI: I tried to reproduce this on an amd64 machine and had no success, whereas on ARMv7 roughly 50% of attempts lead to the issue. My ARMv7 host is significantly slower - maybe that's why I ran into the deadlocks this time =)

vrothberg added a commit to vrothberg/libpod that referenced this issue Aug 26, 2022
Commit 30e7cbc accidentally added a deadlock as Podman was waiting
for the exit code to show up when the container transitioned to stopped.
Code paths that require the exit code to be written (by the cleanup
process) should already be using `(*Container).Wait()` in a deadlock
free way.

[NO NEW TESTS NEEDED] as I did not manage to find a reproducer that
would work in CI.  Ultimately, it's a race condition.

Fixes: containers#15492
Signed-off-by: Valentin Rothberg <[email protected]>
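
For context, the "deadlock free way" mentioned in the commit message can be pictured as a wait that never holds the container lock across the whole wait, only for each short poll, so a concurrent Kill (or any other lock holder) can always make progress. The following is a rough sketch under that assumption - it is not the actual (*Container).Wait()/WaitForExit() code, and the type and field names are invented.

package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// container is again a stand-in for libpod's Container type.
type container struct {
	mu       sync.Mutex
	exited   bool
	exitCode int32
}

// waitForExit takes and releases the lock on every poll instead of
// holding it for the duration of the wait, which is what keeps this
// pattern deadlock-free.
func (c *container) waitForExit(poll, timeout time.Duration) (int32, error) {
	deadline := time.Now().Add(timeout)
	for {
		c.mu.Lock()
		exited, code := c.exited, c.exitCode
		c.mu.Unlock()

		if exited {
			return code, nil
		}
		if time.Now().After(deadline) {
			return -1, errors.New("timed out waiting for container to exit")
		}
		time.Sleep(poll)
	}
}

func main() {
	c := &container{}

	// Simulate the cleanup process recording the exit code a bit later.
	go func() {
		time.Sleep(300 * time.Millisecond)
		c.mu.Lock()
		c.exited, c.exitCode = true, 0
		c.mu.Unlock()
	}()

	code, err := c.waitForExit(50*time.Millisecond, 5*time.Second)
	fmt.Println("exit code:", code, "err:", err)
}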
@vrothberg
Member

@tyler92 could you test #15494?

@tyler92
Contributor Author

tyler92 commented Aug 26, 2022

Sure, give me some time.

@tyler92
Contributor Author

tyler92 commented Aug 26, 2022

After several iterations I don't see deadlocks anymore, but things look a little bit strange.

  1. After some time I see "Error: error creating container storage: the container name "label-for-kill" is already in use by 951c6a6b159eeaf138ade41993bcdeee83956842a7eed16605e7a4e89e7e9e36. You have to remove that container to be able to reuse that name: that name is already in use". It looks like the --rm option didn't complete its work.
  2. There are processes in the process list that never exit:
17308 S  `- /usr/bin/crun <...> create --bundle <...>
17829 S  `- /usr/bin/crun <...> create --bundle <...>
19599 S  `- /usr/bin/crun <...> create --bundle <...>
19976 S  `- /usr/bin/conmon <...> --exit-command-arg cleanup <...>
19980 S      `- /usr/bin/crun  <...> create --bundle <...>

This process list keeps growing until the error from (1) appears. And even if I stop my script, these processes still exist.

@vrothberg
Member

I don't have a good explanation for that observation. @mheon WDYT?

@mheon
Member

mheon commented Aug 26, 2022

The hung crun instances are concerning - those are OCI runtimes that have escaped the supervision of a Conmon. They could be completely hung in their setup code, but crun create will also naturally stop just short of actually exec()ing the first process in the container, so my initial inclination is that each of these stopped naturally, waiting for Podman to do a final crun start that never happened - the equivalent of Podman stopping midway through podman run, possibly due to an error. This could explain why the container failed to clean up fully (the crun process is using files from the container's rootfs, so we can't unmount it, can't remove it from c/storage, and the name stays in use). I don't know what could actually be causing this, though. I would assume it's a failed podman run - are there any logs to that effect?
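
For readers unfamiliar with the OCI runtime lifecycle mheon describes, here is a rough Go sketch of the create/start split - purely illustrative, assuming crun is installed and that /tmp/label-for-kill is an already prepared OCI bundle; both the container ID and the bundle path are made up.

package main

import (
	"log"
	"os/exec"
)

func main() {
	const id = "label-for-kill"          // hypothetical container ID
	const bundle = "/tmp/label-for-kill" // hypothetical OCI bundle (config.json + rootfs)

	// "create" sets up namespaces, mounts, etc. and returns, leaving a
	// parked runtime process that has not yet exec()ed the workload.
	if out, err := exec.Command("crun", "create", "--bundle", bundle, id).CombinedOutput(); err != nil {
		log.Fatalf("crun create: %v: %s", err, out)
	}

	// If the managing process (Podman/conmon in the real flow) dies at
	// this point, that parked process lingers - matching the stuck
	// "crun ... create --bundle ..." entries in the process list above.

	// "start" releases the parked process so it exec()s the container's
	// first process.
	if out, err := exec.Command("crun", "start", id).CombinedOutput(); err != nil {
		log.Fatalf("crun start: %v: %s", err, out)
	}
}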

@tyler92
Contributor Author

tyler92 commented Aug 27, 2022

Ok, thanks for the fix. I'll investigate other problems and create separate issues with details if necessary.

@vrothberg
Member

Thanks a lot, @tyler92 !

vrothberg added a commit to vrothberg/libpod that referenced this issue Aug 28, 2022
Commit 30e7cbc accidentally added a deadlock as Podman was waiting
for the exit code to show up when the container transitioned to stopped.
Code paths that require the exit code to be written (by the cleanup
process) should already be using `(*Container).Wait()` in a deadlock
free way.

[NO NEW TESTS NEEDED] as I did not manage to find a reproducer that
would work in CI.  Ultimately, it's a race condition.

Backport-for: containers#15492
Signed-off-by: Valentin Rothberg <[email protected]>
@tyler92
Contributor Author

tyler92 commented Aug 29, 2022

I'm not ready to create a separate issue right now, but I gathered logs:

podman-run.log (here is one line from test script)
process-list.txt

It looks like podman run was killed right after the conmon process started. And conmon is not killed by SIGTERM (because its child crun is not killed?). What do you think - should I create an issue with this information? It's a somewhat artificial case, but it can leave useless processes behind.

@mheon
Member

mheon commented Aug 29, 2022

If Podman is ending up in a bad state (podman inspect output for the container in question would help), then this is definitely worth a bug.

@tyler92
Contributor Author

tyler92 commented Aug 29, 2022

podman-inspect.log

Does it look like a bad state?

@mheon
Member

mheon commented Aug 29, 2022

Yep. The container is stuck in Created state, when it should be in Initialized. It doesn't have a PID registered for Conmon. It looks like Podman died before it could register that Conmon had started.

vrothberg added a commit to vrothberg/libpod that referenced this issue Sep 12, 2022
Commit 30e7cbc accidentally added a deadlock as Podman was waiting
for the exit code to show up when the container transitioned to stopped.
Code paths that require the exit code to be written (by the cleanup
process) should already be using `(*Container).Wait()` in a deadlock
free way.

[NO NEW TESTS NEEDED] as I did not manage to find a reproducer that
would work in CI.  Ultimately, it's a race condition.

Backport-for: containers#15492
BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2124716
BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2125647
Signed-off-by: Valentin Rothberg <[email protected]>
@github-actions github-actions bot added the locked - please file new issue/PR label Sep 17, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 17, 2023