podman run --rm does not work when lots of container process are killed at same time. #7051
Comments
You should be able to get a newer version of Podman for RHEL 8.2.1. Try to update the podman package and report whether the issue is fixed. RHEL 8.2.1 was released yesterday, I believe.
If this persists in 1.9, a proper reproducer would be greatly appreciated: how many containers are involved, does the command in question matter, what command(s) are used to remove them, etc.
The overall workflow is a little complicated. We started an execution daemon (execution server -> res) on the host to stop/kill containers, but both of them hit the issue above. Here is an example: for 029fa9794390, the status is "Up" but the actual state is not correct.

boliu@lsf1x125[conf]:$ podman exec -it 029fa9794390 /bin/bash
boliu@lsf1x125[conf]:$ podman stop 029fa9794390
boliu@lsf1x125[conf]:$ podman rm 029fa9794390

In our container controller script log, we can see messages like the one below, e.g. for the container cgroup file /sys/fs/cgroup/systemd/user.slice/user-34040.slice/[email protected]/user.slice/podman-1733134.scope/138ead2c5272d3b33d30ef962086cc144f2cfa37c1b54aa3b8d218c435cdfe21/cgroup.procs. The running container process has been killed, but the container remains.

2020-07-22 03:40:08,664 lsf-docker[1734843] DEBUG : err:

But if I run "podman stop " and "podman rm -f " manually in a console, … For this error, I can make sure that the environment variable XDG_RUNTIME_DIR is unset, uid/gid is not 0, and HOME is set to the container owner, but the error still occurs. Here is my code for it:

if uid == 0:

Any input? Regards,
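The environment handling described in this comment can be sketched roughly as follows. This is a minimal illustration based only on the checks named above; build_podman_env and owner_home are illustrative names, not code from the reporter's actual lsf-docker script.

```python
import os

def build_podman_env(uid, owner_home, base_env=None):
    """Build the environment for a rootless podman invocation,
    mirroring the checks described above: for a non-root uid,
    unset XDG_RUNTIME_DIR and point HOME at the container owner."""
    env = dict(base_env if base_env is not None else os.environ)
    if uid == 0:
        return env  # root: leave the environment untouched
    env.pop("XDG_RUNTIME_DIR", None)  # make sure it is unset
    env["HOME"] = owner_home          # container owner's home directory
    return env
```

The dictionary returned here would be passed as the env argument of whatever subprocess call launches podman, so the child sees exactly the variables described.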
BTW, I found that podman core dumps sometimes; here is the backtrace:

[Thread debugging using libthread_db enabled]
warning: Loadable section ".note.gnu.property" outside of ELF segments
(the warning above is repeated six times)

Leo.
Looks like a sig-proxy race; sending a signal to a container that is already dead can lead to a panic. We've fixed several races there since 1.6.4, so I'd be interested to see if it reproduces on 1.9.3 and/or master.
A friendly reminder that this issue had no activity for 30 days. |
Reopen if this is not fixed in the upstream branch.
/kind bug
Description
Steps to reproduce the issue:
1. Start some containers at the same time.
2. Kill these containers at the same time.
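The two steps above can be sketched as a small driver script. This is a hedged reproducer, not the reporter's actual setup: the image name, container count, and sleep duration are placeholders, and it simply launches several `podman run --rm` processes and then kills them all at once.

```python
import subprocess

IMAGE = "registry.access.redhat.com/ubi8"  # placeholder image


def run_cmd(name, image=IMAGE):
    """Command line for one disposable container; --rm should
    remove the container as soon as it exits."""
    return ["podman", "run", "--rm", "--name", name, image, "sleep", "600"]


def reproduce(n=10):
    """Step 1: start n containers at the same time.
    Step 2: kill them all at (roughly) the same time."""
    procs = [subprocess.Popen(run_cmd(f"stress-{i}")) for i in range(n)]
    subprocess.run(["podman", "kill", "--all"])
    for p in procs:
        p.wait()
    # Any containers still listed here demonstrate the bug:
    # --rm did not clean them up.
    subprocess.run(["podman", "ps", "--all"])
```

With the bug present, the final `podman ps --all` would still show some of the killed containers instead of an empty list.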
Describe the results you received:
Some of the containers are not removed and are left in an incorrect state.
Describe the results you expected:
All containers should be removed, since --rm is specified.
Additional information you deem important (e.g. issue happens only occasionally):
Output of podman info --debug (includes the podman version):

[boliu@lsf1x125 ~]$ podman info --debug
debug:
compiler: gc
git commit: ""
go version: go1.13.4
podman version: 1.6.4
host:
BuildahVersion: 1.12.0-dev
CgroupVersion: v1
Conmon:
package: conmon-2.0.6-1.module+el8.2.0+6368+cf16aa14.x86_64
path: /usr/bin/conmon
version: 'conmon version 2.0.6, commit: 9adfe850ef954416ea5dd0438d428a60f2139473'
Distribution:
distribution: '"rhel"'
version: "8.1"
IDMappings:
gidmap:
- container_id: 0
host_id: 10007
size: 1
- container_id: 1
host_id: 100000
size: 65536
uidmap:
- container_id: 0
host_id: 34040
size: 1
- container_id: 1
host_id: 100000
size: 65536
MemFree: 4692107264
MemTotal: 8189198336
OCIRuntime:
name: runc
package: runc-1.0.0-65.rc10.module+el8.2.0+6368+cf16aa14.x86_64
path: /usr/bin/runc
version: 'runc version spec: 1.0.1-dev'
SwapFree: 4287361024
SwapTotal: 4294963200
arch: amd64
cpus: 4
eventlogger: journald
hostname: lsf1x125
kernel: 4.18.0-193.el8.x86_64
os: linux
rootless: true
slirp4netns:
Executable: /usr/bin/slirp4netns
Package: slirp4netns-0.4.2-3.git21fdece.module+el8.2.0+6368+cf16aa14.x86_64
Version: |-
slirp4netns version 0.4.2+dev
commit: 21fdece2737dc24ffa3f01a341b8a6854f8b13b4
uptime: 55h 51m 24.3s (Approximately 2.29 days)
registries:
blocked: null
insecure: null
search:
store:
ConfigFile: /home/boliu/.config/containers/storage.conf
ContainerStore:
number: 89
GraphDriverName: overlay
GraphOptions:
overlay.mount_program:
Executable: /usr/bin/fuse-overlayfs
Package: fuse-overlayfs-0.7.2-5.module+el8.2.0+6368+cf16aa14.x86_64
Version: |-
fuse-overlayfs: version 0.7.2
FUSE library version 3.2.1
using FUSE kernel interface version 7.26
GraphRoot: /opt/boliu/podman/containers/storage
GraphStatus:
Backing Filesystem: xfs
Native Overlay Diff: "false"
Supports d_type: "true"
Using metacopy: "false"
ImageStore:
number: 1
RunRoot: /tmp/run-34040
VolumePath: /opt/boliu/podman/containers/storage/volumes
[boliu@lsf1x125 ~]$ rpm -qa | grep podman
podman-1.6.4-11.module+el8.2.0+6368+cf16aa14.x86_64
podman-docker-1.6.4-11.module+el8.2.0+6368+cf16aa14.noarch
Additional environment details (AWS, VirtualBox, physical, etc.):
Virtual machine on a VMware host.