podman hanging on a lot of subcommands without logs #9228

Closed
b-ncMN opened this issue Feb 4, 2021 · 36 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.

Comments

@b-ncMN

b-ncMN commented Feb 4, 2021

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description
Podman appears to hang on my system when I run any of the following commands:
podman auto-update
podman build
podman events
podman exec
podman images
podman import
podman info
podman inspect
podman load
podman login
podman logout
podman mount
podman pause
podman port
podman ps
podman pull
podman rmi
podman run
podman start
podman stats
podman top
podman unpause
podman unshare
podman version
podman wait

Steps to reproduce the issue:
(Haven't tried to reproduce this anywhere else)

  1. Install openSUSE Leap 15.2
  2. Run zypper in toolbox
  3. Run toolbox -u or any of the commands listed above
  4. Observe that the command hangs

Describe the results you received:
All the subcommands listed above hang without printing anything to the journal.

Describe the results you expected:
I expected everything to work correctly. I first hit this bug while manually running "podman pull registry.opensuse.org/opensuse/toolbox:latest", and then noticed that podman was not working properly at all.

Additional information you deem important (e.g. issue happens only occasionally):
It happens every time. I have tried rebooting, but that does not fix the issue.
I have checked that virtualization is enabled in my BIOS (it is, and I can also run virtual machines with virsh successfully).

Output of podman version:
podman version hangs but here is the version reported by zypper info podman :

Information for package podman:
-------------------------------
Repository     : Main Update Repository
Name           : podman
Version        : 2.1.1-lp152.4.6.1
Arch           : x86_64
Vendor         : openSUSE
Installed Size : 93.6 MiB
Installed      : Yes
Status         : up-to-date
Source package : podman-2.1.1-lp152.4.6.1.src
Summary        : Daemon-less container engine for managing containers, pods and images
Description    :
    Podman is a container engine for managing pods, containers, and container
    images.
    It is a standalone tool and it directly manipulates containers without the need
    of a container engine daemon.
    Podman is able to interact with container images create in buildah, cri-o, and
    skopeo, as they all share the same datastore backend.

Output of podman info --debug:
(hangs...)

Package info (e.g. output of rpm -q podman or apt list podman):

podman-2.1.1-lp152.4.6.1.x86_64

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?

No

Additional environment details (AWS, VirtualBox, physical, etc.):
Physical (my laptop)
OS/Distro : Linux/openSUSE leap 15.2
Kernel : Linux inftop 5.3.18-lp152.60-default #1 SMP Tue Jan 12 23:10:31 UTC 2021 (9898712) x86_64 x86_64 x86_64 GNU/Linux

@openshift-ci-robot openshift-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Feb 4, 2021
@b-ncMN
Author

b-ncMN commented Feb 4, 2021

After playing around with podman a bit, I noticed that running those commands as root seems to work.
I am now wondering whether this is an issue related to user permissions.

@Luap99
Member

Luap99 commented Feb 4, 2021

Do you have the newuidmap and newgidmap binaries installed on your system?
This sounds like #7890

@b-ncMN
Author

b-ncMN commented Feb 4, 2021

Yes, I have them installed.

Following the description in the issue you mentioned, I ran an strace:
https://susepaste.org/35177576
What seems odd is that the issue you mentioned describes podman looping on a futex, which looks like what is happening here too.
After a few lines, strace indicates that it gets stuck at

wait4(2732, [{WIFEXITED(s) && WEXITSTATUS(s) == 1}], 0, {ru_utime={tv_sec=0, tv_usec=0}, ru_stime={tv_sec=0, tv_usec=619}, ...}) = 2732

(line 2405 in the paste, the next lines are from me pressing ^C)

@b-ncMN
Author

b-ncMN commented Feb 4, 2021

I have tried to reproduce this on my desktop (which runs the same distro), and it works fine there.

@rhatdan
Member

rhatdan commented Feb 4, 2021

Does this hang?

$ podman unshare cat /proc/self/uid_map

@b-ncMN
Author

b-ncMN commented Feb 7, 2021

Hi, sorry for the late answer,

yes it does

@rhatdan
Member

rhatdan commented Feb 8, 2021

Well I have no idea what is going on. Have you tried to reboot this system, or have you tried this with a different account?

@rhatdan
Member

rhatdan commented Feb 8, 2021

Is there anything special about your homedir? E.g., is it NFS based?

@b-ncMN
Author

b-ncMN commented Feb 8, 2021

Yes, I have tried rebooting (multiple times, even). I haven't tried using another user yet; I will report back on that.

I have a regular btrfs root partition with nothing special about my homedir

@rhatdan
Member

rhatdan commented Feb 9, 2021

Perhaps this has something to do with btrfs.

@giuseppe Any ideas?

@b-ncMN
Author

b-ncMN commented Feb 9, 2021

It does run under another user.

@vrothberg
Member

Does buildah pull or skopeo copy docker://alpine containers-storage:alpine work?

@rhatdan
Member

rhatdan commented Feb 10, 2021

Something in your homedir setup is causing this to fail.

@b-ncMN
Author

b-ncMN commented Feb 10, 2021

I haven't done anything particular in my home directory that could be causing this to fail. In fact, I didn't even have podman installed before I hit this bug; podman got pulled in when I installed toolbox.

Is there a set of files / configs that are related to podman I could check ?

@b-ncMN
Author

b-ncMN commented Feb 10, 2021

I do not have the buildah or skopeo commands available under either of my users (test and infrandomness), yet pulling (and all the other subcommands) works under the test user.

@rhatdan
Member

rhatdan commented Feb 11, 2021

Could you run rm -rf ~/.config/containers ~/.local/share/containers and see if it still hangs?

@rhatdan
Member

rhatdan commented Feb 11, 2021

Also, could you show the output of printenv from the user account that does not work? Perhaps some setting in the environment is causing this to hang.

@b-ncMN
Author

b-ncMN commented Feb 12, 2021

I tried to remove ~/.config/containers and ~/.local/share/containers, it didn't help

here's my env : https://susepaste.org/86410544

@rhatdan
Member

rhatdan commented Feb 12, 2021

Does
podman --log-level=debug info

give you any information?
@giuseppe @mheon PTAL

@mheon
Member

mheon commented Feb 12, 2021

Vague theory: remove anything with libpod in the name in /dev/shm in case there's a locking issue.

@b-ncMN
Author

b-ncMN commented Feb 12, 2021

podman --log-level=debug info

https://susepaste.org/32558978

@mheon
Member

mheon commented Feb 12, 2021

DEBU[0000] error from newgidmap: newgidmap: gid range [1-65537) -> [100000-165536) not allowed

Interesting. Does your user have a valid entry in /etc/subgid?

@b-ncMN
Author

b-ncMN commented Feb 12, 2021

cat /etc/subgid

infrandomness:100000:65536
test:100000:65536

I wonder why I have the same range as test and yet it isn't working for me; it was already failing before I created the test account on my system.
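Editorial sketch: the newgidmap error above asks for the host range [100000-165536), i.e. 65536 IDs starting at 100000. A quick way to see whether a subordinate-ID file contains an entry covering such a range is shown below. The function name `subid_covers` and the temporary sample file are hypothetical, introduced only for illustration, and the check looks at nothing but the file contents (newgidmap performs its own validation, which this does not replicate).

```shell
#!/bin/sh
# subid_covers FILE USER START COUNT   (hypothetical helper, not part of podman or shadow)
# Succeeds if some "USER:first:length" line in FILE contains [START, START+COUNT).
subid_covers() {
    awk -F: -v user="$2" -v start="$3" -v count="$4" '
        $1 == user && start + 0 >= $2 + 0 && start + count <= $2 + $3 { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Sample data copied from the /etc/subgid contents posted in this thread.
sample=$(mktemp)
printf 'infrandomness:100000:65536\ntest:100000:65536\n' > "$sample"

if subid_covers "$sample" infrandomness 100000 65536; then
    echo "covered"
else
    echo "not covered"
fi
rm -f "$sample"
```

With the posted entries this prints "covered", since 100000 + 65536 = 165536. If the file itself allows the range, the rejection probably has to come from somewhere other than a missing or too-small entry.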

@rhatdan
Member

rhatdan commented Feb 15, 2021

Is newgidmap setuid, or does it have file capabilities (getcap)?

$ getcap /usr/bin/newgidmap
/usr/bin/newgidmap cap_setgid=ep
$ ls -l /usr/bin/newgidmap
-rwxr-xr-x. 1 root root 29848 Nov 16 04:17 /usr/bin/newgidmap

@saschagrunert @vrothberg Any ideas?
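Editorial sketch: the two checks above (the setuid bit via ls -l, file capabilities via getcap) can be combined into one small helper. The function name `check_idmap_helper` is hypothetical, and the /usr/bin/newuidmap path is an assumption made by analogy with the /usr/bin/newgidmap path quoted above. The getcap branch reports any file capability, not specifically cap_setgid, and is skipped when getcap is not on the PATH.

```shell
#!/bin/sh
# check_idmap_helper PATH   (hypothetical helper)
# Prints one of: missing, setuid, filecaps, unprivileged.
check_idmap_helper() {
    helper="$1"
    if [ ! -e "$helper" ]; then
        echo "missing"
    elif [ -u "$helper" ]; then
        # The setuid bit is set (shown as "s" in the owner field of ls -l).
        echo "setuid"
    elif command -v getcap >/dev/null 2>&1 && [ -n "$(getcap "$helper" 2>/dev/null)" ]; then
        # getcap printed something, so the file carries at least one capability.
        echo "filecaps"
    else
        echo "unprivileged"
    fi
}

# Assumed install locations; adjust if the distribution puts them elsewhere.
for helper in /usr/bin/newuidmap /usr/bin/newgidmap; do
    printf '%s: %s\n' "$helper" "$(check_idmap_helper "$helper")"
done
```

A result of "unprivileged" or "missing" for either helper would be one plausible reason for rootless setup to fail; "setuid" or "filecaps" means the helper itself is at least installed with the privileges it needs.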

@vrothberg
Member

DEBU[0000] error from newgidmap: newgidmap: gid range [1-65537) -> [100000-165536) not allowed

The error message points to gid. @InfRandomness can you also share cat /etc/subgid?

@b-ncMN
Author

b-ncMN commented Feb 15, 2021

sudo getcap /usr/bin/newgidmap
prints nothing

ls -l /usr/bin/newgidmap
[screenshot of the ls -l output]

cat /etc/subgid

infrandomness:100000:65536
test:100000:65536

The fact that the path in the screenshot is red most likely has a special meaning.

@rhatdan
Member

rhatdan commented Feb 15, 2021

Yes, it means it is setuid (or has file caps); in this case it is setuid.
Everything looks fine, and I have no idea why this is blowing up. Perhaps some setting in SUSE blocks the use of the uid range.

Could you change the range to see whether duplicate ranges in /etc/subuid are being rejected?
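Editorial sketch: to see whether two users share or overlap a subordinate-ID range, the file can be scanned pairwise. The function name `subid_overlaps` is hypothetical and the sample file reproduces the /etc/subgid contents posted earlier in the thread. Whether overlapping ranges are actually rejected by newuidmap/newgidmap is the open question in this thread, so this only detects overlaps and makes no claim about their effect.

```shell
#!/bin/sh
# subid_overlaps FILE   (hypothetical helper)
# Prints "A overlaps B" for every pair of different users whose ranges intersect.
subid_overlaps() {
    awk -F: '
        NF == 3 {
            first = $2 + 0; len = $3 + 0
            for (i = 1; i <= n; i++)
                # Half-open intervals [lo, lo+cnt) intersect when each starts before the other ends.
                if (name[i] != $1 && first < lo[i] + cnt[i] && lo[i] < first + len)
                    print name[i] " overlaps " $1
            n++; name[n] = $1; lo[n] = first; cnt[n] = len
        }
    ' "$1"
}

# Sample data copied from the /etc/subgid contents posted earlier in this thread.
sample=$(mktemp)
printf 'infrandomness:100000:65536\ntest:100000:65536\n' > "$sample"
subid_overlaps "$sample"
rm -f "$sample"
```

For the sample this prints "infrandomness overlaps test". Moving one user to start at 165536 makes the two ranges adjacent rather than overlapping, and the function then prints nothing.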

@b-ncMN
Author

b-ncMN commented Feb 15, 2021

Here are my new ranges :

cat /etc/subgid /etc/subuid

infrandomness:165536:65536
test:100000:65536
infrandomness:165536:65536
test:165536:65536

I logged out and back in after changing those, and the issue is still happening.

@saschagrunert
Member

Uh, I did not test it on Leap for quite a while. I think we have to debug it within a VM if it's reproducible.

@b-ncMN
Author

b-ncMN commented Feb 15, 2021

I can test it out in a VM a bit later on

@b-ncMN
Author

b-ncMN commented Feb 15, 2021

Unfortunately, I wasn't able to reproduce this in a VM.

@b-ncMN
Author

b-ncMN commented Feb 17, 2021

I think I'm just gonna reinstall my system and see how it goes after

@b-ncMN
Author

b-ncMN commented Feb 21, 2021

I think this issue can be closed now, since it comes from something in my home folder and not from the binary itself.

@vrothberg
Member

Thanks for the report and working with us, @InfRandomness !

@b-ncMN
Author

b-ncMN commented Feb 22, 2021

I'm coming back to this because, since a recent podman update, I have more information about the situation:

"podman pull opensuse/tumbleweed" no longer hangs and now prints:
"Error: cannot setup namespace using newgidmap: exit status 1"

I have also tried "buildah --debug unshare", here's what I get :

*DEBU running [buildah-in-a-user-namespace --debug unshare] with environment [LIBVA_DRIVER_NAME=iHD LS_COLORS=no=00:fi=00:di=01;34:ln=00;36:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=41;33;01:ex=00;32:*.cmd=00;32:*.exe=01;32:*.com=01;32:*.bat=01;32:*.btm=01;32:*.dll=01;32:*.tar=00;31:*.tbz=00;31:*.tgz=00;31:*.rpm=00;31:*.deb=00;31:*.arj=00;31:*.taz=00;31:*.lzh=00;31:*.lzma=00;31:*.zip=00;31:*.zoo=00;31:*.z=00;31:*.Z=00;31:*.gz=00;31:*.bz2=00;31:*.tb2=00;31:*.tz2=00;31:*.tbz2=00;31:*.xz=00;31:*.avi=01;35:*.bmp=01;35:*.dl=01;35:*.fli=01;35:*.gif=01;35:*.gl=01;35:*.jpg=01;35:*.jpeg=01;35:*.mkv=01;35:*.mng=01;35:*.mov=01;35:*.mp4=01;35:*.mpg=01;35:*.pcx=01;35:*.pbm=01;35:*.pgm=01;35:*.png=01;35:*.ppm=01;35:*.svg=01;35:*.tga=01;35:*.tif=01;35:*.webm=01;35:*.webp=01;35:*.wmv=01;35:*.xbm=01;35:*.xcf=01;35:*.xpm=01;35:*.aiff=00;32:*.ape=00;32:*.au=00;32:*.flac=00;32:*.m4a=00;32:*.mid=00;32:*.mp3=00;32:*.mpc=00;32:*.ogg=00;32:*.voc=00;32:*.wav=00;32:*.wma=00;32:*.wv=00;32: HOSTTYPE=x86_64 XDG_CONFIG_HOME=/home/infrandomness/.config XAUTHLOCALHOSTNAME=inftop LESSCLOSE=lessclose.sh %s %s XKEYSYMDB=/usr/X11R6/lib/X11/XKeysymDB XDG_MENU_PREFIX=gnome- LANG=en_US.UTF-8 WINDOWMANAGER=gnome LESS=-M -I -R MANAGERPID=2790 DISPLAY=:0 JAVA_ROOT=/usr/lib64/jvm/java HOSTNAME=inftop INVOCATION_ID=1ec56de293f945168901ffca7e653025 ALACRITTY_LOG=/tmp/Alacritty-3479.log CONFIG_SITE=/usr/share/site/x86_64-unknown-linux-gnu CSHEDIT=emacs GTK2_MODULES=unity-gtk-module GPG_TTY=/dev/pts/0 AUDIODRIVER=pulseaudio LESS_ADVANCED_PREPROCESSOR=no COLORTERM=truecolor USERNAME=infrandomness JAVA_HOME=/usr/lib64/jvm/java ALSA_CONFIG_PATH=/etc/alsa-pulse.conf MACHTYPE=x86_64-suse-linux GIO_LAUNCHED_DESKTOP_FILE_PID=3479 GTK3_MODULES=unity-gtk-module SSH_AUTH_SOCK=/run/user/1000/keyring/ssh QEMU_AUDIO_DRV=pa MINICOM=-c on QT_SYSTEM_DIR=/usr/share/desktop-data OSTYPE=linux USER=infrandomness PAGER=less DESKTOP_SESSION=default MORE=-sl PWD=/home/infrandomness SSH_ASKPASS=/usr/lib/ssh/ssh-askpass 
HOME=/home/infrandomness JOURNAL_STREAM=9:45294 SSH_AGENT_PID=3039 HOST=inftop XNLSPATH=/usr/share/X11/nls XDG_SESSION_TYPE=x11 SDK_HOME=/usr/lib64/jvm/java XDG_DATA_DIRS=/home/infrandomness/.local/share/flatpak/exports/share:/var/lib/flatpak/exports/share:/usr/local/share:/usr/share JDK_HOME=/usr/lib64/jvm/java XDG_SESSION_DESKTOP=default PROFILEREAD=true GJS_DEBUG_OUTPUT=stderr GTK_MODULES=canberra-gtk-module FROM_HEADER= MAIL=/var/spool/mail/infrandomness UBUNTU_MENUPROXY=1 WINDOWPATH=2 LESSKEY=/etc/lesskey.bin TERM=xterm-256color SHELL=/bin/bash QT_IM_MODULE=xim XMODIFIERS=@im=local LS_OPTIONS=-N --color=tty -T 0 XCURSOR_THEME=DMZ XDG_CURRENT_DESKTOP=GNOME GIO_LAUNCHED_DESKTOP_FILE=/home/infrandomness/.local/share/applications/Alacritty.desktop PYTHONSTARTUP=/etc/pythonstart SHLVL=1 G_FILENAME_ENCODING=@locale,UTF-8,ISO-8859-15,CP1252 MANPATH=/usr/local/man:/usr/local/share/man:/usr/share/man WINDOWID=37748738 XSESSION_IS_UP=yes GDMSESSION=default LOGNAME=infrandomness DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus XDG_RUNTIME_DIR=/run/user/1000 XAUTHORITY=/run/user/1000/gdm/Xauthority JRE_HOME=/usr/lib64/jvm/java XDG_CONFIG_DIRS=/etc/xdg PATH=/home/infrandomness/.cargo/bin:/home/infrandomness/bin:/usr/local/bin:/usr/bin:/bin:usr/local/bin:usr/local/bin JAVA_BINDIR=/usr/lib64/jvm/java/bin SDL_AUDIODRIVER=pulse QT_IM_SWITCHER=imsw-multi G_BROKEN_FILENAMES=1 HISTSIZE=1000 GJS_DEBUG_TOPICS=JS ERROR;JS LOG SESSION_MANAGER=local/inftop:@/tmp/.ICE-unix/3087,unix/inftop:/tmp/.ICE-unix/3087 CPU=x86_64 CVS_RSH=ssh LESSOPEN=lessopen.sh %s GTK_IM_MODULE=cedilla _=/usr/bin/buildah TMPDIR=/var/tmp _CONTAINERS_USERNS_CONFIGURED=1 BUILDAH_ISOLATION=rootless], UID map [{ContainerID:0 HostID:1000 Size:1} {ContainerID:1 HostID:100000 Size:65536}], and GID map [{ContainerID:0 HostID:100 Size:1} {ContainerID:1 HostID:100000 Size:65536}]
WARN error running newgidmap: exit status 1: newgidmap: gid range [1-65537) -> [100000-165536) not allowed
WARN falling back to single mapping

The last two warnings look interesting to me 🤔

@rhatdan
Member

rhatdan commented Feb 22, 2021

As the logged-in user, run:
$ cat /proc/self/uid_map
0 0 4294967295
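Editorial sketch: the line "0 0 4294967295" is the full identity mapping that a process in the initial user namespace sees, so this check presumably aims to find out whether the login session is already running inside some other user namespace. A minimal way to script that comparison is shown below; the function name `in_initial_userns` is hypothetical, and the optional file argument exists only so the logic can be tried against saved output.

```shell
#!/bin/sh
# in_initial_userns [FILE]   (hypothetical helper)
# Succeeds if FILE (default: /proc/self/uid_map) contains the full identity
# mapping "0 0 4294967295".
in_initial_userns() {
    awk '$1 == 0 && $2 == 0 && $3 == 4294967295 { full = 1 }
         END { exit (full ? 0 : 1) }' "${1:-/proc/self/uid_map}"
}

if in_initial_userns; then
    echo "full identity mapping"
else
    echo "restricted or unreadable mapping"
fi
```

A restricted mapping for an ordinary login shell would suggest the session was started inside a user namespace, which could explain why a further newgidmap call is refused; that interpretation is an assumption, not something the thread confirms.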

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 22, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 22, 2023