podman hangs on "receiving" too much output to console #9183

Closed
justinkb opened this issue Feb 1, 2021 · 7 comments
Labels: kind/bug, locked - please file new issue/PR

Comments

@justinkb

justinkb commented Feb 1, 2021

/kind bug

Description

Commands run in containers hang when they write too much data to stdout too quickly.

Steps to reproduce the issue:

podman run docker.io/archlinux hexdump -C /dev/random

Describe the results you received:

Execution hangs after roughly 20k of the file has been hexdumped.

Describe the results you expected:

Execution should output the entire hexadecimal representation of the C library binary (a file of roughly 200k).

Additional information you deem important (e.g. issue happens only occasionally):

Not specific to hexdump, by the way: anything that prints enough output triggers the hang, even cat-ing a large file. I first noticed it when running "pacman -Qi" (which prints all package info verbosely on Arch Linux).

Output of podman version:

Version:      2.2.1
API Version:  2.1.0
Go Version:   go1.15.6
Git Commit:   a0d478edea7f775b7ce32f8eb1a01e75374486cb
Built:        Tue Dec  8 22:48:23 2020
OS/Arch:      linux/amd64

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.18.0
  cgroupManager: cgroupfs
  cgroupVersion: v2
  conmon:
    package: Unknown
    path: /usr/bin/conmon
    version: 'conmon version 2.0.25, commit: 05ce716ac6d1cfeeb27b9280832abd2e9d1a085f'
  cpus: 12
  distribution:
    distribution: arch
    version: unknown
  eventLogger: journald
  hostname: rho
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 2262144
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 2262144
  kernel: 5.10.11-arch1-1
  linkmode: dynamic
  memFree: 29999202304
  memTotal: 33598287872
  ociRuntime:
    name: crun
    package: Unknown
    path: /usr/bin/crun
    version: |-
      crun version 0.17
      commit: 0e9229ae34caaebcb86f1fde18de3acaf18c6d9a
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  rootless: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: Unknown
    version: |-
      slirp4netns version 1.1.8
      commit: d361001f495417b880f20329121e3aa431a8f90f
      libslirp: 4.4.0
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.1
  swapFree: 8205344768
  swapTotal: 8589930496
  uptime: 1h 30m 21.03s (Approximately 0.04 days)
registries: {}
store:
  configFile: /home/paul/.config/containers/storage.conf
  containerStore:
    number: 2
    paused: 0
    running: 0
    stopped: 2
  graphDriverName: btrfs
  graphOptions: {}
  graphRoot: /home/paul/.local/share/containers/storage
  graphStatus:
    Build Version: 'Btrfs v5.9 '
    Library Version: "102"
  imageStore:
    number: 1
  runRoot: /run/user/1000/containers
  volumePath: /home/paul/.local/share/containers/storage/volumes
version:
  APIVersion: 2.1.0
  Built: 1607464103
  BuiltTime: Tue Dec  8 22:48:23 2020
  GitCommit: a0d478edea7f775b7ce32f8eb1a01e75374486cb
  GoVersion: go1.15.6
  OsArch: linux/amd64
  Version: 2.2.1

Package info (e.g. output of rpm -q podman or apt list podman):

$ pacman -Qi podman  
Name            : podman
Version         : 2.2.1-1
Description     : Tool and library for running OCI-based containers in pods
Architecture    : x86_64
URL             : https://github.com/containers/libpod
Licenses        : Apache
Groups          : None
Provides        : None
Depends On      : cni-plugins  conmon  containers-common  device-mapper  iptables  libseccomp  runc  slirp4netns
                  libsystemd  fuse-overlayfs  libgpgme.so=11-64
Optional Deps   : podman-docker: for Docker-compatible CLI
                  btrfs-progs: support btrfs backend devices [installed]
                  catatonit: --init flag support
                  crun: support for unified cgroupsv2 [installed]
Required By     : None
Optional For    : None
Conflicts With  : None
Replaces        : None
Installed Size  : 79.09 MiB
Packager        : Morten Linderud <[email protected]>
Build Date      : Tue 08 Dec 2020 10:48:23 PM CET
Install Date    : Mon 01 Feb 2021 02:29:12 PM CET
Install Reason  : Explicitly installed
Install Script  : No
Validated By    : Signature

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?

Yes

Additional environment details (AWS, VirtualBox, physical, etc.):

physical

@openshift-ci-robot openshift-ci-robot added the kind/bug label Feb 1, 2021
@mheon
Member

mheon commented Feb 1, 2021

@vrothberg This reminds me of #9096 but in the other direction. Likely related.

@justinkb
Author

justinkb commented Feb 1, 2021

That's interesting. I agree it's very likely the same issue. To clarify my original report: the hang at the prompt occurs when using podman with an interactive command; otherwise it just returns with truncated output, as the other bug report says. The hang is likely a result of the interactive-prompt part of podman not knowing what to do when the container process aborts abruptly. I can help debug; I'm pretty handy with git bisect. Give me some pointers on which component of podman is the likely culprit.

EDIT: this regressed some time after conmon 2.0.22 and no later than 2.0.24. When I roll back to 2.0.22, I no longer get the issue; with 2.0.24 I do. I couldn't try 2.0.23, as I immediately got an EOF error on that version.
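
For anyone who wants to narrow a range like this without marking each step by hand, here is a rough sketch of a predicate that could be driven by `git bisect run`, which treats exit code 0 as good, 125 as "skip this revision", and other codes from 1 to 127 as bad. Everything in it is an assumption for illustration: the function name `classify`, the timeout, and the rule that "non-zero exit with no output at all" means untestable (meant to cover cases like the immediate EOF error on 2.0.23). Building and installing conmon at each step is not shown.

```python
import subprocess

# Exit codes understood by `git bisect run`.
GOOD, BAD, SKIP = 0, 1, 125


def classify(cmd, expected_bytes, timeout_s=30.0):
    """Run `cmd` and map the outcome to a `git bisect run` exit code.

    `expected_bytes` is the output size seen on a known-good setup
    (hypothetical parameter; it has to be measured beforehand).
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # The command never finished: treat a hang as the bug.
        return BAD
    if proc.returncode != 0 and not proc.stdout:
        # Assumption: failing immediately with no output means this
        # revision cannot be tested at all, so ask bisect to skip it.
        return SKIP
    # Truncated output counts as bad, complete output as good.
    return GOOD if len(proc.stdout) >= expected_bytes else BAD


# Hypothetical use from a wrapper script passed to `git bisect run`:
#   sys.exit(classify(["podman", "run", "docker.io/archlinux",
#                      "pacman", "-Qi"], expected_bytes=...))
```

The expected byte count is deliberately left open in the usage comment; it depends on the image and has to come from a run against a working conmon.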

@mheon
Member

mheon commented Feb 1, 2021

Thanks a bunch - Conmon doesn't get much activity, so that should narrow things down to a small set of commits.

@haircommander PTAL

@justinkb
Author

justinkb commented Feb 1, 2021

paul at rho in ~/Developer/conmon (master)
$ git bisect start

paul at rho in ~/Developer/conmon (master)(bisect)
$ git bisect good v2.0.22

paul at rho in ~/Developer/conmon (master)(bisect)
$ git bisect bad v2.0.24 
Bisecting: 9 revisions left to test after this (roughly 3 steps)
[0f092d5446fdadb761b8143b5c4b1d1ac9793f29] conmon: store open FDs and close only them

paul at rho in ~/Developer/conmon (0f092d5)(bisect)
$ git bisect bad        
Bisecting: 5 revisions left to test after this (roughly 2 steps)
[ad1d0cfcc9efcbd324d7caab7607afbb707142c6] Merge pull request #223 from giuseppe/fix-write-hang

paul at rho in ~/Developer/conmon (ad1d0cf)(bisect)
$ git bisect bad
Bisecting: 1 revision left to test after this (roughly 1 step)
[c704d3a4cf98c9f180622ab16435424c542b38ac] Merge pull request #220 from haircommander/bump-2.0.22

paul at rho in ~/Developer/conmon (c704d3a)(bisect)
$ git bisect good        
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[6287bd884d9bf29e76ac877e0c7e6aad04bc24a4] conn_sock: make dest fd non blocking

paul at rho in ~/Developer/conmon (6287bd8)(bisect)
$ git bisect bad 
6287bd884d9bf29e76ac877e0c7e6aad04bc24a4 is the first bad commit
commit 6287bd884d9bf29e76ac877e0c7e6aad04bc24a4
Author: Giuseppe Scrivano <[email protected]>
Date:   Fri Dec 18 19:37:28 2020 +0100

    conn_sock: make dest fd non blocking
    
    we already have code in place to handle partial writes, just make sure
    the fd is not blocking so it doesn't hang on writes.
    
    Closes: https://github.com/containers/conmon/issues/204
    
    Signed-off-by: Giuseppe Scrivano <[email protected]>

 src/conn_sock.c | 2 ++
 1 file changed, 2 insertions(+)
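
The commit message mentions handling partial writes on a non-blocking fd. As a general illustration only (this is Python, not conmon's C code, and it is not a diagnosis of what conmon actually does wrong), the sketch below shows what a writer has to cope with once a descriptor is non-blocking: `write` may accept only part of the buffer, or fail with EAGAIN (surfaced in Python as `BlockingIOError`) when the reader has not caught up. If either case were not retried, the tail of the output would be lost. The names `write_all_nonblocking` and `demo` are made up for this example.

```python
import os


def write_all_nonblocking(fd, data, drain):
    """Write all of `data` to the non-blocking `fd`.

    `drain` is called whenever the fd is full (EAGAIN) so the reader
    side can make room. Returns (bytes_written, number_of_stalls).
    """
    view = memoryview(data)
    written = 0
    stalls = 0
    while written < len(view):
        try:
            # May write fewer bytes than requested (partial write).
            written += os.write(fd, view[written:])
        except BlockingIOError:
            # EAGAIN: nothing could be written right now; retry later.
            stalls += 1
            drain()
    return written, stalls


def demo():
    r, w = os.pipe()
    os.set_blocking(r, False)
    os.set_blocking(w, False)
    payload = b"x" * (1 << 20)  # 1 MiB, far more than a pipe buffer holds
    received = bytearray()

    def drain():
        # Read until the pipe is empty again.
        while True:
            try:
                chunk = os.read(r, 65536)
            except BlockingIOError:
                break
            if not chunk:
                break
            received.extend(chunk)

    written, stalls = write_all_nonblocking(w, payload, drain)
    os.close(w)
    # Collect whatever is still buffered; EOF arrives once `w` is closed.
    os.set_blocking(r, True)
    while True:
        chunk = os.read(r, 65536)
        if not chunk:
            break
        received.extend(chunk)
    os.close(r)
    return written, stalls, len(received)
```

Calling `demo()` returns the number of bytes written, how many times the writer hit a full pipe, and how many bytes the reader received; the first and last should match as long as every stall is retried.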

@vrothberg
Member

@giuseppe PTAL

Thanks everybody!

@giuseppe
Member

giuseppe commented Feb 2, 2021

containers/conmon#237

@giuseppe
Member

giuseppe commented Feb 2, 2021

closing as a duplicate of containers/conmon#236

@giuseppe giuseppe closed this as completed Feb 2, 2021
@github-actions github-actions bot added the locked - please file new issue/PR label Sep 22, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 22, 2023