
podman logs -f ends early #10596

Closed
Hendrik-H opened this issue Jun 8, 2021 · 7 comments · Fixed by #10600
Labels
In Progress This issue is actively being worked by the assignee, please do not work on this at this time. kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.

Comments

@Hendrik-H

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

A container traps SIGINT and SIGTERM, logs a few lines, waits a few seconds, logs more, and then exits. A parallel podman logs -f ... for that container does not capture all the logs and ends early when the container is stopped using podman stop. I believe I also saw a case where some output was missing when stopping with Ctrl-C, but I cannot reproduce that anymore.

Test Dockerfile:

FROM fedora:34
ADD script.sh /
ENTRYPOINT ["/script.sh"]

Test Script:

#!/bin/bash

sigIntHandler() {
    echo "received SIGINT"
    echo "going to exit soon"
    sleep 5
    echo "exiting now"
    exit 1
}

sigTermHandler() {
    echo "received SIGTERM"
    echo "going to exit soon"
    sleep 5
    echo "exiting now"
    exit 1
}

trap sigIntHandler  SIGINT
trap sigTermHandler SIGTERM
trap  # with no arguments: print the currently configured traps

echo "started"
for i in {1..30}
do
    sleep 1
done
echo "done"

Steps to reproduce the issue:

  1. build a test image using the above files: podman build -t=podman-test .
  2. start a test container: podman run -it --name test --rm podman-test
  3. follow the logs: podman logs -f test
  4. stop the container using podman stop --time 10 test
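The script's signal handling can also be observed without Podman; a minimal local sketch (shortened timings, illustrative /tmp paths, not from the report) shows the lines that should still appear after the stop signal:

```shell
# Minimal local reproduction of the signal handling (no Podman involved;
# file names and timings are illustrative).
cat > /tmp/trap-demo.sh <<'EOF'
#!/bin/bash
term() { echo "received SIGTERM"; echo "exiting now"; exit 1; }
trap term SIGTERM
echo "started"
for i in {1..30}; do sleep 1; done
EOF
chmod +x /tmp/trap-demo.sh

/tmp/trap-demo.sh > /tmp/trap-demo.log &
pid=$!
sleep 0.3             # let the script start
kill -TERM "$pid"     # podman stop sends SIGTERM first
wait "$pid" 2>/dev/null || true
cat /tmp/trap-demo.log   # all lines arrive, some only after the signal
```

Note that bash runs the trap handler only after the current sleep finishes, so the final lines show up with a delay — exactly the window in which podman logs -f was cutting out.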

Describe the results you received:
When using podman stop, the logs stop straight away in the podman logs -f terminal, and some lines that still appear in the podman run terminal are missing. If you restart podman logs -f while the container is in the stopping state, you cannot follow the logs anymore.

Describe the results you expected:
I expect podman logs -f to print all logs until the container has fully stopped, not just until it starts stopping.

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

Version:      3.1.2
API Version:  3.1.2
Go Version:   go1.15.11
Built:        Tue May 11 15:53:47 2021
OS/Arch:      linux/amd64

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.20.1
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.0.27-2.fc33.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.27, commit: '
  cpus: 12
  distribution:
    distribution: fedora
    version: "33"
  eventLogger: journald
  hostname: p52
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.12.8-200.fc33.x86_64
  linkmode: dynamic
  memFree: 1090428928
  memTotal: 33331945472
  ociRuntime:
    name: crun
    package: crun-0.19.1-3.fc33.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 0.19.1
      commit: 1535fedf0b83fb898d449f9680000f729ba719f5
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    selinuxEnabled: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.9-1.fc33.x86_64
    version: |-
      slirp4netns version 1.1.9
      commit: 4e37ea557562e0d7a64dc636eff156f64927335e
      libslirp: 4.3.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.0
  swapFree: 18462298112
  swapTotal: 21063786496
  uptime: 161h 25m 58.52s (Approximately 6.71 days)
registries:
  9.152.170.156:5000:
    Blocked: false
    Insecure: true
    Location: 9.152.170.156:5000
    MirrorByDigestOnly: false
    Mirrors: []
    Prefix: 9.152.170.156:5000
  127.0.0.1:5000:
    Blocked: false
    Insecure: true
    Location: 127.0.0.1:5000
    MirrorByDigestOnly: false
    Mirrors: []
    Prefix: 127.0.0.1:5000
  localhost:5000:
    Blocked: false
    Insecure: true
    Location: localhost:5000
    MirrorByDigestOnly: false
    Mirrors: []
    Prefix: localhost:5000
  localhost:5555:
    Blocked: false
    Insecure: true
    Location: localhost:5555
    MirrorByDigestOnly: false
    Mirrors: []
    Prefix: localhost:5555
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - registry.centos.org
  - docker.io
store:
  configFile: /home/haddorp/.config/containers/storage.conf
  containerStore:
    number: 1
    paused: 0
    running: 0
    stopped: 1
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-1.5.0-1.fc33.x86_64
      Version: |-
        fusermount3 version: 3.9.3
        fuse-overlayfs: version 1.5
        FUSE library version 3.9.3
        using FUSE kernel interface version 7.31
  graphRoot: /home/haddorp/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 121
  runRoot: /run/user/1000/containers
  volumePath: /home/haddorp/.local/share/containers/storage/volumes
version:
  APIVersion: 3.1.2
  Built: 1620741227
  BuiltTime: Tue May 11 15:53:47 2021
  GitCommit: ""
  GoVersion: go1.15.11
  OsArch: linux/amd64
  Version: 3.1.2

Package info (e.g. output of rpm -q podman or apt list podman):

podman-3.1.2-2.fc33.x86_64

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/master/troubleshooting.md)

No

Additional environment details (AWS, VirtualBox, physical, etc.):
laptop

@openshift-ci openshift-ci bot added the kind/bug Categorizes issue or PR as related to a bug. label Jun 8, 2021
@vrothberg
Member

Thanks for opening the issue, @Hendrik-H!

Can you run podman inspect --format "{{.HostConfig.LogConfig}}" on a local container? I want to check which log driver is currently used on your machine.

@Hendrik-H
Author

{k8s-file map[] /home/haddorp/.local/share/containers/storage/overlay-containers/bbef0632a6ce0a05c58bea3367907d8a13bf05c6a8e6701b712a20cd0b06f00e/userdata/ctr.log  0B}

@vrothberg vrothberg self-assigned this Jun 8, 2021
@vrothberg vrothberg added the In Progress This issue is actively being worked by the assignee, please do not work on this at this time. label Jun 8, 2021
@vrothberg
Member

I opened #10600 to fix the issue.

vrothberg added a commit to vrothberg/libpod that referenced this issue Jun 8, 2021
Fix a race in the k8s-file logs driver.  When "following" the logs,
Podman will print the container's logs until the end.  Previously,
Podman logged until the state transitioned into something non-running
which opened up a race with the container still running, possibly in
the "stopping" state.

To fix the race, log until we've seen the wait event for the specific
container.  In that case, conmon will have finished writing all logs to
the file, and Podman will read it until EOF.

Further tweak the integration tests for testing `logs -f` on a  running
container.  Previously, the test only checked for one of two lines
stating that there was a race.  Indeed the race was in using `run --rm`
where a log file may be removed before we could fully read it.

Fixes: containers#10596
Signed-off-by: Valentin Rothberg <[email protected]>
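The idea of the fix can be simulated in plain shell (a toy model, no Podman APIs involved): the reader blocks until a "wait event" marker appears, and only then drains the log to EOF, so nothing written during the stopping phase is lost:

```shell
# Toy model of the fix: instead of stopping when the writer enters its
# shutdown phase, the reader waits for the "container exited" event and
# then reads the log file to EOF. File names are illustrative.
log=$(mktemp)
event=$(mktemp -u)   # existence of this file = the "wait event"

# Writer: logs, "receives the stop", logs more, then signals the event.
(
  echo "started"          >> "$log"
  echo "received SIGTERM" >> "$log"
  sleep 0.2                        # the stopping phase
  echo "exiting now"      >> "$log"
  : > "$event"                     # conmon's part: everything is flushed
) &

while [ ! -e "$event" ]; do sleep 0.05; done   # block on the wait event
cat "$log"                                     # now EOF really is the end
wait
```

A reader that instead stopped as soon as it saw "received SIGTERM" (the old state-based check) would miss the "exiting now" line written during the stopping phase.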
@vrothberg
Member

@Hendrik-H, note that there's an inherent race when using --rm and logs -f.

AFAICS, the log file may get removed before we can fully read it. @mheon, do you think it's a sad fact of life, or shall we tackle it one way or another?

@Hendrik-H
Author

@vrothberg thanks for the tip and the quick fix

@mheon
Member

mheon commented Jun 8, 2021

@vrothberg It might get removed, but we'll still have an open file descriptor, no?

I don't think we can do this another way: the log file lives in the c/storage directories, and if we don't remove it, removing the storage container will fail.
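The point about the open descriptor is easy to check directly: data stays readable through a descriptor even after the file is unlinked (illustrative names, nothing Podman-specific):

```shell
# An already-open fd keeps working after unlink: this is why a follower
# could in principle finish reading a log file that --rm has deleted.
tmp=$(mktemp)
echo "log line from conmon" > "$tmp"
exec 3< "$tmp"         # open for reading, like the log follower would
rm "$tmp"              # simulate container removal deleting the file
IFS= read -r line <&3  # the data is still reachable via the open fd
exec 3<&-
echo "$line"
```

The kernel only frees the inode once the last descriptor is closed, so the race is about *opening* the file in time, not about reading it afterwards.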

@vrothberg
Member

@vrothberg It might get removed, but we'll still have an open file descriptor, no?

Probably. I'd have to dig deeper into how things like inotify might be used. One odd thing is that journald also flakes with --rm.

mheon pushed a commit to mheon/libpod that referenced this issue Jun 11, 2021
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 21, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 21, 2023