Get reset connection packets for rootless pod after stop/start #7520

Closed
rei-ber opened this issue Sep 1, 2020 · 2 comments
Labels: kind/bug, locked - please file new issue/PR

Comments


rei-ber commented Sep 1, 2020

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

podman pod start does not work properly after podman pod stop when running rootless. If I run a server in a pod and connect to it, everything works after creating and starting the pod. If I then stop the pod and start it again, I observe a connection reset, as #7016 already documented. If I do a podman pod restart instead, it works fine.

Steps to reproduce the issue:

  1. Create a pod, a container with a server inside, and forward the port:
podman pod create -n alp_pod -p 8080:80
podman run -d --pod=alp_pod --name alp nginx:alpine
  2. Curl it and run tshark in another shell:
curl localhost:8080
tshark -i lo

tshark prints this:

Capturing on 'Loopback'
    1 0.000000000          ::1 → ::1          TCP 94 54066 → 8080 [SYN] Seq=0 Win=65476 Len=0 MSS=65476 SACK_PERM=1 TSval=3127606581 TSecr=0 WS=128                                                              
    2 0.000015487          ::1 → ::1          TCP 94 8080 → 54066 [SYN, ACK] Seq=0 Ack=1 Win=65464 Len=0 MSS=65476 SACK_PERM=1 TSval=3127606581 TSecr=3127606581 WS=128                                          
    3 0.000026282          ::1 → ::1          TCP 86 54066 → 8080 [ACK] Seq=1 Ack=1 Win=65536 Len=0 TSval=3127606581 TSecr=3127606581                                                                            
    4 0.000528297          ::1 → ::1          HTTP 164 GET / HTTP/1.1
    5 0.000534342          ::1 → ::1          TCP 86 8080 → 54066 [ACK] Seq=1 Ack=79 Win=65408 Len=0 TSval=3127606581 TSecr=3127606581                                                                           
    6 0.000815350          ::1 → ::1          HTTP 936 HTTP/1.1 200 OK  (text/html)
    7 0.000831017          ::1 → ::1          TCP 86 54066 → 8080 [ACK] Seq=79 Ack=851 Win=64768 Len=0 TSval=3127606581 TSecr=3127606581                                                                         
    8 0.002530781          ::1 → ::1          TCP 86 54066 → 8080 [FIN, ACK] Seq=79 Ack=851 Win=65536 Len=0 TSval=3127606583 TSecr=3127606581                                                                    
    9 0.002687103          ::1 → ::1          TCP 86 8080 → 54066 [FIN, ACK] Seq=851 Ack=80 Win=65536 Len=0 TSval=3127606583 TSecr=3127606583                                                                    
   10 0.002710048          ::1 → ::1          TCP 86 54066 → 8080 [ACK] Seq=80 Ack=852 Win=65536 Len=0 TSval=3127606583 TSecr=3127606583  

So everything is fine.

  3. Stop and start the pod:
podman pod stop alp_pod 
podman pod start alp_pod 
  4. Run curl and tshark again:
curl localhost:8080
tshark -i lo

curl now fails with curl: (7) Failed to connect to localhost port 8080: Connection refused, and tshark shows this:

Capturing on 'Loopback'
    1 0.000000000          ::1 → ::1          TCP 94 54090 → 8080 [SYN] Seq=0 Win=65476 Len=0 MSS=65476 SACK_PERM=1 TSval=3127647051 TSecr=0 WS=128                                                              
    2 0.000012028          ::1 → ::1          TCP 74 8080 → 54090 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
    3 0.000159759    127.0.0.1 → 127.0.0.1    TCP 74 43022 → 8080 [SYN] Seq=0 Win=65495 Len=0 MSS=65495 SACK_PERM=1 TSval=3616380762 TSecr=0 WS=128                                                              
    4 0.000169542    127.0.0.1 → 127.0.0.1    TCP 54 8080 → 43022 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
  5. As I said, a podman pod restart alp_pod fixes it, but only if the pod is currently running. If the pod is stopped and I do a restart, it shows the same behaviour. (The full command sequence is collected below.)
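
For convenience, here is the whole sequence as one shell session. These are just the commands from the steps above collected in order; the inline comments are mine:

# Step 1: create the pod and a container with a server, publishing port 8080 -> 80
podman pod create -n alp_pod -p 8080:80
podman run -d --pod=alp_pod --name alp nginx:alpine

# Step 2: the first request succeeds (normal three-way handshake on lo)
curl localhost:8080

# Steps 3 and 4: after a stop/start cycle the same request is refused
podman pod stop alp_pod
podman pod start alp_pod
curl localhost:8080    # curl: (7) Failed to connect to localhost port 8080: Connection refused

# Step 5: restarting the (now running) pod makes the port forwarding work again
podman pod restart alp_pod
curl localhost:8080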

Describe the results you received:
After the stop/start cycle, connections to the forwarded port are refused: the SYN on the loopback interface is answered with a RST/ACK packet.

Describe the results you expected:
I expect a three-way handshake followed by the data exchange, as in the first capture.

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

Version:      2.0.4
API Version:  1
Go Version:   go1.14
Built:        Thu Jan  1 01:00:00 1970
OS/Arch:      linux/amd64

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.15.0
  cgroupVersion: v1
  conmon:
    package: 'conmon: /usr/libexec/podman/conmon'
    path: /usr/libexec/podman/conmon
    version: 'conmon version 2.0.20, commit: '
  cpus: 1
  distribution:
    distribution: debian
    version: "10"
  eventLogger: file
  hostname: hostname
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 4.19.0-10-amd64
  linkmode: dynamic
  memFree: 1084080128
  memTotal: 2078969856
  ociRuntime:
    name: runc
    package: 'runc: /usr/sbin/runc'
    path: /usr/sbin/runc
    version: |-
      runc version 1.0.0~rc6+dfsg1
      commit: 1.0.0~rc6+dfsg1-3
      spec: 1.0.1
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  rootless: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: 'slirp4netns: /usr/bin/slirp4netns'
    version: |-
      slirp4netns version 1.1.4
      commit: unknown
      libslirp: 4.3.1-git
      SLIRP_CONFIG_VERSION_MAX: 3
  swapFree: 2130702336
  swapTotal: 2130702336
  uptime: 23m 50.98s
registries:
  search:
  - docker.io
  - quay.io
store:
  configFile: /home/rootlessuser/.config/containers/storage.conf
  containerStore:
    number: 2
    paused: 0
    running: 2
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: 'fuse-overlayfs: /usr/bin/fuse-overlayfs'
      Version: |-
        fusermount3 version: 3.4.1
        fuse-overlayfs: version 1.1.0
        FUSE library version 3.4.1
        using FUSE kernel interface version 7.27
  graphRoot: /home/rootlessuser/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 18
  runRoot: /run/user/1000/containers
  volumePath: /home/rootlessuser/.local/share/containers/storage/volumes
version:
  APIVersion: 1
  Built: 0
  BuiltTime: Thu Jan  1 01:00:00 1970
  GitCommit: ""
  GoVersion: go1.14
  OsArch: linux/amd64
  Version: 2.0.4

Package info (e.g. output of rpm -q podman or apt list podman):

I already tried to upgrade to 2.0.5, but #7508 describes why that is currently held back. Note that I saw this behaviour before I ran apt upgrade, which upgraded all of podman's dependencies such as slirp4netns. I also read the release notes for 2.0.5, and it does not look like they would fix this.

Listing... Done
podman/unknown 2.0.5~2 amd64 [upgradable from: 2.0.4~1]
podman/unknown 2.0.5~2 arm64
podman/unknown 2.0.5~2 armhf
podman/unknown 2.0.5~2 ppc64el

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?

No: see above.
Yes: I checked the guide.

Additional environment details (AWS, VirtualBox, physical, etc.):
VirtualBox Version 6.1.97 r139689 on OpenSuse 15.1

Additional logfiles:
I ran the above commands with --log-level debug and here are the results:

  1. If I run the pod first time, it works: run.log
  2. If I do a stop and afterwards a start, it breaks: start.log
  3. If I do a restart (while the pod is running), it works: restart.log
    One thing I notice in the log files is that run and restart create a network namespace. You can see it in this line: Made network namespace at /run/user/1000/netns/cni-ec4f60a1-18b5-0696-a823-b7dd42111398 for container 6fb030b00ce960001d6cdf5210a75e13d5ae0680970578574067f654eb27dc3b. The start.log does not contain this line. Maybe it helps :)
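
If it helps with triage: a quick way to compare the two cases would be to list that directory after each command. I did not capture this above, so treat it as a suggestion; the path is the one from the log line:

ls /run/user/1000/netns/
# I would expect to see a cni-<id> entry here after 'podman run' or 'podman pod restart',
# but not after 'podman pod stop' followed by 'podman pod start', matching the missing
# "Made network namespace at ..." line in start.log (not verified)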
openshift-ci-robot added the kind/bug label Sep 1, 2020
Luap99 (Member) commented Sep 1, 2020

2.0.5 will fix this.

rei-ber (Author) commented Sep 2, 2020

2.0.6 fixes it for me; thanks =)

rei-ber closed this as completed Sep 2, 2020
github-actions bot added the locked - please file new issue/PR label Sep 22, 2023
github-actions bot locked as resolved and limited conversation to collaborators Sep 22, 2023