podman run starts the main process in the container before the slirp4netns ipv6 network is setup #11062

Closed
Hendrik-H opened this issue Jul 27, 2021 · 7 comments · Fixed by #12098
Labels: kind/bug, locked - please file new issue/PR, slirp4netns

Comments

@Hendrik-H

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

If you start a rootless container with slirp4netns:enable_ipv6=true, the IPv6 network does not work immediately. It only becomes usable after a short delay of about 1-2 seconds.

Steps to reproduce the issue:

  1. podman run -it --network slirp4netns:enable_ipv6=true --entrypoint bash fedora:34 -c "curl -v -6 https://www.google.com"

Describe the results you received:
The command fails with Network is unreachable.

Describe the results you expected:
That the network works straight away.

Additional information you deem important (e.g. issue happens only occasionally):
The problem does not occur when using IPv4 (-4 instead of -6).
The issue can be seen with podman 3.1.2 and 2.2.1.

Output of podman version:

Version:      3.1.2
API Version:  3.1.2
Go Version:   go1.15.11
Built:        Tue May 11 15:53:47 2021
OS/Arch:      linux/amd64

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.20.1
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.0.27-2.fc33.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.27, commit: '
  cpus: 12
  distribution:
    distribution: fedora
    version: "33"
  eventLogger: journald
  hostname: p52
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.12.8-200.fc33.x86_64
  linkmode: dynamic
  memFree: 1097736192
  memTotal: 33331945472
  ociRuntime:
    name: crun
    package: crun-0.19.1-3.fc33.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 0.19.1
      commit: 1535fedf0b83fb898d449f9680000f729ba719f5
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    selinuxEnabled: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.9-1.fc33.x86_64
    version: |-
      slirp4netns version 1.1.9
      commit: 4e37ea557562e0d7a64dc636eff156f64927335e
      libslirp: 4.3.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.0
  swapFree: 14121390080
  swapTotal: 21063786496
  uptime: 1342h 4m 54.33s (Approximately 55.92 days)
registries:
  9.152.170.156:5000:
    Blocked: false
    Insecure: true
    Location: 9.152.170.156:5000
    MirrorByDigestOnly: false
    Mirrors: []
    Prefix: 9.152.170.156:5000
  127.0.0.1:5000:
    Blocked: false
    Insecure: true
    Location: 127.0.0.1:5000
    MirrorByDigestOnly: false
    Mirrors: []
    Prefix: 127.0.0.1:5000
  localhost:5000:
    Blocked: false
    Insecure: true
    Location: localhost:5000
    MirrorByDigestOnly: false
    Mirrors: []
    Prefix: localhost:5000
  localhost:5555:
    Blocked: false
    Insecure: true
    Location: localhost:5555
    MirrorByDigestOnly: false
    Mirrors: []
    Prefix: localhost:5555
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - registry.centos.org
  - docker.io
store:
  configFile: /home/haddorp/.config/containers/storage.conf
  containerStore:
    number: 10
    paused: 0
    running: 1
    stopped: 9
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-1.5.0-1.fc33.x86_64
      Version: |-
        fusermount3 version: 3.9.3
        fuse-overlayfs: version 1.5
        FUSE library version 3.9.3
        using FUSE kernel interface version 7.31
  graphRoot: /home/haddorp/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 74
  runRoot: /run/user/1000/containers
  volumePath: /home/haddorp/.local/share/containers/storage/volumes
version:
  APIVersion: 3.1.2
  Built: 1620741227
  BuiltTime: Tue May 11 15:53:47 2021
  GitCommit: ""
  GoVersion: go1.15.11
  OsArch: linux/amd64
  Version: 3.1.2

Package info (e.g. output of rpm -q podman or apt list podman):

podman-3.1.2-2.fc33.x86_64

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/master/troubleshooting.md)

@Luap99 reproduced this with v3.2.3 with slirp4netns v1.1.8+dev

Additional environment details (AWS, VirtualBox, physical, etc.):

@openshift-ci openshift-ci bot added the kind/bug label Jul 27, 2021
@Luap99
Member

Luap99 commented Jul 27, 2021

OK, this only affects slirp4netns with IPv6. Podman already waits for slirp4netns to report that it is ready. Therefore I assume slirp4netns does not wait for the IPv6 setup to finish before it writes to the readyfd.
Running podman with this patch confirms this:

diff --git a/libpod/networking_slirp4netns.go b/libpod/networking_slirp4netns.go
index 410b377ec..ee717d6d4 100644
--- a/libpod/networking_slirp4netns.go
+++ b/libpod/networking_slirp4netns.go
@@ -16,6 +16,7 @@ import (
        "syscall"
        "time"
 
+       "github.com/containernetworking/plugins/pkg/ns"
        "github.com/containers/podman/v3/pkg/errorhandling"
        "github.com/containers/podman/v3/pkg/rootlessport"
        "github.com/containers/podman/v3/pkg/servicereaper"
@@ -310,6 +311,13 @@ func (r *Runtime) setupSlirp4netns(ctr *Container) error {
                return err
        }
 
+       ctr.state.NetNS.Do(func(nn ns.NetNS) error {
+               cmd2 := exec.Command("ip", "addr")
+               cmd2.Stdout = os.Stdout
+               cmd2.Stderr = os.Stderr
+               return cmd2.Run()
+       })
+
        // Set a default slirp subnet. Parsing a string with the net helper is easier than building the struct myself
        _, ctr.slirp4netnsSubnet, _ = net.ParseCIDR(defaultSlirp4netnsSubnet)

This will not work:
podman run --rm --network slirp4netns:enable_ipv6=true alpine ip addr
Waiting two seconds works:
podman run --rm --network slirp4netns:enable_ipv6=true alpine sh -c "sleep 2 && ip addr"

@AkihiroSuda PTAL Any ideas why slirp4netns does not wait until ipv6 setup is done?
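
For readers unfamiliar with the ready-fd handshake referred to in this comment, here is a minimal sketch of the waiting side. It assumes the child writes a single byte to a pipe once it considers the network configured, which is how slirp4netns's --ready-fd option is generally understood to behave. The names waitReady and simulateHandshake are hypothetical and are not Podman's actual code; a goroutine stands in for the child process.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// waitReady blocks until one byte arrives on the read end of the ready pipe,
// or until the timeout expires.
func waitReady(r *os.File, timeout time.Duration) error {
	if err := r.SetReadDeadline(time.Now().Add(timeout)); err != nil {
		return fmt.Errorf("set read deadline: %w", err)
	}
	buf := make([]byte, 1)
	if _, err := r.Read(buf); err != nil {
		return fmt.Errorf("waiting for ready byte: %w", err)
	}
	return nil
}

// simulateHandshake stands in for the child process (hypothetical helper):
// a goroutine writes the ready byte after writeAfterMs milliseconds, or never
// if writeAfterMs is negative. In real use the write end would be passed to
// the child, for example through exec.Cmd.ExtraFiles.
func simulateHandshake(writeAfterMs, timeoutMs int) error {
	r, w, err := os.Pipe()
	if err != nil {
		return err
	}
	defer r.Close()
	defer w.Close()
	if writeAfterMs >= 0 {
		go func() {
			time.Sleep(time.Duration(writeAfterMs) * time.Millisecond)
			w.Write([]byte("1")) // error ignored: the pipe may already be closed
		}()
	}
	return waitReady(r, time.Duration(timeoutMs)*time.Millisecond)
}

func main() {
	if err := simulateHandshake(10, 1000); err != nil {
		fmt.Println("not ready:", err)
		return
	}
	fmt.Println("ready byte received")
}
```

Note that the byte only signals what the writer chooses to wait for. If the writer does not wait for IPv6 address setup, the reader proceeds while the address may still be unusable, which matches the behaviour observed here.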

@Luap99 Luap99 changed the title from "podman run starts the main process in the container before the network is setup" to "podman run starts the main process in the container before the slirp4netns ipv6 network is setup" Jul 27, 2021
@Luap99 Luap99 added the slirp4netns label Jul 27, 2021
@AkihiroSuda
Collaborator

AkihiroSuda commented Aug 26, 2021

> Any ideas why slirp4netns does not wait until ipv6 setup is done?

Because slirp4netns itself doesn't configure IPv6 (even when --configure is set)

https://github.com/rootless-containers/slirp4netns/blob/631f361d196b3d2bfb9bd903ba3a335f622f87d1/main.c#L171-L205

The actual setup might be done by the kernel? How can slirp4netns (or Podman) wait for the completion of the setup event?

@Luap99
Member

Luap99 commented Aug 26, 2021

I see, it looks like the delay is caused by IPv6 Duplicate Address Detection (DAD). I think we should turn this off in the namespace, since it is safe to assume that there cannot be an address conflict in the ns, e.g. sysctl -w net.ipv6.conf.all.accept_dad=0.
@AkihiroSuda Do you think turning this off is a good idea and if so, should this be done in slirp4netns or podman?
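
One way to observe the DAD delay is to look for addresses that the kernel still marks as tentative. The sketch below parses text in the format of /proc/net/if_inet6 and reports entries whose flags include IFA_F_TENTATIVE (0x40 in linux/if_addr.h). The function name tentativeAddrs is hypothetical, and this is only an illustration of the check, not something Podman or slirp4netns is known to do.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// ifaFTentative mirrors IFA_F_TENTATIVE from <linux/if_addr.h>.
const ifaFTentative = 0x40

// tentativeAddrs takes the contents of /proc/net/if_inet6 and returns one
// "device address" string per entry that is still undergoing DAD.
// Each line has six fields: address, ifindex, prefix length, scope, flags,
// device name. All numeric fields are hexadecimal.
func tentativeAddrs(ifInet6 string) ([]string, error) {
	var pending []string
	sc := bufio.NewScanner(strings.NewReader(ifInet6))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) == 0 {
			continue // skip blank lines
		}
		if len(fields) != 6 {
			return nil, fmt.Errorf("unexpected line %q", sc.Text())
		}
		flags, err := strconv.ParseUint(fields[4], 16, 32)
		if err != nil {
			return nil, fmt.Errorf("parse flags in %q: %w", sc.Text(), err)
		}
		if flags&ifaFTentative != 0 {
			pending = append(pending, fields[5]+" "+fields[0])
		}
	}
	return pending, sc.Err()
}

func main() {
	data, err := os.ReadFile("/proc/net/if_inet6")
	if err != nil {
		fmt.Println("cannot read IPv6 address table:", err)
		return
	}
	pending, err := tentativeAddrs(string(data))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	if len(pending) == 0 {
		fmt.Println("no tentative IPv6 addresses")
		return
	}
	for _, p := range pending {
		fmt.Println("still tentative:", p)
	}
}
```

Run inside the container's network namespace right after start, a check like this would be expected to list the tap interface's address until DAD completes.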

@vrothberg
Member

friendly ping

@AkihiroSuda
Collaborator

AkihiroSuda commented Sep 23, 2021

> I see, it looks like the delay is caused by IPv6 Duplicate Address Detection (DAD). I think we should turn this off in the namespace, since it is safe to assume that there cannot be an address conflict in the ns, e.g. sysctl -w net.ipv6.conf.all.accept_dad=0.
> @AkihiroSuda Do you think turning this off is a good idea and if so, should this be done in slirp4netns or podman?

SGTM, perhaps it should be done on Podman side.

@github-actions

A friendly reminder that this issue had no activity for 30 days.

rhatdan added a commit to rhatdan/common that referenced this issue Oct 25, 2021
rhatdan added a commit to rhatdan/common that referenced this issue Oct 25, 2021
@rhatdan
Member

rhatdan commented Oct 25, 2021

Once this is vendored into Podman and the updated containers.conf is released to the field, this will be fixed.

@rhatdan rhatdan self-assigned this Oct 25, 2021
rhatdan added a commit to rhatdan/podman that referenced this issue Oct 25, 2021
Luap99 added a commit to Luap99/libpod that referenced this issue Oct 26, 2021
Duplicate Address Detection slows the ipv6 setup down for 1-2 seconds.
Since slirp4netns is run in its own namespace and not directly routed
we can skip this to make the ipv6 address immediately available.
We change the default to make sure the slirp tap interface gets the
correct value assigned so DAD is disabled for it.
Also make sure to change this value back to the original after slirp4netns
is ready in case users rely on this sysctl.

Fixes containers#11062

Signed-off-by: Paul Holzinger <[email protected]>
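
The save, change, and restore pattern described in the commit message above can be sketched as follows. This is not the code from the actual fix: withTemporaryValue and demo are hypothetical names, and the demo operates on a temporary file rather than on the real sysctl, which is assumed here to live at /proc/sys/net/ipv6/conf/default/accept_dad inside the container's network namespace.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// withTemporaryValue writes value to the file at path, runs fn, and then
// writes the original contents back, even if fn returns an error.
func withTemporaryValue(path, value string, fn func() error) (err error) {
	orig, err := os.ReadFile(path)
	if err != nil {
		return fmt.Errorf("read %s: %w", path, err)
	}
	if err := os.WriteFile(path, []byte(value), 0o644); err != nil {
		return fmt.Errorf("write %s: %w", path, err)
	}
	defer func() {
		// Restore the saved value; report a restore failure only if fn succeeded.
		if rerr := os.WriteFile(path, orig, 0o644); rerr != nil && err == nil {
			err = fmt.Errorf("restore %s: %w", path, rerr)
		}
	}()
	return fn()
}

// demo exercises the helper on a scratch file that plays the role of the
// accept_dad sysctl. It returns the value seen while fn runs and the value
// seen afterwards.
func demo() (during, after string, err error) {
	dir, err := os.MkdirTemp("", "accept-dad-demo")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "accept_dad")
	if err := os.WriteFile(path, []byte("1\n"), 0o644); err != nil {
		return "", "", err
	}
	err = withTemporaryValue(path, "0", func() error {
		// In the real scenario, slirp4netns would be started and awaited here.
		b, rerr := os.ReadFile(path)
		during = strings.TrimSpace(string(b))
		return rerr
	})
	if err != nil {
		return "", "", err
	}
	b, err := os.ReadFile(path)
	if err != nil {
		return "", "", err
	}
	return during, strings.TrimSpace(string(b)), nil
}

func main() {
	during, after, err := demo()
	if err != nil {
		fmt.Println("demo failed:", err)
		return
	}
	fmt.Println("value while setting up:", during)
	fmt.Println("value after restore:", after)
}
```

Restoring in a deferred function keeps the original value in place for users who rely on it, including when the setup step fails.
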
mheon pushed a commit to mheon/libpod that referenced this issue Nov 12, 2021
@github-actions github-actions bot added the locked - please file new issue/PR label Sep 21, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 21, 2023