Rootless Podman networking not working on boot until after removing user rootless-netns folder and restarting service #22637

Closed
flacks opened this issue May 8, 2024 · 5 comments
Labels
kind/bug: Categorizes issue or PR as related to a bug.
locked - please file new issue/PR: Assist humans wanting to comment on an old issue or PR with locked comments.
network: Networking related issue or feature


flacks commented May 8, 2024

Issue Description

On Fedora 40 server using Podman 5, I have a few rootless quadlets set to run at boot. I create individual networks for different services so that services can communicate with their dependencies (e.g. WordPress and a MariaDB instance). The different containers and networks are symlinked in ~/.config/containers/systemd/, and my container user has Linger=yes set, so they run at boot.

An example network looks like this:

wordpress.network

[Unit]
Description=Podman Network for WordPress

[Network]
DisableDNS=false
Internal=false

While the pods start up without issue, DNS resolution does not work (as evidenced by podman exec -u root wordpress nslookup google.com) until I remove /run/user/1000/containers/networks/rootless-netns/ and restart the service.
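The manual recovery described above could be scripted roughly as follows. This is only a sketch: the function names, the DRY_RUN switch, and the example unit name are hypothetical, and the directory layout is taken from the path quoted in this report. It defaults to printing the commands rather than running them.

```shell
#!/bin/sh
# Sketch of the manual workaround from the report. Function names and the
# DRY_RUN variable are illustrative assumptions, not part of Podman.

# Print the rootless-netns directory for a given UID
# (layout taken from the path quoted in the report).
rootless_netns_dir() {
    printf '/run/user/%s/containers/networks/rootless-netns\n' "$1"
}

# Remove the rootless-netns directory of the current user and restart the
# given user unit. With DRY_RUN=1 (the default) the commands are only printed.
recover_networking() {
    unit="$1"
    dir="$(rootless_netns_dir "$(id -u)")"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf 'rm -rf %s\n' "$dir"
        printf 'systemctl --user restart %s\n' "$unit"
    else
        rm -rf "$dir"
        systemctl --user restart "$unit"
    fi
}
```

For example, `recover_networking wordpress.service` prints the two commands, and `DRY_RUN=0 recover_networking wordpress.service` would actually run them (the unit name here is a placeholder for whichever quadlet service is affected).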

podman version:

Client:       Podman Engine
Version:      5.0.2
API Version:  5.0.2
Go Version:   go1.22.2
Built:        Sun Apr 21 20:00:00 2024
OS/Arch:      linux/amd64

rpm -q podman: podman-5.0.2-1.fc40.x86_64

Steps to reproduce the issue

  1. Create rootless quadlet with non-internal network
  2. Enable it at boot, restart
  3. Run podman exec -u root [service] nslookup google.com
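Step 3 can be wrapped in a small check that is easy to rerun after each boot. This is a rough sketch, assuming nslookup is available inside the container image; the function name and the PODMAN override variable are hypothetical and exist only to make the check easy to try out.

```shell
#!/bin/sh
# Sketch of the DNS check from step 3. The function name and the PODMAN
# override are illustrative assumptions.

# Run nslookup inside the named container as root and report the result.
# Returns 0 when the lookup succeeds, 1 otherwise.
check_container_dns() {
    container="$1"
    if "${PODMAN:-podman}" exec -u root "$container" nslookup google.com >/dev/null 2>&1; then
        echo "dns ok: $container"
    else
        echo "dns FAILED: $container"
        return 1
    fi
}
```

Calling `check_container_dns wordpress` right after a reboot would be expected to report a failure on an affected system, per the results described in this report.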

Describe the results you received

No DNS resolution

Describe the results you expected

DNS records successfully returned

podman info output

host:
  arch: amd64
  buildahVersion: 1.35.3
  cgroupControllers:
  - cpu
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.10-1.fc40.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.10, commit: '
  cpuUtilization:
    idlePercent: 95.64
    systemPercent: 1.65
    userPercent: 2.7
  cpus: 12
  databaseBackend: boltdb
  distribution:
    distribution: fedora
    variant: server
    version: "40"
  eventLogger: journald
  freeLocks: 2021
  hostname: cafo
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 6.8.8-300.fc40.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 52144754688
  memTotal: 67282534400
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.10.0-1.fc40.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.10.0
    package: netavark-1.10.3-3.fc40.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.10.3
  ociRuntime:
    name: crun
    package: crun-1.14.4-1.fc40.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.14.4
      commit: a220ca661ce078f2c37b38c92e66cf66c012d9c1
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20240426.gd03c4e2-1.fc40.x86_64
    version: |
      pasta 0^20240426.gd03c4e2-1.fc40.x86_64
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: ""
    package: ""
    version: ""
  swapFree: 33640935424
  swapTotal: 33640935424
  uptime: 2h 19m 22.00s (Approximately 0.08 days)
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /home/jean/.config/containers/storage.conf
  containerStore:
    number: 15
    paused: 0
    running: 11
    stopped: 4
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/jean/.local/share/containers/storage
  graphRootAllocated: 997991694336
  graphRootUsed: 189054328832
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 13
  runRoot: /run/user/1000/containers
  transientStore: false
  volumePath: /home/jean/.local/share/containers/storage/volumes
version:
  APIVersion: 5.0.2
  Built: 1713744000
  BuiltTime: Sun Apr 21 20:00:00 2024
  GitCommit: ""
  GoVersion: go1.22.2
  Os: linux
  OsArch: linux/amd64
  Version: 5.0.2

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes

Additional environment details

Commodity x86 machine. I use systemd-networkd with systemd-networkd-wait-online enabled.


flacks added the kind/bug label on May 8, 2024
Member

Luap99 commented May 8, 2024

Is only DNS not working, or is there no networking at all? Do you have an aardvark-dns process running?
Please provide the full logs of the unit(s).

Luap99 added the network label on May 8, 2024
Author

flacks commented May 8, 2024

Correct, no networking until removal of /run/user/1000/containers/networks/rootless-netns. Amending title.

❯ podman exec -u root caddy ping -c 1 9.9.9.9
PING 9.9.9.9 (9.9.9.9): 56 data bytes
^C

aardvark-dns is running:

❯ procs | rg aardva
 2901  jean            │        0.0 0.0 00:00:00 │ /usr/libexec/podman/aardvark-dns --config /run/user/1000/containers/networks/aardvark-dns -p 53 run

Unit logs don't reveal much apart from connection errors to external networks.

May 08 08:44:08 cafo systemd[2229]: Starting caddy.service - Caddy Quadlet...
May 08 08:44:08 cafo podman[2623]: 2024-05-08 08:44:08.58275941 -0400 EDT m=+0.157671920 image pull 33797e62aca553ceb7712eb6d77a8c926f144b8508e5f88d8654226c990a4a4d localhost/caddy-cloudflare:latest
May 08 08:44:08 cafo podman[2623]: 2024-05-08 08:44:08.705054997 -0400 EDT m=+0.279967502 container create 31c4fd686e7ceb6dadbc281a914cd0e02e2570df72af9c7880b1b07cb8b7e930 (image=localhost/caddy-cloudflare:latest, name=caddy, org.opencontainers.image.vendor=Light Code Labs, io.containers.autoupdate=registry, org.opencontainers.image.url=https://caddyserver.com, org.opencontainers.image.title=Caddy, org.opencontainers.image.version=v2.7.6, PODMAN_SYSTEMD_UNIT=caddy.service, org.opencontainers.image.licenses=Apache-2.0, org.opencontainers.image.description=a powerful, enterprise-ready, open source web server with automatic HTTPS written in Go, io.buildah.version=1.35.3, org.opencontainers.image.documentation=https://caddyserver.com/docs, org.opencontainers.image.source=https://github.com/caddyserver/caddy-docker)
May 08 08:44:10 cafo podman[2623]: 2024-05-08 08:44:10.658988869 -0400 EDT m=+2.233901386 container init 31c4fd686e7ceb6dadbc281a914cd0e02e2570df72af9c7880b1b07cb8b7e930 (image=localhost/caddy-cloudflare:latest, name=caddy, io.containers.autoupdate=registry, org.opencontainers.image.vendor=Light Code Labs, org.opencontainers.image.title=Caddy, org.opencontainers.image.version=v2.7.6, PODMAN_SYSTEMD_UNIT=caddy.service, org.opencontainers.image.description=a powerful, enterprise-ready, open source web server with automatic HTTPS written in Go, org.opencontainers.image.source=https://github.com/caddyserver/caddy-docker, org.opencontainers.image.documentation=https://caddyserver.com/docs, org.opencontainers.image.licenses=Apache-2.0, org.opencontainers.image.url=https://caddyserver.com, io.buildah.version=1.35.3)
May 08 08:44:10 cafo podman[2623]: 2024-05-08 08:44:10.663834263 -0400 EDT m=+2.238746767 container start 31c4fd686e7ceb6dadbc281a914cd0e02e2570df72af9c7880b1b07cb8b7e930 (image=localhost/caddy-cloudflare:latest, name=caddy, org.opencontainers.image.title=Caddy, org.opencontainers.image.url=https://caddyserver.com, io.containers.autoupdate=registry, org.opencontainers.image.version=v2.7.6, org.opencontainers.image.documentation=https://caddyserver.com/docs, org.opencontainers.image.description=a powerful, enterprise-ready, open source web server with automatic HTTPS written in Go, org.opencontainers.image.source=https://github.com/caddyserver/caddy-docker, io.buildah.version=1.35.3, PODMAN_SYSTEMD_UNIT=caddy.service, org.opencontainers.image.licenses=Apache-2.0, org.opencontainers.image.vendor=Light Code Labs)
May 08 08:44:10 cafo caddy[2623]: 31c4fd686e7ceb6dadbc281a914cd0e02e2570df72af9c7880b1b07cb8b7e930
May 08 08:44:10 cafo systemd[2229]: Started caddy.service - Caddy Quadlet.
May 08 08:44:10 cafo caddy[4315]: {"level":"info","ts":1715172250.8058825,"msg":"using provided configuration","config_file":"/etc/caddy/Caddyfile","config_adapter":"caddyfile"}
[...]
May 08 08:44:29 cafo caddy[4315]: {"level":"error","ts":1715172269.6472433,"logger":"http.log.error","msg":"dial tcp 192.168.1.110:8123: i/o timeout","request":{"remote_ip":"10.89.4.3","remote_port":"44706","client_ip":"10.89.4.3","proto":"HTTP/1.1","method":"GET","host":"[snip]","uri":"/api/websocket","headers":{"Connection":["Upgrade"],"Pragma":["no-cache"],"User-Agent":["Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"],"Sec-Websocket-Key":["BYDI1TwAOAe/AhBOIx5ZLg=="],"Cookie":[],"Sec-Websocket-Extensions":["permessage-deflate; client_max_window_bits"],"Upgrade":["websocket"],"Sec-Websocket-Version":["13"],"Origin":["https://[snip]"],"Accept-Encoding":["gzip, deflate, br, zstd"],"Accept-Language":["en-US,en;q=0.9,es;q=0.8"],"Cache-Control":["no-cache"]},"tls":{"resumed":true,"version":772,"cipher_suite":4865,"proto":"http/1.1","server_name":"[snip]"}},"duration":3.001148368,"status":502,"err_id":"xdft51fww","err_trace":"reverseproxy.statusError (reverseproxy.go:1267)"}

Please let me know what other logs I can provide.

flacks changed the title from "Rootless Podman DNS resolution not working on boot until after removing user rootless-netns folder and restarting" to "Rootless Podman networking not working on boot until after removing user rootless-netns folder and restarting service" on May 8, 2024
Member

Luap99 commented May 8, 2024

If the issue is networking, then I strongly suspect this is a duplicate of #22197. I suggest you try one of the workarounds there.

Author

flacks commented May 9, 2024

Ok, as per #22197, I've added my own user network-online.service:

❯ cat .config/systemd/user/network-online.service 
[Unit]
Description=Wait for Network to be Configured

[Service]
Type=oneshot
ExecStart=/usr/lib/systemd/systemd-networkd-wait-online
RemainAfterExit=yes

[Install]
WantedBy=default.target

I also configured all my quadlets with Wants=network-online.service and After=network-online.service, and they seem to start up correctly at system boot with functional networking.
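With several quadlets it is easy to miss one, so a quick check that each file declares both dependencies may help. This is a minimal sketch: the function name is hypothetical, and it only matches the Wants= and After= lines as written in this comment.

```shell
#!/bin/sh
# Sketch: verify that a quadlet file declares both dependencies on the
# user-level network-online.service. The function name is an illustrative
# assumption.

# Succeeds only if the file has both a Wants= and an After= line that
# mention network-online.service.
has_network_online_deps() {
    file="$1"
    grep -Eq '^Wants=.*network-online\.service' "$file" &&
        grep -Eq '^After=.*network-online\.service' "$file"
}

# Example loop over the quadlet directory mentioned in the report:
# for f in "$HOME"/.config/containers/systemd/*.container; do
#     has_network_online_deps "$f" || echo "missing dependency: $f"
# done
```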

This is definitely a workaround, and not a true solution, so up to you @Luap99 whether to close this issue and track a solution in #22197 or otherwise.

Nevertheless, thank you for pointing to that issue. A workaround is better than having no way to get functional services other than manual intervention, especially since my server auto-updates and restarts.

Member

Luap99 commented May 15, 2024

Thanks, I'll close it as a duplicate then.

Luap99 closed this as not planned on May 15, 2024
stale-locking-app bot added the "locked - please file new issue/PR" label on Aug 14, 2024
stale-locking-app bot locked as resolved and limited conversation to collaborators on Aug 14, 2024