rootless dns broken #16369

Closed
MartinX3 opened this issue Nov 1, 2022 · 28 comments
Labels: aardvark, kind/bug, locked - please file new issue/PR, stale-issue

Comments

@MartinX3

MartinX3 commented Nov 1, 2022

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

I can't use DNS names to communicate with pods.

Steps to reproduce the issue:

  1. nslookup smtp-gateway

Describe the results you received:
Host Journal

aardvark-dns[6156]: 32479 dns request failed: request timed out

Host

$ cat /etc/resolv.conf 
# This is /run/systemd/resolve/stub-resolv.conf managed by man:systemd-resolved(8).
# Do not edit.
#
# This file might be symlinked as /etc/resolv.conf. If you're looking at
# /etc/resolv.conf and seeing this text, you have followed the symlink.
#
# This is a dynamic resolv.conf file for connecting local clients to the
# internal DNS stub resolver of systemd-resolved. This file lists all
# configured search domains.
#
# Run "resolvectl status" to see details about the uplink DNS servers
# currently in use.
#
# Third party programs should typically not access this file directly, but only
# through the symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a
# different way, replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

nameserver 127.0.0.53
options edns0 trust-ad
search online.net

Container

# cat /etc/resolv.conf 
search dns.podman online.net.
nameserver 10.89.0.1
nameserver 8.8.8.8
nameserver 8.8.4.4

Container

root@dyndns:/# nslookup smtp-gateway
Server:		10.89.0.1
Address:	10.89.0.1:53

** server can't find smtp-gateway.dns.podman: NXDOMAIN

** server can't find smtp-gateway.dns.podman: NXDOMAIN

*** Can't find smtp-gateway.online.net.: No answer
*** Can't find smtp-gateway.online.net.: No answer

Describe the results you expected:
DNS to IP resolution.

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

Client:       Podman Engine
Version:      4.3.0
API Version:  4.3.0
Go Version:   go1.19.2
Git Commit:   ad42af94903ce4f3c3cd0693e4e17e4286bf094b-dirty
Built:        Wed Oct 19 23:09:30 2022
OS/Arch:      linux/amd64

Output of podman info:

host:
  arch: amd64
  buildahVersion: 1.28.0
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: /usr/bin/conmon is owned by conmon 1:2.1.4-1
    path: /usr/bin/conmon
    version: 'conmon version 2.1.4, commit: bd1459a3ffbb13eb552cc9af213e1f56f31ba2ee'
  cpuUtilization:
    idlePercent: 99.7
    systemPercent: 0.04
    userPercent: 0.26
  cpus: 8
  distribution:
    distribution: arch
    version: unknown
  eventLogger: journald
  hostname: backupserver
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 10000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 10000
      size: 65536
  kernel: 5.15.76-1-lts
  linkmode: dynamic
  logDriver: journald
  memFree: 27072786432
  memTotal: 33437605888
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: /usr/bin/crun is owned by crun 1.6-1
    path: /usr/bin/crun
    version: |-
      crun version 1.6
      commit: 18cf2efbb8feb2b2f20e316520e0fd0b6c41ef4d
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /etc/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: /usr/bin/slirp4netns is owned by slirp4netns 1.2.0-1
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.4
  swapFree: 67638333440
  swapTotal: 67645722624
  uptime: 48h 29m 11.00s (Approximately 2.00 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries: {}
store:
  configFile: /home/backupserver/.config/containers/storage.conf
  containerStore:
    number: 10
    paused: 0
    running: 10
    stopped: 0
  graphDriverName: btrfs
  graphOptions: {}
  graphRoot: /home/backupserver/.local/share/containers/storage
  graphRootAllocated: 1965484457984
  graphRootUsed: 3847544832
  graphStatus:
    Build Version: Btrfs v6.0
    Library Version: "102"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 5
  runRoot: /run/user/1000/containers
  volumePath: /home/backupserver/.local/share/containers/storage/volumes
version:
  APIVersion: 4.3.0
  Built: 1666213770
  BuiltTime: Wed Oct 19 23:09:30 2022
  GitCommit: ad42af94903ce4f3c3cd0693e4e17e4286bf094b-dirty
  GoVersion: go1.19.2
  Os: linux
  OsArch: linux/amd64
  Version: 4.3.0

Package info (e.g. output of rpm -q podman or apt list podman or brew info podman):

podman 4.3.0-1

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)

Yes

Additional environment details (AWS, VirtualBox, physical, etc.):

N/A

@openshift-ci openshift-ci bot added the kind/bug Categorizes issue or PR as related to a bug. label Nov 1, 2022
@Luap99
Member

Luap99 commented Nov 1, 2022

please test nslookup inside podman unshare --rootless-netns, and check how /etc/resolv.conf looks in this namespace.

@MartinX3
Author

MartinX3 commented Nov 1, 2022

$ podman unshare --rootless-netns cat /etc/resolv.conf
search online.net.
nameserver 10.0.2.3
nameserver 2001:bc8:401::3
nameserver 2001:bc8:1::16

$ podman unshare --rootless-netns nslookup smtp-gateway
;; Got SERVFAIL reply from 10.0.2.3, trying next server
Server:		2001:bc8:401::3
Address:	2001:bc8:401::3#53

** server can't find smtp-gateway: NXDOMAIN

$ podman unshare --rootless-netns nslookup duckdns.org
Server:		10.0.2.3
Address:	10.0.2.3#53

Non-authoritative answer:
Name:	duckdns.org
Address: 99.79.152.197

@Luap99
Member

Luap99 commented Nov 1, 2022

Wait, smtp-gateway is a container/pod name? I overlooked that at first. If that is the case please provide a full reproducer, how are the pods/containers created?

It looks like the name is not found in aardvark's db, so aardvark tries to resolve it upstream, which then times out.
You can check the aardvark db content with cat $XDG_RUNTIME_DIR/containers/networks/aardvark-dns/<netname>

@MartinX3
Author

MartinX3 commented Nov 1, 2022

$ cat /run/user/1000/containers/networks/aardvark-dns/podman-default-kube-network 
10.89.0.1
717cdee4c5cea03ef5c851ccb5edb3392d4585b58de671dec42bdd354c20fdca 10.89.0.4  dyndns,717cdee4c5ce
d0d9a9ca4126623b4b7bc622887a5f8b01eecde766e78975e207dd7dea06123b 10.89.0.10  smtp,d0d9a9ca4126
e445e5e07f1026efa6eecfcc7884890810781240bacc5a2ee67fabf88e2dcfb6 10.89.0.12  borg,e445e5e07f10
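
A quick way to check whether a given name is registered in such a file is sketched below. The function name `aardvark_lookup` is hypothetical, and the file layout it assumes (first line is the listen address, every other line is a container ID, an IPv4 address, an optional IPv6 address, and a comma-separated name list) is inferred from the output above rather than taken from a format specification.

```sh
# aardvark_lookup FILE NAME
# Prints the IPv4 address registered for NAME in an aardvark-dns network file
# and returns non-zero when NAME is not listed.
# Assumed layout (inferred from the listing above, not from a spec):
#   line 1:  listen address(es) of the DNS server
#   others:  <container-id> <ipv4> [<ipv6>] <name>[,<name>...]
aardvark_lookup() {
    awk -v want="$2" '
        NR == 1 { next }                 # skip the listen-address line
        {
            n = split($NF, names, ",")   # last field holds the comma-separated names
            for (i = 1; i <= n; i++)
                if (names[i] == want) { print $2; found = 1 }
        }
        END { exit found ? 0 : 1 }
    ' "$1"
}
```

Run against the file above, `aardvark_lookup <file> smtp` would print the pod's address, while `aardvark_lookup <file> smtp-gateway` would print nothing, which matches the NXDOMAIN seen inside the container.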

I used systemctl --user enable --now podman-kube@$(systemd-escape $(pwd)/smtp-gateway-pod.yaml).service
It created a network for the kube pods automatically.

# smtp-gateway-pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: smtp
  labels:
    io.containers.autoupdate: registry
spec:
  restartPolicy: on-failure
  containers:
    - name: gateway
      image: docker.io/yoryan/mailrise:latest
      imagePullPolicy: always
      ports:
        - containerPort: 8025
          hostPort: 25
          name: smtp
          protocol: TCP
      resources:
        requests:
          cpu: "10m"
          memory: "128Mi"
        limits:
          cpu: "100m"
          memory: "128Mi"
      volumeMounts:
        - name: mailrise.conf-hostpath
          mountPath: /etc/mailrise.conf
  volumes:
    - name: mailrise.conf-hostpath
      hostPath:
        path: ./mailrise.conf # Change to bash output of $(pwd)/mailrise.conf
# /etc/mailrise.conf

configs:
  [email protected]:
    urls:
      - tgram://{bot_token}/

@Luap99
Member

Luap99 commented Nov 1, 2022

apiVersion: v1
kind: Pod
metadata:
  name: smtp

If the name is smtp, then you have to use that as the DNS name. podman run has a --network-alias option which can be used to specify more names, but I am not sure if this is supported with kube yaml.

@MartinX3
Author

MartinX3 commented Nov 1, 2022

But I can have multiple containers with different names in the same pod

Also, if I simply use "smtp", it tries to look it up on the internet with `online.net.`

Is aardvark-dns[6156]: 32479 dns request failed: request timed out maybe caused by the wrong priority?

But even if I open port 53 in firewallD in UDP and TCP the same journal error happens.

@Luap99
Member

Luap99 commented Nov 1, 2022

All containers inside the pod are in the same netns (they share the same IP), so you need to use the pod name. The network setup is only run for the infra container, which uses the pod name as its DNS name.

@MartinX3
Author

MartinX3 commented Nov 1, 2022

It doesn't work inside the container.

And what could fix this error?
aardvark-dns[6156]: 3430 dns request failed: request timed out

I just use systemD-resolved.

Inside container:

root@dyndns:/# nslookup smtp
Server:		10.89.0.1
Address:	10.89.0.1:53

Non-authoritative answer:
Name:	smtp.dns.podman
Address: 10.89.0.10

Non-authoritative answer:

*** Can't find smtp.online.net.: No answer
*** Can't find smtp.online.net.: No answer
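
The two names tried in the output above correspond to the `search` line of the container's /etc/resolv.conf. The sketch below lists the names a resolver would try for a short name; `search_candidates` is a hypothetical helper, and it deliberately ignores `ndots` and other resolver options, so treat it as an approximation of real resolver behaviour.

```sh
# search_candidates NAME [RESOLV_CONF]
# Prints the fully qualified names that would be tried for a short NAME,
# in the order given by the "search" line of RESOLV_CONF
# (default: /etc/resolv.conf). If several "search" lines exist, the last wins.
search_candidates() {
    awk -v name="$1" '
        $1 == "search" { n = 0; for (i = 2; i <= NF; i++) dom[++n] = $i }
        END {
            for (i = 1; i <= n; i++) {
                d = dom[i]
                sub(/\.$/, "", d)        # drop a trailing root dot
                print name "." d
            }
        }
    ' "${2:-/etc/resolv.conf}"
}
```

With the container's resolv.conf shown earlier, `search_candidates smtp` would list `smtp.dns.podman` first and `smtp.online.net` second, which is the order seen in the nslookup output.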

@MartinX3
Author

MartinX3 commented Nov 8, 2022

I tried it on a different computer.
This one is at home behind my router, instead of a rented server on the internet.

I get the same error there:

Nov 08 14:14:36 systemd[1368]: Started /usr/lib/podman/aardvark-dns --config /run/user/1000/containers/networks/aardvark-dns -p 53 run.
Nov 08 14:18:11 aardvark-dns[4473]: 36600 dns request failed: request timed out

And nslookup smtp inside the container results in

Nov 08 14:22:52 aardvark-dns[4473]: Failed while parsing message: unexpected end of input reached
Nov 08 14:22:52 aardvark-dns[4473]: None received while parsing dns message, this is not expected server will ignore this message
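
To see how often the timeout occurs, the journal text can be filtered. This is a minimal sketch: `count_dns_timeouts` is a hypothetical helper, and the pattern is matched against the exact message format shown in the log lines above, so it may need adjusting for other aardvark-dns versions.

```sh
# count_dns_timeouts
# Reads journal text on stdin and prints the number of aardvark-dns
# "dns request failed: request timed out" lines it contains.
count_dns_timeouts() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps the
    # function usable under "set -e" while still printing 0.
    grep -c 'aardvark-dns\[[0-9]*\]: [0-9]* dns request failed: request timed out' || true
}
```

Assuming the messages end up in the user journal, as they do for a rootless setup like the one above, a typical call would be `journalctl --user --since today | count_dns_timeouts`.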

@github-actions

github-actions bot commented Dec 9, 2022

A friendly reminder that this issue had no activity for 30 days.

@MartinX3
Author

MartinX3 commented Dec 9, 2022

/remove stale

@tisc0

tisc0 commented Dec 15, 2022

Same here,

$ cat /etc/redhat-release 
Red Hat Enterprise Linux release 8.7 (Ootpa)

$ podman version
Client:       Podman Engine
Version:      4.2.0
API Version:  4.2.0
Go Version:   go1.18.4
Built:        Tue Nov 22 14:04:24 2022
OS/Arch:      linux/amd64

$ podman-compose version
['podman', '--version', '']
using podman version: 4.2.0
podman-composer version  1.0.3
podman --version 
podman version 4.2.0

$ podman info |grep netw
  networkBackend: netavark

$ rpm -qa |grep -E 'podman|atav|aard'
podman-catatonit-4.2.0-4.0.1.module+el8.7.0+20884+3747d2d0.x86_64
podman-plugins-4.2.0-4.0.1.module+el8.7.0+20884+3747d2d0.x86_64
podman-4.2.0-4.0.1.module+el8.7.0+20884+3747d2d0.x86_64
aardvark-dns-1.1.0-5.module+el8.7.0+20877+e0f9ac15.x86_64
podman-gvproxy-4.2.0-4.0.1.module+el8.7.0+20884+3747d2d0.x86_64
python3-podman-4.2.1-1.module+el8.7.0+20877+e0f9ac15.noarch
podman-compose-1.0.3-3.el8.noarch

$ uname -a
Linux myServer 4.18.0-305.7.1.el8_4.x86_64 #1 SMP Tue Jun 29 21:55:12 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

docker-compose.yml

version: '3'
  
services:
  mariadb:
    image: 'localhost/mariadb_xxx:v1.2'
    ports:
      - '127.0.0.1:3308:3306'
    volumes:
      - /home/user/services/prestashop/data/mariadb_data:/var/lib/mysql:rw,Z
      - ...
    environment:
      MYSQL_ROOT_PASSWORD: "$MYSQL_ROOT_PASSWORD"

  prestashop:
    image: 'localhost/httpd-php74:v2.1'
    ports:
      - '127.0.0.1:8402:80'
    volumes:
      - /home/user/services/prestashop/data/httpd:/etc/httpd:rw,Z
      - ...
    depends_on:
      - mariadb

I can't access port 3306 from prestashop, either via 127.0.0.1 or via the name mariadb. I have to look up the container IP, which changes at every restart... not exactly practical.

Thanks for your help !

@baszoetekouw

I'm seeing similar issues on Debian, with podman 4.3.1 and aardvark-dns 1.0.3 and 1.4.0. Note though that I'm getting these issues also for rootful containers.

However, I strongly suspect this is a bug in aardvark-dns. I can see that podman is injecting the correct config into /run/containers/networks/aardvark-dns/, and if I replace the Debian-provided /usr/lib/podman/aardvark-dns with a manually built version from aardvark-dns main, all issues disappear and everything works.

I'm trying to pinpoint what the issue is exactly. At this point, it could either be an issue in one of aardvark-dns's dependencies (as Debian patches aardvark-dns to use older dependencies), or some weirdness in the Debian build process.

@rhatdan
Member

rhatdan commented Dec 22, 2022

@vrothberg @flouthoc thoughts?

@vrothberg
Member

Sounds like a packaging issue in Debian.

@jmaris

jmaris commented Dec 28, 2022

I have a similar issue in Fedora Silverblue whereby rootless dns resolution within containers is broken: 1/3 of requests with the name of other containers fail with no explanation.

I believe this is probably tied to this issue:

containers/aardvark-dns#248
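
An intermittent failure like this is easier to report with a number attached. The sketch below repeats a command and counts how many runs fail; `failure_count` is a hypothetical helper, and it relies on the lookup command returning a non-zero exit status on failure, which is an assumption that should be checked for the resolver tool in use.

```sh
# failure_count N CMD [ARGS...]
# Runs CMD N times, discarding its output, and prints how many runs
# exited with a non-zero status.
failure_count() {
    n=$1
    shift
    fails=0
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" >/dev/null 2>&1 || fails=$((fails + 1))
        i=$((i + 1))
    done
    echo "$fails"
}
```

Inside a container this could be called as `failure_count 30 getent hosts <other-container>`, where the container name is a placeholder; roughly 10 failures out of 30 would correspond to the one-in-three rate described above.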

@flouthoc
Collaborator

flouthoc commented Jan 1, 2023

I'm seeing similar issues on Debian, with podman 4.3.1 and aardvark-dns 1.0.3 and 1.4.0. Note though that I'm getting these issues also for rootful containers.

@baszoetekouw This (#16369 (comment)) is a packaging issue: an older aardvark-dns is not compatible with a newer netavark, so they must be on the same version.
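
A packaging check for this could compare the two installed versions. The sketch below is only an illustration: `same_release` is a hypothetical helper, and treating "same version" as "same major.minor" is an assumption, not something stated in this thread.

```sh
# same_release A B
# Succeeds when two dotted version strings share the same major.minor
# prefix, for example 1.4.0 and 1.4.2. Assumption: matching major.minor
# is what "same version" means for aardvark-dns and netavark.
same_release() {
    a=$(printf '%s' "$1" | cut -d. -f1,2)
    b=$(printf '%s' "$2" | cut -d. -f1,2)
    [ "$a" = "$b" ]
}
```

The version strings themselves would come from the package manager, for instance from the `rpm -qa` output shown earlier in this thread.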

@baszoetekouw

I doubt that this is the problem here: I've tried aardvark-dns versions 1.3, 1.4 and main. None of those worked when built as Debian packages, but at least 1.4 and main worked fine when built manually (i.e., running cargo build and then copying the resulting binary to /usr/lib/podman/).

I assume the issue is with one of the dependencies, as Debian builds aardvark-dns using the Rust packages that are available in the distribution rather than the ones specified in Cargo.toml.

@siretart
Contributor

siretart commented Jan 1, 2023

@baszoetekouw that would indeed indicate that the issue is triggered by one (or more) of the dependency packages. Strangely, the supplied unit tests of aardvark-dns all pass.

Here is a list of modifications I've made in Debian to the dependencies:
https://salsa.debian.org/debian/aardvark-dns/-/blob/8259b42962440d40bc69e3a7ea1678b41eab5fc6/debian/patches/update-dependencies.patch

Is there anything suspicious that would explain this phenomenon?

What surprises me is that containers/aardvark-dns#248 seems to indicate the issue also exists on arch linux.

@siretart
Contributor

siretart commented Jan 1, 2023

I'm seeing similar issues on Debian, with podman 4.3.1 and aardvark-dns 1.0.3 and 1.4.0. Note though that I'm getting these issues also for rootful containers.

@baszoetekouw This (#16369 (comment)) is a packaging issue: an older aardvark-dns is not compatible with a newer netavark, so they must be on the same version.

Just FTR, Debian does not currently ship 1.4.0; both aardvark-dns and netavark are currently at 1.0.3. I have updated both packages locally but haven't uploaded them yet, as I can reproduce the issue with both packages updated to 1.4.0 on my laptop.

@siretart
Contributor

siretart commented Jan 3, 2023

Hm, trying to compile aardvark-dns with trust-dns upstream in containers/aardvark-dns#275 seems to trigger the same symptom as discussed in this bug.

Coincidence?

@SecT0uch

SecT0uch commented Jan 5, 2023

I also have a similar issue on a barebone Alpine Linux Edge.

@github-actions

github-actions bot commented Feb 5, 2023

A friendly reminder that this issue had no activity for 30 days.

@MartinX3
Author

MartinX3 commented Feb 5, 2023

/remove stale

@siretart
Contributor

siretart commented Feb 5, 2023

I believe this to be resolved in debian/sid with aardvark-dns_1.4.0-3 and netavark_1.4.0-3.

Please let me know (ideally with bugs filed in debian) if you are still experiencing issues and how to reproduce them.

@MartinX3
Author

MartinX3 commented Feb 5, 2023

I use arch linux.

@Luap99
Member

Luap99 commented Feb 5, 2023

The fix is in v1.5 upstream and I guess arch already ships that. If your problem still exists in that version there is another bug.

@MartinX3
Author

MartinX3 commented Feb 5, 2023

Ah, I didn't test 1.5
I thought the fixing commit would get linked to this issue ticket.

I also don't see the fix in the changelog
https://github.com/containers/aardvark-dns/releases/tag/v1.5.0

@rhatdan rhatdan closed this as completed Feb 5, 2023
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 2, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 2, 2023