Regression 3.4: missing entries in /etc/hosts #12003

Closed
douglas-legulas opened this issue Oct 17, 2021 · 42 comments · Fixed by #13918
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. network Networking related issue or feature

Comments

@douglas-legulas

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

Example: a pod starts 3 containers named example-mariadb, example-php and example-httpd. Because their /etc/hosts file is missing the required entries, they cannot locate one another through hostname resolution. In this example, PHP fails to locate the MariaDB container by hostname example-mariadb. As a symptom PHP logs the error:

php_network_getaddresses: getaddrinfo failed: Name or service not known
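The same getaddrinfo failure can be reproduced outside the pod with any name that has neither a hosts entry nor a DNS record; the sketch below uses the reserved `.invalid` TLD (RFC 6761) as a stand-in for the missing `example-mariadb` entry:

```python
import socket

# A name with no /etc/hosts entry and no DNS record fails inside
# getaddrinfo, which is exactly the error PHP surfaces above.
# ".invalid" is reserved and guaranteed never to resolve, so it
# stands in here for the missing container hostname.
try:
    socket.getaddrinfo("example-mariadb.invalid", 3306)
    print("resolved")
except socket.gaierror as err:
    print(f"getaddrinfo failed: {err}")
```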

/etc/hosts available inside containers before v3.4:

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

10.88.0.4 example 081aafb26c65-infra
127.0.1.1 example example-mariadb
127.0.1.1 example example-httpd
127.0.1.1 example example-php

/etc/hosts available inside containers in v3.4:

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

10.88.0.5 example 081aafb26c65-infra
10.88.0.1 host.containers.internal

Changes in v3.4 that may have caused the regression: #11411, #11596

Steps to reproduce the issue:

  1. Build and start the pod at https://1drv.ms/u/s!ApqEOOKoIKfoeo7T9Yp5pk5cM9U?e=c0Q086

  2. Exec into the shell of one of the containers and check its /etc/hosts.

Output of podman version:

Version:      3.4.0
API Version:  3.4.0
Go Version:   go1.16.8
Built:        Thu Sep 30 16:40:21 2021
OS/Arch:      linux/amd64

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.23.1
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - hugetlb
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.0.30-2.fc34.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.30, commit: '
  cpus: 8
  distribution:
    distribution: fedora
    variant: workstation
    version: "34"
  eventLogger: journald
  hostname: alq22
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 5.14.11-200.fc34.x86_64
  linkmode: dynamic
  logDriver: k8s-file
  memFree: 9558335488
  memTotal: 16698474496
  ociRuntime:
    name: crun
    package: crun-1.2-1.fc34.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.2
      commit: 4f6c8e0583c679bfee6a899c05ac6b916022561b
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.12-2.fc34.x86_64
    version: |-
      slirp4netns version 1.1.12
      commit: 7a104a101aa3278a2152351a082a6df71f57c9a3
      libslirp: 4.4.0
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.0
  swapFree: 8589930496
  swapTotal: 8589930496
  uptime: 59m 50.09s
plugins:
  log:
  - k8s-file
  - none
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 7
    paused: 0
    running: 4
    stopped: 3
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev,metacopy=on
  graphRoot: /var/lib/containers/storage
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "true"
  imageStore:
    number: 11
  runRoot: /run/containers/storage
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 3.4.0
  Built: 1633030821
  BuiltTime: Thu Sep 30 16:40:21 2021
  GitCommit: ""
  GoVersion: go1.16.8
  OsArch: linux/amd64
  Version: 3.4.0

Package info (e.g. output of rpm -q podman or apt list podman):

podman-3.4.0-1.fc34.x86_64

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/master/troubleshooting.md)

Yes

Additional environment details (AWS, VirtualBox, physical, etc.):
Fedora Workstation 34 (amd64).

@openshift-ci openshift-ci bot added the kind/bug Categorizes issue or PR as related to a bug. label Oct 17, 2021
@mheon mheon self-assigned this Oct 18, 2021
@zhangguanzhang
Collaborator

I cannot access the url: https://1drv.ms/u/s!ApqEOOKoIKfoeo7T9Yp5pk5cM9U?e=c0Q086
could you paste the content to here?

@o-alquimista

Sure.

LegulasPod.zip

@WoollyMammal

I was just about to post my own report on pod-internal containers failing to resolve hostnames, but while filling out the bug report I learned it was an issue with the hosts file. Rechecking the issues for "hosts" and... this issue already exists. sigh

If you need more details, or just want another sample, let me know and I'll post my details here too.

@douglas-legulas
Author

@zhangguanzhang I forgot to mention that @o-alquimista is another account of mine. Sorry about that.

@rhatdan
Member

rhatdan commented Nov 2, 2021

Could you show the podman command used to create the /etc/hosts file?

@douglas-legulas
Author

Could you show the podman command used to create the /etc/hosts file?

Sorry, what do you mean? The pod building scripts are here: https://github.com/containers/podman/files/7388549/LegulasPod.zip

@rhatdan
Member

rhatdan commented Nov 5, 2021

Could you make a simple reproducer that we could examine?

@douglas-legulas
Author

douglas-legulas commented Nov 5, 2021

I've added a virtual host and an example webpage.
LegulasPod.zip

To reproduce:

  1. Build the pod (as root):
     $ cd LegulasPod
     # sh ./scripts/build.sh
  2. Check /etc/hosts:
     # podman exec legulas-php cat /etc/hosts

Expected results (before v3.4):

legulas c7d7ce974eb3-infra
10.88.0.1 host.containers.internal
127.0.0.1 legulas legulas-httpd
127.0.0.1 legulas legulas-php
127.0.0.1 legulas legulas-mariadb

Actual results (now with v3.4):

legulas c7d7ce974eb3-infra
10.88.0.1 host.containers.internal

@rhatdan
Member

rhatdan commented Nov 8, 2021

@Luap99 PTAL

@github-actions

github-actions bot commented Dec 9, 2021

A friendly reminder that this issue had no activity for 30 days.

@douglas-legulas
Author

Were you able to reproduce the issue?

@rhatdan
Member

rhatdan commented Dec 9, 2021

@Luap99 did you ever get a chance to look at this?

@Luap99
Member

Luap99 commented Dec 9, 2021

No, unfortunately I have no spare time at the moment to look at this. I am very busy with netavark work.

@rhatdan
Member

rhatdan commented Dec 9, 2021

@douglas-legulas any chance you get this to happen without docker-compose? I.e., can you get it to happen with a simpler podman run or podman --remote run?

@douglas-legulas
Author

I'm using podman pod, not docker compose. It's just buildah and podman, nothing else.

But I'll try to produce something simpler, starting from scratch. I'll do it this weekend.

@douglas-legulas
Author

douglas-legulas commented Dec 11, 2021

Here's a greatly simplified pod setup:
testpod.zip

It builds, without any modification, two local images named localhost/testpod-httpd and localhost/testpod-php, based on httpd and php respectively. They run in a pod named testpod, and the containers are named testpod-httpd and testpod-php. When you open http://localhost:8080 in your browser, you see the Apache "It works!" page.

To reproduce

  1. cd ./testpod/scripts
  2. This will build and run the pod: sudo sh build.sh
  3. sudo podman exec -it testpod-httpd bash or sudo podman exec -it testpod-php bash
  4. cat /etc/hosts

Expected /etc/hosts:

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

10.88.0.3 testpod 8c69480aec9f-infra
10.88.0.1 host.containers.internal
127.0.0.1 testpod testpod-httpd
127.0.0.1 testpod testpod-php

Actual /etc/hosts

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

10.88.0.3 testpod 8c69480aec9f-infra
10.88.0.1 host.containers.internal

Consequence

The containers cannot resolve hostnames testpod-httpd and testpod-php.

Additional info

  • Podman 3.4.2
  • Buildah 1.23.1 (image-spec 1.0.1-dev, runtime-spec 1.0.2-dev)
  • Fedora Workstation 35 (amd64)

@nupplaphil

I can reproduce it as well.

Setup

$ sudo podman pod create --name friendica --hostname friendica -p 8012:80
$ sudo podman run -d --name friendica-mariadb --pod friendica -e PUID=1001 -e PGID=1001 --mount type=bind,src=/tmp/friendica/db,dst=/var/lib/mysql --restart=unless-stopped --env MYSQL_HOST=localhost --env MYSQL_PORT=3306 --env MYSQL_DATABASE=friendica --env MYSQL_USER=friendica --env MYSQL_PASSWORD=friendica  --env MYSQL_RANDOM_ROOT_PASSWORD=yes mariadb:latest
$ sudo podman run -d --name friendica-fpm --pod friendica --env TZ=Europe/Berlin -e PUID=1001 -e PGID=1001 --mount type=bind,src=/tmp/friendica/html,dst=/var/www/html --restart=unless-stopped --env MYSQL_USER=friendica --env MYSQL_PASSWORD=friendica --env MYSQL_DATABASE=friendica --env MYSQL_HOST=friendica-mariadb --env [email protected] --env FRIENDICA_SITENAME=friendica.at.home --env FRIENDICA_TZ='Europe/Zurich' --env FRIENDICA_URL='https://friendica.at.home' friendica:fpm

Output /etc/hosts

$ sudo podman exec friendica-fpm cat /etc/hosts
[...] (copied lines from my host)

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
10.88.0.2       friendica 672deab3f9e2-infra
10.88.0.1 host.containers.internal

Additional info

  • Podman 3.4.2
  • Ubuntu bullseye/sid

@Luap99
Member

Luap99 commented Dec 12, 2021

@douglas-legulas Thanks
I think it was caused by #11605, which fixed #11596. To make sure that the other issue stays fixed, I think the expected result should look like this:

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

10.88.0.3 testpod 8c69480aec9f-infra
10.88.0.1 host.containers.internal
10.88.0.3 testpod testpod-httpd
10.88.0.3 testpod testpod-php
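To see why this format still fixes name resolution, one can look the names up the way the resolver does: scan /etc/hosts top to bottom and take the IP of the first line whose aliases contain the name. A minimal Python sketch (not Podman code, just an illustration):

```python
def lookup(hosts_text, name):
    """Return the IP of the first hosts entry listing name as an alias."""
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        ip, *names = line.split()
        if name in names:
            return ip
    return None

hosts = """\
10.88.0.3 testpod 8c69480aec9f-infra
10.88.0.1 host.containers.internal
10.88.0.3 testpod testpod-httpd
10.88.0.3 testpod testpod-php
"""

# Both container names now resolve to the pod IP instead of 127.0.0.1:
print(lookup(hosts, "testpod-httpd"))  # 10.88.0.3
print(lookup(hosts, "testpod-php"))    # 10.88.0.3
```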

@rhatdan
Member

rhatdan commented Dec 13, 2021

@douglas-legulas WDYT?

@douglas-legulas
Author

As long as the containers can reach each other by hostname (e.g. testpod-httpd, testpod-php), it's fine.

@m3freak

m3freak commented Dec 22, 2021

The hosts file format proposed by @Luap99 doesn't work universally for me, because the services running in my containers are configured to listen for connections on 127.0.0.1. I'm not sure how to work around this without assigning fixed IPs or changing the services to listen on all interfaces. I also don't like the idea of changing daemon configs in images to listen on every interface in the containers and pods. Either way, this isn't going to be fun to fix.
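The concern can be demonstrated with plain sockets, independent of Podman: a server bound to 127.0.0.1 is reachable on that address only. In the sketch below, 127.0.0.2 stands in for the pod IP (on Linux the whole 127.0.0.0/8 range is loopback, so no extra setup is needed):

```python
import socket

# A server bound only to 127.0.0.1, like a daemon configured for loopback.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))          # port 0: let the kernel pick one
server.listen(1)
port = server.getsockname()[1]

# Connecting via the bound address works...
conn = socket.create_connection(("127.0.0.1", port), timeout=2)
conn.close()

# ...but via any other address (127.0.0.2 here, the pod IP in the bug)
# the kernel refuses the connection, because nothing listens there.
# This is what breaks when /etc/hosts maps the container name to the
# pod IP while the service listens on loopback only.
try:
    socket.create_connection(("127.0.0.2", port), timeout=2)
    print("reachable via 127.0.0.2")
except OSError:
    print("refused via 127.0.0.2")

server.close()
```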

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@rhatdan
Member

rhatdan commented Jan 25, 2022

How's this look?

$ podman pod create  --name test 
d89e962197f7e639be879fa7390af2e2787db0bdf7df2ff2612b59c2310e9a2d
$ ./bin/podman run --name c1 --pod test fedora cat /etc/hosts 
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
104.104.88.51 www.redhat.com
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
# used by slirp4netns
10.0.2.100	test d89e962197f7-infra
192.168.1.141	host.containers.internal
10.0.2.100	231da8c7ac6a c1
$ ./bin/podman run --name c2 --pod test fedora cat /etc/hosts 
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
104.104.88.51 www.redhat.com
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
# used by slirp4netns
10.0.2.100	test d89e962197f7-infra
192.168.1.141	host.containers.internal
10.0.2.100	231da8c7ac6a c1
10.0.2.100	08866c279061 c2
$ ./bin/podman stop c1
c1
$ ./bin/podman start --attach c1
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
104.104.88.51 www.redhat.com
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
# used by slirp4netns
10.0.2.100	test d89e962197f7-infra
192.168.1.141	host.containers.internal
10.0.2.100	231da8c7ac6a c1
10.0.2.100	08866c279061 c2

@mheon
Member

mheon commented Jan 26, 2022

Looks fine, though I'm still iffy about having it read-only, if that's the plan.

@rhatdan
Member

rhatdan commented Jan 26, 2022

I will leave them read/write, although I am not sure if we should consider this an attack vector, resolv.conf and hostname might fall in the same category.

I just need to get time to work on tests.

@vkvm

vkvm commented Feb 11, 2022

Just wanted to highlight that there are users like me running podman play kube, for whom this is a showstopper. I've been watching this issue, as I assume it has the same root cause as my reproducer below.

Here's a simple reproducer (running rootless under Fedora 35)

example-pod.yml:

apiVersion: v1
kind: Pod
metadata:
  name: helloworld
spec:
  containers:
    - name: webserver
      image: docker.io/library/nginx:stable
    - name: webclient
      image: docker.io/library/alpine:3.15
      command: ["/bin/sh", "-xc"]
      args:
        # sleep at start due to occasional short delay before /etc/hosts is fully populated
        # sleep at the end to allow time to manually inspect before container exit/restart
        - "sleep 2; cat /etc/hosts; echo; nc -vz helloworld-webserver 80; sleep 60"

podman play kube ./example-pod.yml && sleep 5 && podman logs --tail=10 helloworld-webclient

podman 3.3.1

# used by slirp4netns
10.0.2.100      helloworld 036d5ebec052-infra
10.0.2.2 host.containers.internal
127.0.0.1 helloworld helloworld-webserver
127.0.0.1 helloworld helloworld-webclient
+ echo

+ nc -vz helloworld-webserver 80
helloworld-webserver (127.0.0.1:80) open
+ sleep 60

podman 3.4.4

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
# used by slirp4netns
10.0.2.100      helloworld fe7c74935cf4-infra
10.0.2.2 host.containers.internal

+ echo
+ nc -vz helloworld-webserver 80
nc: bad address 'helloworld-webserver'
+ sleep 60

Anyway, sounds like you have a plan to resolve this. Looking forward to a fix.

@rhatdan
Member

rhatdan commented Feb 11, 2022

I believe this is fixed. Please check on podman 4.0 or against master.

@rhatdan rhatdan closed this as completed Feb 11, 2022
@Luap99
Member

Luap99 commented Feb 11, 2022

There is no PR linked, so when was this fixed? I do not remember any PRs for this.

@rhatdan
Member

rhatdan commented Feb 11, 2022

Perhaps not. I know I looked into this, but I am not sure if I ever fixed it.

@rhatdan rhatdan reopened this Feb 11, 2022
rhatdan added a commit to rhatdan/podman that referenced this issue Feb 17, 2022
When one container shares the network namespace with another container
or with a Pod, there should be an entry added to the /etc/hosts file
for the second container.

Fixes: containers#12003

Signed-off-by: Daniel J Walsh <[email protected]>
@mcejp

mcejp commented Mar 6, 2022

FWIW, I found this issue when googling for problems with DNS resolution among my containers. Turns out I was just missing the podman-plugins package -- perhaps this helps somebody.

@github-actions

github-actions bot commented Apr 6, 2022

A friendly reminder that this issue had no activity for 30 days.

@Luap99 Luap99 assigned Luap99 and unassigned mheon Apr 7, 2022
@Luap99 Luap99 added the network Networking related issue or feature label Apr 7, 2022
Luap99 added a commit to Luap99/libpod that referenced this issue Apr 22, 2022
Use the new logic from c/common to create the hosts file. This will help
to better align the hosts files between buildah and podman.

Also this fixes several bugs:
- remove host entries when container is stopped and has a netNsCtr
- add entries for containers in a pod
- do not duplicate entries in the hosts file
- use the correct slirp IP when a userns is used

Features:
- configure host.containers.internal entry in containers.conf
- configure base hosts file in containers.conf

Fixes containers#12003
Fixes containers#13224

Signed-off-by: Paul Holzinger <[email protected]>
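The two configuration features from that commit message surface as options in containers.conf. A sketch of what the fragment looks like (option names per containers.conf(5); the comments are my paraphrase, so verify against the man page for your installed version):

```toml
# /etc/containers/containers.conf (fragment)
[containers]
# File used as the base for every container's /etc/hosts.
# An absolute host path, "image" (use the image's file), or "none"
# (start empty); unset means the host's /etc/hosts is used.
base_hosts_file = ""

# IP written for the host.containers.internal entry; unset lets
# Podman choose a suitable host address automatically.
host_containers_internal_ip = ""
```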
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 20, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 20, 2023