Unable to use internal docker repo over VPN (still) #540

Closed
adamwg opened this issue Sep 7, 2016 · 17 comments

@adamwg

adamwg commented Sep 7, 2016

Expected behavior

When connected to my company VPN (Pulse Secure) I should be able to pull images from our internal docker repo. I thought this would work now that #19 is closed, but I'm still seeing the same symptom I always have. I tried resetting docker, in case it was a remnant of an old VM, but I'm still seeing the problem.

Actual behavior

[~]% docker pull docker.internal.example.com/foo/bar
Using default tag: latest
Error response from daemon: Get https://docker.internal.example.com/v1/_ping: dial tcp: lookup docker.internal.example.com on 192.168.65.1:53: no such host

Information

  • Docker for Mac version 1.12.1-beta25 (build: 11807)
  • Diagnostic ID: 91A265E6-3613-4F84-8542-10457EFD45A9
  • OSX 10.11.6
  • Pulse Secure 5.1.8 (61601)

scutil --dns output (lightly anonymized):

[~]% scutil --dns
DNS configuration

resolver #1
  search domain[0] : internal.example.com
  search domain[1] : consul
  search domain[2] : example.ca
  nameserver[0] : 8.8.8.8
  nameserver[1] : 10.0.0.1
  if_index : 4 (en0)
  flags    : Request A records
Reachable

resolver #2
  domain   : internal.example.com
  nameserver[0] : aaa.bbb.ccc.19
  nameserver[1] : aaa.bbb.ccc.20
  flags    : Request A records
Reachable
  order    : 100600

resolver #3
  domain   : consul
  nameserver[0] : aaa.bbb.ccc.19
  nameserver[1] : aaa.bbb.ccc.20
  flags    : Request A records
Reachable
  order    : 100601

resolver #4
  domain   : local
  options  : mdns
  timeout  : 5
  flags    : Request A records
Not Reachable
  order    : 300000

resolver #5
  domain   : 254.169.in-addr.arpa
  options  : mdns
  timeout  : 5
  flags    : Request A records
Not Reachable
  order    : 300200

resolver #6
  domain   : 8.e.f.ip6.arpa
  options  : mdns
  timeout  : 5
  flags    : Request A records
Not Reachable
  order    : 300400

resolver #7
  domain   : 9.e.f.ip6.arpa
  options  : mdns
  timeout  : 5
  flags    : Request A records
Not Reachable
  order    : 300600

resolver #8
  domain   : a.e.f.ip6.arpa
  options  : mdns
  timeout  : 5
  flags    : Request A records
Not Reachable
  order    : 300800

resolver #9
  domain   : b.e.f.ip6.arpa
  options  : mdns
  timeout  : 5
  flags    : Request A records
Not Reachable
  order    : 301000

DNS configuration (for scoped queries)

resolver #1
  search domain[0] : example.ca
  nameserver[0] : 8.8.8.8
  nameserver[1] : 10.0.0.1
  if_index : 4 (en0)
  flags    : Scoped, Request A records
Reachable

/etc/resolv.conf in my docker VM:

moby:~# cat /etc/resolv.conf
search local
nameserver 192.168.65.1
nameserver 192.168.65.10
nameserver 192.168.65.9
nameserver 192.168.65.8
nameserver 192.168.65.7
nameserver 192.168.65.6
nameserver 192.168.65.5
nameserver 192.168.65.4
nameserver 192.168.65.3

Workaround

I'm able to work around this issue by manually modifying /etc/resolv.conf in the container to point at our internal DNS servers (aaa.bbb.ccc.19 and aaa.bbb.ccc.20 in the above output).

@thehesiod

Same for me: not working even after restarting Docker for Mac.

@aminvielle

aminvielle commented Oct 17, 2016

Same for me with the latest beta release and macOS Sierra.
Same workaround: I must manually update the /etc/resolv.conf inside the VM every time I connect to my company VPN. But these changes are lost every time I restart Docker.

On macOS, /etc/resolv.conf does not provide the relevant DNS information; scutil --dns does.
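
As a rough illustration of reading resolver data from scutil --dns rather than /etc/resolv.conf, here is a minimal sketch that pulls out the nameservers attached to one domain. The function name nameservers_for_domain is made up for this example, and it assumes the output layout shown earlier in this thread (a "domain :" line followed by "nameserver[N] :" lines inside each resolver block).

```shell
# Hypothetical helper: print the nameservers that `scutil --dns` lists
# for one specific domain. The scutil output is read from stdin.
nameservers_for_domain() {
    awk -v want="$1" '
        # A new resolver block starts: forget the previous domain.
        /^resolver #/ { domain = "" }
        # Only look at the main section, not the scoped-query section.
        /for scoped queries/ { exit }
        # Remember which domain this resolver block is responsible for.
        $1 == "domain" && $2 == ":" { domain = $3 }
        # Print nameservers that belong to the requested domain.
        $1 ~ /^nameserver\[[0-9]+\]$/ && $2 == ":" && domain == want { print $3 }
    '
}
```

On a Mac this could be run as `scutil --dns | nameservers_for_domain internal.example.com`; the addresses it prints are the ones the manual /etc/resolv.conf workaround needs.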

Docker for Mac: version: 1.12.2-beta28 (71c4a00)
OS X: version 10.12 (build: 16A323)
Diagnostic ID: 08A647BA-16C4-4963-8DF4-71C054E45AB1

@ShannonHickey

ShannonHickey commented Nov 10, 2016

I'm having the same problem and have spent many hours trying to find a solution. Hopefully the information I provide below helps Docker with an investigation. Note that some info has been redacted.

I'm using Docker for Mac, stable version:

$ docker --version
Docker version 1.12.3, build 6b644ec

Part of our Docker build includes pulling images from an in-house artifactory repository. For some reason, starting today, this began failing. During a Docker build, numerous artifacts would pull from artifactory just fine and then suddenly the build would hit an inability to resolve the same host (that had been resolving just perfectly seconds before in the same Docker build).

Note that I am on VPN using Cisco AnyConnect.

I've found this to be very reproducible. Here's a very simple Dockerfile that builds an alpine image and installs bash. You can then hop into the image to start trying to resolve hosts.

Start with Dockerfile:

FROM alpine:3.4

RUN apk update && \
    apk add --no-cache \
      bash \
      curl

Now build and run bash:

$ docker build -t myimage .
$ docker run -it myimage bash

When I try to look up our in-house artifactory inside the container:

for i in {0..10}; do nslookup artifactory.corp.redacted.com; done

A number of the lookups succeed and others fail. Huh?!
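
To put a number on the flakiness instead of eyeballing the output, something like the following could be run inside the container. It is only a sketch: count_results is a hypothetical helper, and it assumes the lookup command exits non-zero when resolution fails.

```shell
# Hypothetical helper: run a command N times and count how often it
# succeeds or fails, judging only by its exit status.
count_results() {
    attempts=$1   # number of attempts
    shift         # the remaining arguments are the command to run
    ok=0
    fail=0
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            ok=$((ok + 1))
        else
            fail=$((fail + 1))
        fi
        i=$((i + 1))
    done
    echo "ok=$ok fail=$fail"
}
```

For example, `count_results 20 nslookup artifactory.corp.redacted.com` would summarize twenty lookups in one line.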

Here's resolv.conf and scutil --dns on my host Mac:

$ cat /etc/resolv.conf
search can.redacted.com corp.redacted.com
nameserver a.b.c.d
nameserver a.e.f.d
nameserver 192.168.1.1

$ scutil --dns
DNS configuration

resolver #1
  search domain[0] : can.redacted.com
  search domain[1] : corp.redacted.com
  nameserver[0] : a.b.c.d
  nameserver[1] : a.e.f.d
  nameserver[2] : 192.168.1.1
  flags    : Request A records, Request AAAA records
Reachable, Directly Reachable Address
  order    : 1

resolver #2
  domain   : local
  options  : mdns
  timeout  : 5
  flags    : Request A records, Request AAAA records
Not Reachable
  order    : 300000

resolver #3
  domain   : 254.169.in-addr.arpa
  options  : mdns
  timeout  : 5
  flags    : Request A records, Request AAAA records
Not Reachable
  order    : 300200

resolver #4
  domain   : 8.e.f.ip6.arpa
  options  : mdns
  timeout  : 5
  flags    : Request A records, Request AAAA records
Not Reachable
  order    : 300400

resolver #5
  domain   : 9.e.f.ip6.arpa
  options  : mdns
  timeout  : 5
  flags    : Request A records, Request AAAA records
Not Reachable
  order    : 300600

resolver #6
  domain   : a.e.f.ip6.arpa
  options  : mdns
  timeout  : 5
  flags    : Request A records, Request AAAA records
Not Reachable
  order    : 300800

resolver #7
  domain   : b.e.f.ip6.arpa
  options  : mdns
  timeout  : 5
  flags    : Request A records, Request AAAA records
Not Reachable
  order    : 301000

DNS configuration (for scoped queries)

resolver #1
  search domain[0] : corp.redacted.com
  nameserver[0] : 192.168.1.1
  if_index : 4 (en0)
  flags    : Scoped, Request A records
Reachable, Directly Reachable Address

resolver #2
  search domain[0] : can.redacted.com
  search domain[1] : corp.redacted.com
  nameserver[0] : a.b.c.d
  nameserver[1] : a.e.f.d
  nameserver[2] : 192.168.1.1
  if_index : 12 (utun0)
  flags    : Scoped, Request A records, Request AAAA records
Reachable, Directly Reachable Address
  order    : 1

So what does the Docker container have?:

bash-4.3# cat /etc/resolv.conf
# Generated by dhcpcd from eth0.dhcp
# /etc/resolv.conf.head can replace this line
domain local
search can.redacted.com corp.redacted.com
nameserver 192.168.65.1
nameserver 192.168.65.10
nameserver 192.168.65.9
nameserver 192.168.65.8
nameserver 192.168.65.7
nameserver 192.168.65.6
nameserver 192.168.65.5
nameserver 192.168.65.4
nameserver 192.168.65.3
# /etc/resolv.conf.tail can replace this line

That's very surprising. I thought Docker was supposed to make this match the host's /etc/resolv.conf. I have no idea what these IP addresses point to, but I thought I'd pick one or two and do some nslookups. For any one I choose, the results seem to fluctuate between success and failure.

An extremely odd detail to add is that I can only see this problem with alpine containers. While in all cases the /etc/resolv.conf is the same, the actual problem is not reproducible in ubuntu or busybox containers. I've begun to wonder if the problem is caused by parallel querying of DNS servers as described in http://gliderlabs.viewdocs.io/docker-alpine/caveats/ but I can't get past the fact that /etc/resolv.conf just makes no sense to me. Shot in the dark: I wonder if the Docker daemon itself is running alpine and when running alpine containers it somehow gets into a weird mode that hits on the parallelism described in the caveats?

I noticed that I was able to create a custom bridge network and use that instead. When you do so, the /etc/resolv.conf that is generated points only at Docker's embedded DNS server, and things work beautifully.

$ docker network create -d bridge my-bridge-network
097f790a0417e48e25b91185f00a334532e8a8bc46ad54dfb99740fa4ac9050b
$ docker run -it --network=my-bridge-network image-name bash
bash-4.3# cat /etc/resolv.conf
search can.redacted.com corp.redacted.com
nameserver 127.0.0.11
options ndots:0

Unfortunately, I've found no way to specify that this network be used when doing a Docker build.

Something odd is going on with the default Docker bridge network. Please help Docker!

@ShannonHickey

ShannonHickey commented Nov 10, 2016

Here's a rather ugly work-around that I've discovered:

$ cd ~/Library/Containers/com.docker.docker/Data/database
$ git reset --hard
HEAD is now at c1579a8 Docker started 1478770646
$ vi com.docker.driver.amd64-linux/etc/docker/daemon.json

Add a dns entry like so:

{ "dns" : ["a.b.c.d", "e.f.g.h"] }

Think carefully about what you want to be in here. As I mentioned in my previous comment, alpine tries DNS servers in parallel. If you need to access something on your corporate network and you include, for example, your router's IP, you'll continue to see random failure when using an alpine image.

Now:

$ git add com.docker.driver.amd64-linux/etc/docker/daemon.json
$ git commit -m "Update DNS"
[master 3025df2] Update DNS
 1 file changed, 5 insertions(+), 1 deletion(-)

Docker will automatically restart, and you're good to go. The /etc/resolv.conf inside of your image will now point to the nameservers you've configured.

If this doesn't work, or you mess up, you can always change it back or do a reset on Docker.

To the Docker team: There's a bunch of open questions here:

  1. What is the meaning of the 10 IP addresses in /etc/resolv.conf when using the default network?
  2. Why doesn't Docker map in the host's /etc/resolv.conf as expected?
  3. What are these IP addresses pointing at and why does whatever they point at sometimes resolve names properly and sometimes not?
  4. Why does using a user-defined network work as mentioned in my previous comment (assuming you're doing a docker run and not a docker build)?

@djs55
Contributor

djs55 commented Nov 10, 2016

@ShannonHickey first of all thank you for the clear report. I apologise for the trouble this is causing you (and others), but be assured that we're working on it!

To hopefully answer your questions:

  1. each IP address exposed to the VM was mapped onto one of the host's upstream resolvers. The idea was to push more of the retry logic into the VM rather than implement it host-side. Unfortunately it turns out that resolvers often have low limits for the numbers of servers they can deal with; for example musl libc as used in alpine linux will only use the first 3 servers. In beta 30 we've moved this logic back to the host so there is now only one IP configured in the VM and we're hopefully more independent of the different Linux resolvers.
  2. unfortunately the /etc/resolv.conf file on the Mac isn't authoritative. The DNS configuration is now stored in the SC database which you can browse with commands like scutil --dns. In Docker for Mac we attempt to read the DNS servers from there, to make sure we get them all. This code is still being debugged -- in beta 30 or later, if the wrong settings are pulled out then I'd really like to look at a diagnostic upload to see how to improve the logic (the SC database is a flexible key-value store and it's possible that some of the key names have changed between releases for example)
  3. since the old internal IPs were mapped onto external IPs, if you had an external DNS server which could only resolve some domains (perhaps an internal corporate VPN one?) then some external queries would be sent to it by accident. From beta 30 we have a mechanism to associate upstream DNS servers with particular domains, so for example requests for *.corp.example.com could be sent to the special DNS server. The key to this working well is the SC database, so feedback and bug reports are greatly appreciated!
  4. I think it worked with the internal docker DNS server because it uses an entirely Go-based DNS resolver which doesn't have the alpine 3 server limit.
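
To see why the long resolv.conf from the earlier comments behaved badly on alpine, here is a sketch of the 3-server limit described in point 1. It simply lists the first three nameserver entries, which (going by the explanation above, not by reading musl's source) are the only ones an alpine container would consult. usable_nameservers is a hypothetical name.

```shell
# Hypothetical helper: print only the first three nameserver entries of
# resolv.conf-style text read from stdin, mirroring the 3-server limit
# attributed to musl libc in the explanation above.
usable_nameservers() {
    awk '$1 == "nameserver" && n < 3 { print $2; n++ }'
}
```

Inside a container this could be run as `usable_nameservers < /etc/resolv.conf`. Any upstream resolver mapped to the fourth or later entry would never be asked.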

In beta 30 (released ~3 hours ago) the DNS implementation has been revamped. It's probably not perfect yet, so bug reports and diagnostic uploads would be appreciated. The changes are:

  • expose 1 IP to the VM rather than 10 (no more 3 server limit)
  • attempt to use SupplementalMatchDomains from the SC database to route queries for internal domains to internal servers <-- this is probably the bit that needs the most refinement, based on bug reports
  • reduce the number of sockets used on the host by multiplexing over one socket per upstream server (the previous implementation could be a bit socket-heavy)
  • caching to reduce the load on the upstream server

Let me know how beta 30 behaves in your environment.

FYI For experimentation purposes the DNS configuration is stored in the database key:

$ cd ~/Library/Containers/com.docker.docker/Data/database/
$ git reset --hard
HEAD is now at b532f3a last-start-time changed at 1478774906
$ cat com.docker.driver.amd64-linux/slirp/dns
# { Addresses: 8.8.8.8, 8.8.4.4; Order: 200000; Zones:  }
nameserver 8.8.8.8
order 200000
timeout 2000
nameserver 8.8.4.4
order 200000
timeout 2000
# { Addresses: 10.10.0.1; Order: 100000; Zones: example.com }
nameserver 10.10.0.1
zone example.com
order 100000
timeout 2000

Although this file is automatically updated by the UI when the SC database changes, it's possible to edit it then

$ git add com.docker.driver.amd64-linux/slirp/dns
$ git commit -m 'change DNS'

and the changes should take effect immediately.
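
As a reading aid for the slirp/dns format above, here is a sketch of how per-zone routing could work: servers whose zone matches the queried name win, and servers without a zone act as the fallback. This illustrates the idea only. The real selection logic in Docker for Mac (including how order and timeout are used) is not shown in this thread, and servers_for_host is a hypothetical helper.

```shell
# Hypothetical helper: given a hostname and slirp/dns-style text on stdin,
# print the nameservers a zone-aware forwarder might pick for that name.
servers_for_host() {
    awk -v host="$1" '
        # Record the server collected so far, then reset for the next one.
        function emit_record() {
            if (ns == "") return
            if (!has_zone)  fallback = fallback ns "\n"
            else if (hit)   matched  = matched  ns "\n"
            ns = ""
        }
        # True when host equals the zone or is a subdomain of it.
        function in_zone(zone) {
            if (host == zone) return 1
            if (length(host) <= length(zone)) return 0
            return (substr(host, length(host) - length(zone)) == ("." zone))
        }
        $1 == "nameserver" { emit_record(); ns = $2; has_zone = 0; hit = 0 }
        $1 == "zone"       { has_zone = 1; if (in_zone($2)) hit = 1 }
        END {
            emit_record()
            # Prefer zone-specific servers; otherwise use the zone-less ones.
            printf "%s", (matched != "" ? matched : fallback)
        }
    '
}
```

With the file shown above, `servers_for_host registry.example.com < com.docker.driver.amd64-linux/slirp/dns` would select 10.10.0.1, while a public name would fall through to 8.8.8.8 and 8.8.4.4.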

Thanks again for your reports and all your patience!

@dylanvee

@djs55 Thanks very much for this detailed response. I'm happy to confirm that beta 30 fixes the issues I've been having with a VPN and private registry.

@ShannonHickey

@djs55 thank you so much for taking the time to read and respond to my comment. I'm very pleased to share that beta 30 also fixes my issue. Fantastic!

I'm pretty sure that my problem had something to do with item 3 in the discussion above, except in reverse. In particular, sometimes queries for internal servers were sent to my router for DNS (third entry in the host's /etc/resolv.conf) rather than the first. I figured that might have to do with the Docker VM implementation possibly using alpine and alpine doing parallel DNS. I could be way off though. As long as it keeps working, and is now deterministic (unlike the alpine parallel behavior) then we're golden!

There is still something that puzzles me though. Perhaps you can explain.

If I bring up a container using the default network, I get a single entry of 192.168.65.1, whereas if I create my own bridge network, and bring up the container with it, I get a single entry of 127.0.0.11. My understanding is that both of these point at a Docker embedded DNS server, so why the difference?

$ docker run -it myimage bash
bash-4.3# cat /etc/resolv.conf
# Generated by dhcpcd from eth0.dhcp
# /etc/resolv.conf.head can replace this line
domain can.redacted.com
search can.redacted.com corp.redacted.com
nameserver 192.168.65.1
# /etc/resolv.conf.tail can replace this line
bash-4.3# exit
exit

$ docker network create -d bridge mynetwork
760d0341ab661109d579281175fd166984212cf4b938b72ea2a51cd144be91dd
$ docker run -it --net=mynetwork myimage bash
bash-4.3# cat /etc/resolv.conf
search can.redacted.com corp.redacted.com
nameserver 127.0.0.11
options ndots:0
bash-4.3# exit
exit

Thanks again!

@aminvielle

@djs55 Thanks! Works for me too with beta30.

@djs55
Contributor

djs55 commented Nov 11, 2016

@ShannonHickey @aminvielle I'm glad that beta 30 is working for you so far! FYI I'm currently fixing a bug with the caching logic -- unfortunately the responses from the cache have a field incorrectly set in the header. For some reason most software is oblivious to this, but I've seen it cause (rarely) some resolution failures. I'm working on a fix for this at the moment.

Regarding the remaining question about 192.168.65.1 versus 127.0.0.11 there are actually 2 Docker embedded DNS servers in Docker for Desktop:

  • 127.0.0.11 allows containers to look up other containers on the same network by name, so if you have one container called frontend and one called backend they can use those names rather than knowing the internal IPs. I believe this DNS server is not run on the default network for backwards compatibility (IIRC) but it is on new networks you create. This DNS server runs inside the Linux VM. Queries which can't be resolved locally are sent to 192.168.65.1.
  • 192.168.65.1 is specific to Docker for Desktop (Mac/Windows) where it tries to do the right thing for VPNs, and also reads names from /etc/hosts on the Mac (and from a similar place on Windows). This DNS server runs on the Mac/Windows host, outside the VM.
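
A small sketch that labels the nameserver entries in a container's /etc/resolv.conf according to the two servers described above. describe_resolvers is a hypothetical helper; the two addresses are the ones quoted in this thread and may differ in other Docker versions.

```shell
# Hypothetical helper: read resolv.conf-style text from stdin and say
# which of the two Docker DNS servers each nameserver entry refers to.
describe_resolvers() {
    awk '$1 == "nameserver" {
        if ($2 == "127.0.0.11")        where = "embedded DNS inside the Linux VM"
        else if ($2 == "192.168.65.1") where = "Docker for Mac DNS on the host"
        else                           where = "other resolver"
        print $2 ": " where
    }'
}
```

Running `describe_resolvers < /etc/resolv.conf` in a container on a user-defined network should report the embedded server; on the default network it should report the host-side one.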

I hope that makes things a little clearer!

@adamwg
Copy link
Author

adamwg commented Nov 15, 2016

Working for me in beta30.

@adamwg adamwg closed this as completed Nov 15, 2016
@kaskavalci

It seems the proxy can no longer be resolved after this update. I worked around this by adding the IP address of the proxy server.

Step 1/17 : FROM golang:1.7.3
Get https://registry-1.docker.io/v2/: http: error connecting to proxy http://proxy.company.io:8080: dial tcp: lookup proxy.company.io on 192.168.65.1:53: read udp 192.168.65.2:38806-

@djs55
Contributor

djs55 commented Dec 14, 2016

@kaskavalci could you open a fresh issue with a fresh diagnostics upload? Thanks!

@kaskavalci

Hi @djs55 here it is: #1042

@bsushant-athena

bsushant-athena commented May 7, 2017

I'm having a similar issue. I'm trying to reach our private npm Artifactory but always get a timeout. The cause seems to be that the default Docker subnets conflict with the subnet we host Artifactory on.

I'm using Docker for Mac:
Docker version 17.03.1-ce, build c6d412e

Docker info:

docker info
Containers: 20
Running: 5
Paused: 0
Stopped: 15
Images: 3744
Server Version: 17.03.1-ce
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 4ab9917febca54791c5f071a9d1f404867857fcc
runc version: 54296cf40ad8143b62dbcaa1d90e520a2136ddfe
init version: N/A (expected: 949e6facb77383876aeff8a6944dde66b3089574)
Security Options:
seccomp
Profile: default
Kernel Version: 4.9.13-moby
Operating System: Alpine Linux v3.5
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 1.952 GiB
Name: moby
ID: LUZ2:DLS2:SK4H:E6UH:GN5B:MG4Y:JWKU:LHQV:FIVA:DVA7:L6IY:UIXO
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
File Descriptors: 78
Goroutines: 96
System Time: 2017-05-07T12:54:46.325682196Z
EventsListeners: 2
No Proxy: *.local, 169.254/16
Registry: https://index.docker.io/v1/
Experimental: true
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false

Note that I'm using docker-compose to run and I am on VPN using Cisco AnyConnect.

My Dockerfile:

FROM node
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
COPY npmrc /root/.npmrc
COPY ./login/ /usr/src/app/
RUN npm install
EXPOSE 3000
CMD [ "npm", "start" ]

I'm creating a new network for the node service as follows:

docker network create --driver bridge --subnet 192.168.48.1/24 node

Here are the scutil --dns output and resolv.conf from my Mac, respectively:

DNS configuration

resolver #1
search domain[0] : corp.abc.com
nameserver[0] : 10.8.20.11
nameserver[1] : 10.6.66.12
flags : Request A records, Request AAAA records
reach : Reachable
order : 1

resolver #2
domain : local
options : mdns
timeout : 5
flags : Request A records, Request AAAA records
reach : Not Reachable
order : 300000

resolver #3
domain : 254.169.in-addr.arpa
options : mdns
timeout : 5
flags : Request A records, Request AAAA records
reach : Not Reachable
order : 300200

resolver #4
domain : 8.e.f.ip6.arpa
options : mdns
timeout : 5
flags : Request A records, Request AAAA records
reach : Not Reachable
order : 300400

resolver #5
domain : 9.e.f.ip6.arpa
options : mdns
timeout : 5
flags : Request A records, Request AAAA records
reach : Not Reachable
order : 300600

resolver #6
domain : a.e.f.ip6.arpa
options : mdns
timeout : 5
flags : Request A records, Request AAAA records
reach : Not Reachable
order : 300800

resolver #7
domain : b.e.f.ip6.arpa
options : mdns
timeout : 5
flags : Request A records, Request AAAA records
reach : Not Reachable
order : 301000

DNS configuration (for scoped queries)

resolver #1
nameserver[0] : 192.168.43.1
if_index : 4 (en0)
flags : Scoped, Request A records
reach : Reachable, Directly Reachable Address

resolver #2
search domain[0] : corp.abc.com
nameserver[0] : 10.8.20.11
nameserver[1] : 10.6.66.12
if_index : 11 (utun1)
flags : Scoped, Request A records, Request AAAA records
reach : Reachable
order : 1

resolv.conf

search corp.abc.com
nameserver 10.8.20.11
nameserver 10.6.66.12

And the resolv.conf inside my Docker container:

nameserver 127.0.0.11
options ndots:0

I'm running a simple curl command to connect to Artifactory: docker exec -it dev_node curl -I -vvvv artifactory.abc.com:8081. I tried adding the Artifactory address (172.18.122.156) to the container's /etc/hosts file, but still no luck, so I still can't connect to Artifactory from within the Docker container.
Please suggest :)

Solved:
Extant networks with a conflicting subnet, even when not attached to the specific container, will cause the issue.

@elclanrs

elclanrs commented Jan 23, 2018

@bsushant-athena, I have almost exactly the same setup as you and I'm experiencing the same issue in the latest Docker. We run Artifactory and connect to the VPN with Cisco AnyConnect. Can you expand a bit on what exactly you did to fix this?

@bsushant-athena

I had my own user-defined network which was causing this issue, so I deleted it.
Testing locally has shown that having a conflicting subnet in your Docker networks, even if that's not the network your container is attached to, will cause the issue.
You need to remove any network with a bad subnet.
@elclanrs
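
To check for this kind of conflict, one option is to test whether the address you are trying to reach falls inside the subnet of any Docker network. Below is a sketch in POSIX shell. ip_to_int and ip_in_cidr are hypothetical helpers, and 172.18.0.0/16 in the usage note is only an assumed example of a subnet a Docker network might hold (docker network inspect shows the real ones).

```shell
# Hypothetical helper: convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Hypothetical helper: succeed (exit 0) if address $1 lies inside CIDR $2.
ip_in_cidr() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")     # part before the slash
    bits=${2#*/}                   # prefix length after the slash
    mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}
```

For instance, `ip_in_cidr 172.18.122.156 172.18.0.0/16 && echo conflict` would flag a network that shadows the Artifactory address mentioned above, whereas the same check against 192.168.48.0/24 would not.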

@docker-robott
Collaborator

Closed issues are locked after 30 days of inactivity.
This helps our team focus on active issues.

If you have found a problem that seems similar to this, please open a new issue.

Send feedback to Docker Community Slack channels #docker-for-mac or #docker-for-windows.
/lifecycle locked

@docker docker locked and limited conversation to collaborators Jun 19, 2020