Kubectl fails to resolve names except through DNS #48

Closed
jaraco opened this issue Aug 9, 2017 · 16 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/cli Categorizes an issue or PR as relevant to SIG CLI.

Comments

@jaraco

jaraco commented Aug 9, 2017

Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see http://kubernetes.io/docs/troubleshooting/.): Not exactly

What keywords did you search in Kubernetes issues before filing this one? (If you have found any duplicates, you should instead reply there.): kubectl "unable to connect to the server" "read udp"


Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT

Kubernetes version (use kubectl version):

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.3", GitCommit:"2c2fe6e8278a5db2d15a013987b53968c743f2a1", GitTreeState:"clean", BuildDate:"2017-08-03T15:13:53Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"darwin/amd64"}

Environment:

  • Cloud provider or hardware configuration: unknown
  • OS (e.g. from /etc/os-release): macOS 10.12.6
  • Kernel (e.g. uname -a): n/a
  • Install tools: Homebrew
  • Others:

What happened:

$ time kubectl version
Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.3", GitCommit:"2c2fe6e8278a5db2d15a013987b53968c743f2a1", GitTreeState:"clean", BuildDate:"2017-08-03T15:13:53Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"darwin/amd64"}
Unable to connect to the server: dial tcp: lookup kub1.mycorp.local on 192.168.14.1:53: read udp 192.168.14.131:49181->192.168.14.1:53: i/o timeout
kubectl version  0.11s user 0.02s system 0% cpu 16.153 total
$ ping kub1
PING kub1.mycorp.local (10.10.9.161): 56 data bytes
64 bytes from 10.10.9.161: icmp_seq=0 ttl=61 time=387.593 ms
$ python -c "import socket; print(socket.gethostbyname('kub1'))"
10.10.9.161

kubectl takes 15+ seconds to fail to connect to the cluster master, and it fails because it can't resolve the name.

192.168.14.1 is the IP address of my local WiFi router. It doesn't (and shouldn't) know anything about kub1. As you can see, ping and gethostbyname both resolve the name through the Cisco VPN client installed and connected on the host.

What you expected to happen:

kubectl should connect to kub1 and kub1.mycorp.local like any other application on my system. It shouldn't be making UDP calls to the nameserver directly but should use the host's name resolution facilities.

Additionally, the command probably shouldn't be attempting to connect for a version command. Preferable would be for the command to return the version immediately... and for this issue only to appear if a cluster-relevant command were issued.

How to reproduce it (as minimally and precisely as possible): See above.

Anything else we need to know:

@dims
Member

dims commented Aug 9, 2017

Can you please paste the output of kubectl --v=10 version?

@jaraco
Author

jaraco commented Aug 9, 2017

$ kubectl --v=10 version
I0809 15:50:16.068176   17799 loader.go:357] Config loaded from file /Users/jaraco/.kube/config
I0809 15:50:16.071646   17799 round_trippers.go:386] curl -k -v -XGET  -H "Accept: application/json, */*" -H "User-Agent: kubectl/v1.7.3 (darwin/amd64) kubernetes/2c2fe6e" https://kub1.mycorp.local:6443/version
I0809 15:50:32.150996   17799 round_trippers.go:405] GET https://kub1.mycorp.local:6443/version  in 16079 milliseconds
I0809 15:50:32.151042   17799 round_trippers.go:411] Response Headers:
Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.3", GitCommit:"2c2fe6e8278a5db2d15a013987b53968c743f2a1", GitTreeState:"clean", BuildDate:"2017-08-03T15:13:53Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"darwin/amd64"}
I0809 15:50:32.151227   17799 helpers.go:225] Connection error: Get https://kub1.mycorp.local:6443/version: dial tcp: lookup kub1.mycorp.local on 192.168.14.1:53: read udp 192.168.14.131:65522->192.168.14.1:53: i/o timeout
F0809 15:50:32.151283   17799 helpers.go:120] Unable to connect to the server: dial tcp: lookup kub1.mycorp.local on 192.168.14.1:53: read udp 192.168.14.131:65522->192.168.14.1:53: i/o timeout

@dims
Member

dims commented Aug 9, 2017

@jaraco If you look at your /Users/jaraco/.kube/config file, you will see the reference to kub1.mycorp.local. Change that to kub1 and you will be all set, I think.

@pwittrock
Member

@jaraco The version command prints the cluster version as well, that is why it connects to the server.

@jaraco
Author

jaraco commented Aug 9, 2017

@dims, Sorry to confuse matters with two names, but I find kubectl can't resolve either name, while other tools will resolve both. Here's the output after making the suggested change.

$ sed -i -e 's/kub1.mycorp.local/kub1/g' ~/.kube/config
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.3", GitCommit:"2c2fe6e8278a5db2d15a013987b53968c743f2a1", GitTreeState:"clean", BuildDate:"2017-08-03T15:13:53Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"darwin/amd64"}
Unable to connect to the server: dial tcp: lookup kub1 on [2601:14d:8701:59f8:f299:bfff:fe02:cee3]:53: no such host
$ ping kub1
PING kub1.mycorp.local (10.10.9.161): 56 data bytes
64 bytes from 10.10.9.161: icmp_seq=0 ttl=61 time=306.378 ms
64 bytes from 10.10.9.161: icmp_seq=1 ttl=61 time=393.799 ms
^C
--- kub1.mycorp.local ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 306.378/350.088/393.799/43.711 ms
$ python -c "import socket; print(socket.gethostbyname('kub1.mycorp.local'))"
10.10.9.161

@jaraco
Author

jaraco commented Aug 9, 2017

@pwittrock: I see that now. I've updated the OP to strike that aspect.

@pwittrock pwittrock added this to the v1.7 milestone Aug 9, 2017
@dims
Member

dims commented Aug 9, 2017

@jaraco AFAIK, ping and socket.gethostbyname are both IPv4-only. Try ping6 and socket.getaddrinfo("example.org", 80, 0, 0, socket.IPPROTO_TCP) to check whether you see the problem with IPv6.

Also, can you please run kubectl --v=10 version with the updated ~/.kube/config?

@jaraco
Author

jaraco commented Aug 10, 2017

$ ping6 kub1 
ping6: getaddrinfo -- nodename nor servname provided, or not known
$ python -c "import socket; print(socket.getaddrinfo('example.org', 80, 0, 0, socket.IPPROTO_TCP))"
[(<AddressFamily.AF_INET6: 30>, <SocketKind.SOCK_STREAM: 1>, 6, '', ('2606:2800:220:1:248:1893:25c8:1946', 80, 0, 0)), (<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_STREAM: 1>, 6, '', ('93.184.216.34', 80))]
$ python -c "import socket; print(socket.getaddrinfo('kub1', 80, 0, 0, socket.IPPROTO_TCP))"
[(<AddressFamily.AF_INET: 2>, <SocketKind.SOCK_STREAM: 1>, 6, '', ('10.10.9.161', 80))]
$ kubectl --v=10 version
I0810 08:21:39.737352   63394 loader.go:357] Config loaded from file /Users/jaraco/.kube/config
I0810 08:21:39.740605   63394 round_trippers.go:386] curl -k -v -XGET  -H "Accept: application/json, */*" -H "User-Agent: kubectl/v1.7.3 (darwin/amd64) kubernetes/2c2fe6e" https://kub1:6443/version
I0810 08:21:39.784315   63394 round_trippers.go:405] GET https://kub1:6443/version  in 43 milliseconds
I0810 08:21:39.784358   63394 round_trippers.go:411] Response Headers:
Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.3", GitCommit:"2c2fe6e8278a5db2d15a013987b53968c743f2a1", GitTreeState:"clean", BuildDate:"2017-08-03T15:13:53Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"darwin/amd64"}
I0810 08:21:39.784515   63394 helpers.go:225] Connection error: Get https://kub1:6443/version: dial tcp: lookup kub1 on [2601:14d:8701:59f8:f299:bfff:fe02:cee3]:53: no such host
F0810 08:21:39.784555   63394 helpers.go:120] Unable to connect to the server: dial tcp: lookup kub1 on [2601:14d:8701:59f8:f299:bfff:fe02:cee3]:53: no such host
$ cat /etc/resolv.conf | grep -v "^#"
search jaraco.com
nameserver 2601:14d:8701:59f8:f299:bfff:fe02:cee3
nameserver 192.168.14.1

I think IPv6 is a red herring. As you can see, the IPv6 address that's appearing in the output is just the IPv6 address for the local WiFi router (192.168.14.1 and 2601:14d:8701:59f8:f299:bfff:fe02:cee3 are the same host). It's true I have IPv6 enabled in my environment, but the Kubernetes environment and the VPN I use to connect to it are IPv4 only, so I expect kubectl to resolve the name kub1 or kub1.mycorp.local to an IPv4 address.

It's interesting that I only see the delay resolving the name when .mycorp.local is present, suggesting to me that kubectl/Go/macOS is attempting to perform a multicast resolution of the name, and that's where the 15 second delay occurs. When the name kub1 is used, the resolution quickly fails.

But in any case, I still believe kubectl is somehow managing to bypass the hook in the system that allows names to be resolved by the VPN client before attempting a DNS lookup.
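
One way to read this, assuming the binary uses Go's built-in resolver (which on Unix systems reads /etc/resolv.conf, per the net package documentation): kubectl only sees the nameservers listed in that file, and both entries here are the WiFi router. The sketch below simply extracts those entries from the contents posted above. The nameservers helper is hypothetical and is not Go's actual parser.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// nameservers returns the addresses found on "nameserver" lines of
// resolv.conf-style text. Hypothetical helper for illustration only.
func nameservers(conf string) []string {
	var out []string
	sc := bufio.NewScanner(strings.NewReader(conf))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		// Comment lines start with "#", so they never match here.
		if len(fields) >= 2 && fields[0] == "nameserver" {
			out = append(out, fields[1])
		}
	}
	return out
}

func main() {
	// Contents as posted in the comment above.
	conf := "search jaraco.com\n" +
		"nameserver 2601:14d:8701:59f8:f299:bfff:fe02:cee3\n" +
		"nameserver 192.168.14.1\n"
	for _, ns := range nameservers(conf) {
		fmt.Println(ns)
	}
}
```

Neither address belongs to the VPN, which fits the observation that the VPN's resolver is never consulted.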

@apelisse apelisse added the bug label Aug 10, 2017
@jaraco
Author

jaraco commented Sep 18, 2017

I'm able to work around the issue by manually maintaining a name mapping in /etc/hosts:

10.10.9.161 kub1

But surely this isn't sustainable. Ideally, there'd be a fix for the issue so we don't have to maintain this mapping on each host.
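
This workaround is consistent with Go's built-in resolver consulting /etc/hosts before it sends any DNS query. A minimal sketch of that kind of lookup, assuming the simple "address name..." hosts format; lookupHosts is a hypothetical helper, not the code in Go's net package.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// lookupHosts returns the addresses mapped to name in hosts-file-style text.
// Hypothetical helper for illustration only.
func lookupHosts(hosts, name string) []string {
	var out []string
	sc := bufio.NewScanner(strings.NewReader(hosts))
	for sc.Scan() {
		line := sc.Text()
		// Drop trailing comments.
		if i := strings.IndexByte(line, '#'); i >= 0 {
			line = line[:i]
		}
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue
		}
		// The first field is the address; the rest are names for it.
		for _, alias := range fields[1:] {
			if strings.EqualFold(alias, name) {
				out = append(out, fields[0])
				break
			}
		}
	}
	return out
}

func main() {
	hosts := "10.10.9.161 kub1\n" // the mapping from the comment above
	fmt.Println(lookupHosts(hosts, "kub1"))
	fmt.Println(lookupHosts(hosts, "kub1.mycorp.local"))
}
```

Note that only names listed on the line match, so with this mapping the fully qualified kub1.mycorp.local would still need its own entry.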

@k8s-github-robot k8s-github-robot removed this from the v1.7 milestone Oct 5, 2017
@marun marun added this to the v1.7 milestone Oct 5, 2017
@marun marun added kind/bug Categorizes issue or PR as related to a bug. sig/cli Categorizes an issue or PR as relevant to SIG CLI. labels Oct 5, 2017
@marun marun added milestone/removed priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. status/approved-for-milestone and removed milestone/incomplete-labels labels Oct 5, 2017
@marun marun modified the milestone: v1.7 Oct 5, 2017
@marun marun removed the bug label Oct 5, 2017
@kubernetes kubernetes deleted a comment from k8s-github-robot Oct 5, 2017
@marun
Contributor

marun commented Oct 5, 2017

[MILESTONENOTIFIER] Milestone Issue Current

@jaraco

Issue Labels
  • sig/cli: Issue will be escalated to these SIGs if needed.
  • priority/important-soon: Escalate to the issue owners and SIG owner; move out of milestone after several unsuccessful escalation attempts.
  • kind/bug: Fixes a bug discovered during the current release.
Help

@jkemp101

This appears to be a normal issue with Go apps. The resolution process is described here: https://golang.org/pkg/net/. I think we need to get kubectl to use CGO, but I can't figure out how to force that yet. Just a random guess, but it might be because it's cross-compiled (golang/go@c9164a5)?

@jkemp101

I changed CGO_ENABLED=0 to 1 here https://github.com/kubernetes/kubernetes/blob/master/hack/lib/golang.sh#L526 and then compiled kubectl locally on my mac by running make kubectl. The built binary used the regular OS name resolution and would resolve my cluster name using the DNS server specified by my VPN client.

Obviously this is not a fix. We need someone who knows the Go build details to see how to work around this issue. If CGO is disabled because of cross-compiling, I think it works if you have the darwin binaries available during the build, but I'm not really sure. If CGO was disabled for other reasons, then it might be a bigger issue to fix.

To see what name resolution is being used do the following:

  1. Set export GODEBUG=netdns=2
  2. Run kubectl version
    1. If it outputs go package net: using cgo DNS resolver then it will use the "regular" C resolution process.
    2. If it outputs go package net: built with netgo build tag; using Go's DNS resolver then it's using Go's resolver and will ignore the VPN DNS settings.
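
The same distinction can be observed from inside a Go program. As a sketch, net.Resolver has a PreferGo field that asks for Go's built-in DNS client, while the zero value leaves the choice to the platform and build settings. The newResolver helper and the localhost default are assumptions for illustration; in a binary built with CGO disabled, both calls may well take the same path.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"os"
	"time"
)

// newResolver returns a resolver; preferGo asks for Go's built-in DNS client.
// Hypothetical helper for illustration only.
func newResolver(preferGo bool) *net.Resolver {
	return &net.Resolver{PreferGo: preferGo}
}

func main() {
	host := "localhost" // placeholder; pass the cluster host name as the first argument
	if len(os.Args) > 1 {
		host = os.Args[1]
	}

	for _, preferGo := range []bool{false, true} {
		// Bound each lookup so a silent nameserver cannot stall the run.
		ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		addrs, err := newResolver(preferGo).LookupHost(ctx, host)
		cancel()
		fmt.Printf("PreferGo=%v addrs=%v err=%v\n", preferGo, addrs, err)
	}
}
```

If the two lines disagree for a VPN-only name, that points at the resolver choice rather than at the network.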

@knutster

Still an issue with 1.8.x. Seriously, the fix is known, stop cross-compiling this for macOS.

@pwittrock
Member

@knutster would you be interested in contributing a fix?

@pwittrock
Member

Closed in favor of kubernetes/release#469.

We will need to continue to cross-compile, since the build process builds the binaries for all OS distributions. However, we can try to use the technique described here to make cross-compilation work with cgo.

@Gauravjaitly

Hey guys,
I created an EKS cluster, a separate VPC for it, and a worker node (both via CloudFormation),

but I'm getting the following error even after adding the new cluster to my kubeconfig:

kubectl version --short

Client Version: v1.19.2
Unable to connect to the server: dial tcp: lookup api.EKS-Test-Cluster: no such host

Can anyone suggest something?


9 participants