WSL2 DNS stops working #4285

Closed
jordansissel opened this issue Jul 8, 2019 · 297 comments

Comments

@jordansissel

Please fill out the below information:

  • Your Windows build number: (Type ver at a Windows Command Prompt)

Microsoft Windows [Version 10.0.18932.1000]

  • What you're doing and what's happening:
> bash
% host google.com
;; connection timed out; no servers could be reached
  • What's wrong / what should be happening instead: DNS resolution should work

/etc/resolv.conf:

% cat /etc/resolv.conf
# This file was automatically generated by WSL. To stop automatic generation of this file, add the following entry to /etc/wsl.conf:
# [network]
# generateResolvConf = false
nameserver 172.19.224.1

To my knowledge, I didn't change anything. This has happened a few times, and rebooting fixes it. Sometimes just doing wsl --shutdown is sufficient to fix it. It correlates with my workstation going to sleep and resuming later with DNS in WSL2 not working.

@jordansissel
Author

Whatever provides the internal WSL2 dns seems busted, but other dns servers are successful:

% cat /etc/resolv.conf
# This file was automatically generated by WSL. To stop automatic generation of this file, add the following entry to /etc/wsl.conf:
# [network]
# generateResolvConf = false
nameserver 172.19.224.1

% host google.com 172.19.224.1
;; connection timed out; no servers could be reached

% host -t A google.com 1.1.1.1
Using domain server:
Name: 1.1.1.1
Address: 1.1.1.1#53
Aliases:

google.com has address 216.58.194.174
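
The comparison above (the generated WSL nameserver versus a public one) could be scripted roughly as follows. This is only a sketch: it assumes bash and that the `host` utility is installed, and the function names `list_nameservers` and `probe_nameservers` are made up for illustration.

```shell
# Print every nameserver address listed in a resolv.conf-style file.
# Usage: list_nameservers [FILE]   (defaults to /etc/resolv.conf)
list_nameservers() {
    # Commented-out entries start with "#", so their first field never matches
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Query NAME against each server in turn and report which ones answer.
# Usage: probe_nameservers NAME SERVER...
probe_nameservers() {
    local name=$1 server
    shift
    for server in "$@"; do
        # -W 2 limits the wait to 2 seconds per query
        if host -W 2 "$name" "$server" >/dev/null 2>&1; then
            echo "$server: ok"
        else
            echo "$server: no answer"
        fi
    done
}
```

For example, `probe_nameservers google.com $(list_nameservers) 1.1.1.1` would check the generated nameserver and Cloudflare's resolver side by side.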

@lkuich

lkuich commented Jul 8, 2019

@jordansissel This started happening to me yesterday, but it seems my whole network is busted:

$ host -t A google.com 1.1.1.1
;; connection timed out; no servers could be reached
$ ping 8.8.8.8
connect: Network is unreachable

Running with AV and Firewall disabled.

EDIT: I disabled, restarted, and re-enabled the WSL and Virtual Machine Windows Features and it looks to work now.

@develleoper

I ran into this same issue. It is resolved for now after removing /etc/resolv.conf and resetting the entire DNS config via the resolvconf package, pointing to Cloudflare's 1.1.1.1.

@heamaral

Maybe it is related to this: #4275

@bmwynne

bmwynne commented Jul 11, 2019

On build 18932.1000, I ran into this issue on Ubuntu as well. Sometimes rebooting the Windows host and executing wsl --shutdown works. I will look into it further as it occurs and check back with solutions.

@astamos

astamos commented Jul 23, 2019

I am seeing this same DNS behavior on Build 18941. The local DNS server is not responding but I can route out to the internet and everything works fine once I set a manual resolv.conf. Using both the official Kali and Ubuntu distros in the store, converted to WSL 2. I have no other Hyper-V VMs.

Ifconfig:

bond0: flags=5122<BROADCAST,MASTER,MULTICAST> mtu 1500
        ether 0a:b1:11:d2:aa:66 txqueuelen 1000 (Ethernet)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 0 bytes 0 (0.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 172.18.201.80 netmask 255.255.0.0 broadcast 172.18.255.255
        inet6 fe80::215:5dff:fe9a:1fe3 prefixlen 64 scopeid 0x20<link>
        ether 00:15:5d:9a:1f:e3 txqueuelen 1000 (Ethernet)
        RX packets 11 bytes 1507 (1.4 KiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 16 bytes 1232 (1.2 KiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
        inet 127.0.0.1 netmask 255.0.0.0
        inet6 ::1 prefixlen 128 scopeid 0x10<host>
        loop txqueuelen 1000 (Local Loopback)
        RX packets 8 bytes 560 (560.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 8 bytes 560 (560.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

sit0: flags=128<NOARP> mtu 1480
        sit txqueuelen 1000 (IPv6-in-IPv4)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 0 bytes 0 (0.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

@mahsoommoosa42

Facing the same issue on 18956. Manually setting resolv.conf is a temporary solution. Setting generateResolvConf to false did not change anything. What's with that IP address (the one starting with 172), though? I have never seen it before.

@mahsoommoosa42

Facing the same issue on 18956. Git and ping do not work.

@non-static

My workaround:

  1. Create a file: /etc/wsl.conf.
  2. Put the following lines in the file:
[network]
generateResolvConf = false
  3. In a cmd window, run wsl --shutdown.
  4. Restart WSL2.
  5. Create a file: /etc/resolv.conf. If it exists, replace the existing one with this new file.
  6. Put the following line in the file:
nameserver 8.8.8.8
  7. Repeat steps 3 and 4. You will see git working fine now.
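
The file edits in this workaround could be scripted along these lines. This is a sketch under assumptions, not a tested fix: it assumes bash, the function names and the optional ROOT prefix are hypothetical (the prefix exists only so the result can be inspected in a scratch directory first), and `write_wsl_conf` overwrites any existing wsl.conf rather than merging into it.

```shell
# Both functions take an optional ROOT prefix (hypothetical, for dry runs).
# With no prefix they touch the real /etc and must run as root.

# Tell WSL to stop regenerating /etc/resolv.conf.
# Note: this overwrites any existing wsl.conf.
write_wsl_conf() {
    local root=${1:-}
    mkdir -p "$root/etc"
    printf '[network]\ngenerateResolvConf = false\n' > "$root/etc/wsl.conf"
}

# Replace resolv.conf with a regular file naming one DNS server.
# Usage: write_resolv_conf [ROOT] [SERVER]   (SERVER defaults to 8.8.8.8)
write_resolv_conf() {
    local root=${1:-} server=${2:-8.8.8.8}
    mkdir -p "$root/etc"
    rm -f "$root/etc/resolv.conf"   # drop a leftover symlink, if any
    printf 'nameserver %s\n' "$server" > "$root/etc/resolv.conf"
}
```

The order from the list still applies: run `write_wsl_conf`, run `wsl --shutdown` from a Windows cmd window and start WSL2 again, run `write_resolv_conf`, then shut down and restart once more.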

@mahsoommoosa42

mahsoommoosa42 commented Aug 17, 2019 via email

@nonbeing

@klein-hu : tried your workaround, followed your steps, but it doesn't seem to be working for me.

My /etc/resolv.conf file is getting wiped out every time I relaunch wsl2 :-/

@mahsoommoosa42

mahsoommoosa42 commented Sep 11, 2019 via email

@nonbeing

Yes, I did.

@mahsoommoosa42

mahsoommoosa42 commented Sep 11, 2019 via email

@onomatopellan

@nonbeing make sure resolv.conf is a file and not a link

ls -la /etc/resolv.conf
-rw-r--r-- 1 root root 185 Sep 11 16:32 /etc/resolv.conf
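
The same check can be done without reading `ls` output by eye. A minimal sketch, with `is_plain_file` being a made-up helper name:

```shell
# Succeeds only if FILE exists as a regular file and is not a symlink.
is_plain_file() {
    [ -f "$1" ] && [ ! -L "$1" ]
}
```

For example: `is_plain_file /etc/resolv.conf || echo "resolv.conf is a link or missing"`.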

@VictorGaiva

My workaround:

  1. Create a file: /etc/wsl.conf.
  2. Put the following lines in the file:
[network]
generateResolvConf = false
  3. In a cmd window, run wsl --shutdown.
  4. Restart WSL2.
  5. Create a file: /etc/resolv.conf. If it exists, replace the existing one with this new file.
  6. Put the following line in the file:
nameserver 8.8.8.8
  7. Repeat steps 3 and 4. You will see git working fine now.

wsl --shutdown doesn't seem to work for me
wslconfig.exe /terminate Ubuntu does

@h0tw1r3

h0tw1r3 commented Sep 19, 2019

Running Insider build 10.0.18980.1, kernel 4.19.67.
It seems as if the DNS resolver running on the 172 address is single-threaded? Requests that time out seem to block.

@Rhahkeem

Rhahkeem commented Sep 21, 2019

Just started seeing this today after an update as well

Microsoft Windows [Version 10.0.18980.1]

@coltenkrauter

I made a gist with @klein-hu's instructions. They worked like a charm for me.

Fix DNS resolution in WSL2

@h0tw1r3

h0tw1r3 commented Oct 21, 2019

For anyone seeing this: the "fixes" proposed are merely workarounds for the problem. In situations that depend on internal/private DNS servers (e.g. a corporate or home network), DNS will not resolve as expected.

I consider this a critical bug.

@radix

radix commented Nov 4, 2019

Starting the docker daemon (inside of WSL2) seems to trigger this for me, very reliably.

C:\Users\radix> wsl --shutdown
C:\Users\radix> wsl

radix@DESKTOP-HOBTFB6:/mnt/c/Users/radix$ cd

radix@DESKTOP-HOBTFB6:~$ host google.com
google.com has address 172.217.14.174
google.com has IPv6 address 2607:f8b0:4000:813::200e
google.com mail is handled by 40 alt3.aspmx.l.google.com.
google.com mail is handled by 20 alt1.aspmx.l.google.com.
google.com mail is handled by 10 aspmx.l.google.com.
google.com mail is handled by 50 alt4.aspmx.l.google.com.
google.com mail is handled by 30 alt2.aspmx.l.google.com.

radix@DESKTOP-HOBTFB6:~$ sudo service docker start
[sudo] password for radix:
 * Starting Docker: docker                                                                                       [ OK ] 

radix@DESKTOP-HOBTFB6:~$ host google.com
;; connection timed out; no servers could be reached

@radix

radix commented Nov 4, 2019

Also, this doesn't seem to be resolved for me by switching to a different nameserver. It seems that after starting docker, eventually all network I/O completely stops working in my WSL2 environment.

@radix

radix commented Nov 4, 2019

I have reason to believe that this behavior STARTED happening with the latest update of Windows Insider. I'm on build 19013.1.

@zot

zot commented Nov 6, 2019

I'm on 19018.1 and it sometimes takes 30 seconds to resolve a name with Google's name servers in /etc/resolv.conf.

I uninstalled Docker Desktop and that didn't change the behavior.

@pludov

pludov commented Nov 22, 2019

Same symptom here.

However, I figured out that:

  • the problem is triggered by a lookup of an unqualified hostname (dig anotherserver)
  • the lookup of the unqualified hostname takes 10s to fail, and makes the DNS unresponsive for 20s
  • failures seem to be queued! (meaning that if lots of resolutions fail at the same time, the service will take longer than 20s to recover)
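
To check whether the same timings show up on another machine, a lookup can be timed from the shell. A rough sketch, assuming bash; `time_command` is a hypothetical helper and it only has one-second resolution:

```shell
# Print how many whole seconds a command takes, regardless of its exit status.
# Usage: time_command COMMAND [ARGS...]
time_command() {
    local start end
    start=$(date +%s)
    "$@" >/dev/null 2>&1 || true   # a failed lookup is expected here
    end=$(date +%s)
    echo $((end - start))
}
```

Running `time_command dig anotherserver` followed immediately by `time_command dig google.com` would show whether a failed unqualified lookup also delays the next, otherwise healthy, query.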

@joewood

joewood commented Nov 25, 2019

The fix to disable WSL/restart/enable WSL/restart in the "Windows Features" settings fixed it for me. The problem does seem related to the docker daemon.

@jefferai

jefferai commented Dec 3, 2019

This stopped working for me after upgrading to 19033 (slow ring). Disable WSL/restart/enable WSL/restart did not fix it. Note that I don't have the Docker daemon running inside WSL, although I do have it running on the host and exposed to the WSL instance.

@neojp

neojp commented Dec 4, 2019

Same here. I did a Windows Update to build 1903. I have a pending Windows Update, will try that today.

Edit: It started working again after the update. My best guess would be that restarting the computer helped with this, but I can't really tell ¯\_(ツ)_/¯

@jefferai

jefferai commented Dec 4, 2019

It's actually even worse: it appears that the ability to tell wsl to stop updating resolv.conf doesn't work anymore, making it really hard to work around this.

Here's my /etc/wsl.conf:

[automount]
enabled     = true
crossDistro = true
root        = /mnt/

[network]
generateResolvConf = false

But every time my host network changes it's updating resolv.conf anyways.

@zot

zot commented Dec 5, 2019

I have 10933 (fast ring) and DNS seems to be working fine now.

Also, I just disabled/enabled WIFI and that did not overwrite /etc/resolv.conf (I added a comment and the file stayed the same).

@vbrozik

vbrozik commented May 25, 2022

@benhillis Why do you close an unresolved issue? This one (unlike the duplicate #8365) shows how long the history is, how many users were affected, what they have tried, how many voted to resolve this problem, etc.

@r2evans

r2evans commented May 25, 2022

Static /etc/resolv.conf does not fix it for me, the symptom persists. @benhillis how confident are you that (1) the dupe issue encompasses enough of the discourse here and the details presented? And (2) resolving the issue with DNS is going to resolve issues with networking routing?

@LUC18fknU7P

After 3 years of radio silence we finally have Microsoft hard at work trying to solve this:

"/dupe #8365"

Issue closed, problem solved! Now, all that needs to be done is to lock this thread so nobody can reply anymore.

@BtbN

BtbN commented May 27, 2022

With how a lot of people behave here, that seems not that unreasonable really.
Any productivity on trying to solve the actual issue is long over, and just bickering is left.

@Mithras

Mithras commented May 27, 2022

just bickering is left

You should self-reflect on why. Every second comment here is you saying "Works for me!".

@BtbN

BtbN commented May 27, 2022

It worked for me after a full resignation and reinstalling the whole OS. Wouldn't call that a good solution.
There is some issue, somewhere, with the Firewall getting too aggressive. Some hidden setting that some VPN clients (among other causes) set or something.

@dlaudams

There is some issue, somewhere, with the Firewall getting too aggressive. Some hidden setting that some VPN clients (among other causes) set or something

The higher level problem is that it is inconsistent to reproduce and difficult to diagnose.

If WSL2 itself could at least detect and report on this condition, it would help isolate the cause.

@Mithras

Mithras commented May 28, 2022

The higher level problem is that it is inconsistent to reproduce and difficult to diagnose.

If you are in MS corpnet, you can repro it by installing any Docker version after 3.5.2.

@LUC18fknU7P

Yes, this also happened to me while using Docker. No VPNs or that kind of stuff.

@carlos-hdzm

Same thing happened to me. I was building a container and it worked fine, but then my computer ran out of space, so WSL2 crashed. I quit Docker, used wsl --shutdown and compacted the virtual disk. After starting Docker and WSL2 again, I used docker system prune. When I retried building the container, it started failing on build, unable to install Node packages because of the network connection. I tried wsl --shutdown again, and it didn't work. I can't modify resolv.conf.

However, after a few tries, it works. I still don't know what the problem is.

@eg-fxia

eg-fxia commented Jul 11, 2022

What I experienced is a little different, on WSL2 Ubuntu 20.04. DNS works fine with the default, generated /etc/resolv.conf (nameserver 172.17.0.1), but it stops working once dockerd starts. I tried to disable the auto-generation using the method described earlier and explicitly set the nameserver to a DNS server, e.g. 8.8.8.8, but that did not work for me: DNS stopped working once dockerd started, and stayed broken even if I subsequently stopped dockerd.

Noticing that dockerd creates an interface on 172.17.0.1, which may interfere with WSL2 DNS forwarding, I changed the docker config to make dockerd create its interface in a different subnet. After that change, DNS works while dockerd is running, and there is no need to disable the default auto-generation of /etc/resolv.conf. DNS inside a docker container, e.g. a minikube pod, needs more configuration, but that seems to be a separate issue.

The docker config change is simple. Just the following in /etc/docker/daemon.json:

{
    "bip" : "10.10.0.1/16"
}
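
Whether a machine is in the situation described here (the generated nameserver sharing an address with the docker bridge) can be checked with a small helper. A sketch only, assuming bash and awk; `nameserver_in_use` is a made-up name:

```shell
# Succeeds if ADDRESS appears as a nameserver in a resolv.conf-style file.
# Usage: nameserver_in_use ADDRESS [FILE]   (FILE defaults to /etc/resolv.conf)
nameserver_in_use() {
    awk -v addr="$1" '$1 == "nameserver" && $2 == addr { found = 1 }
                      END { exit !found }' "${2:-/etc/resolv.conf}"
}
```

One way to use it, assuming the bridge is named docker0 as in a default install: `bridge=$(ip -4 -o addr show docker0 | awk '{ print $4 }' | cut -d/ -f1)` and then `nameserver_in_use "$bridge" && echo "docker0 shares the WSL nameserver address"`.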

@jikuja

jikuja commented Jul 13, 2022

The docker config change is simple.

Did you report this on the docker repo(s)?

@ps2goat

ps2goat commented Aug 11, 2022

I had bridge IPs that I think are recreated by the WSL service. After screwing them up, I deleted them with some PowerShell commands, and new ones were created when I brought WSL back online.

This fix worked for me within WSL. I haven't used docker extensively on this new machine to verify the docker dns, but I have updated the docker dns on other machines and it worked inside docker containers.

FYI, Windows 11 still has the extra adaptors, but you only see them in administrator tools from what I've read. I haven't had the need to dig in there, yet. Windows 10 had them in the standard "network adaptors" screen.

@petertirrell

In case it helps anyone, I have a workaround that I use to "fix" DNS after I connect to my VPN, as that's where I see the issue. In WSL I run

export IPADDR=$(powershell.exe -Command 'Get-DnsClientServerAddress -AddressFamily IPv4 | Where-Object {$_.InterfaceAlias -eq "Local Area Connection"} |Select-Object -ExpandProperty ServerAddresses' | tr -d '\r') && sudo -E bash -c 'printf "nameserver $IPADDR\nnameserver 1.1.1.1\n" > /etc/resolv.conf' && cat -A /etc/resolv.conf

which updates my /etc/resolv.conf file with the correct nameserver from my VPN. After executing this, I usually also have to run Restart-NetAdapter -Name "vEthernet (WSL)" -Confirm:$false in PowerShell to force the adapter to reset, and then my WSL connections are all good.

@jordansissel
Author

It's been a while since I've had this problem, but it happened again today. Typical symptoms -- dns timeouts. Targeting other DNS servers directly works fine (local router's dns, google dns, etc). The default WSL2 dns server does not respond.

From further up this thread, lots of reports about firewall issues, so I can report that DNS(1) fails, but then if I disable the Windows Firewall, DNS(1) works again.

(1) "DNS" aka the default wsl2 dns server which on my system is some WSL-internal 172.20.128.1 address -- the same address as the default route.

Using nslookup from the Windows side, I can use the WSL dns server (setting server 172.20.128.1). Inside WSL, however, it times out unless I disable the firewall.

Hope this helps one of us eventually solve this 🤷‍♂️

@jordansissel
Author

From a linked issue, possibly a solution for some of us will be released soon as noted in this comment: #8365 (comment)

@rcprior

rcprior commented Nov 1, 2022

The issue seems to be fixed: #8365 (comment)

@felipe-gustavo

I've tried to solve it for a whole week, thank you a lot @radix, I owe you my soul

@NanoWar

NanoWar commented Jun 27, 2023

Quick fix that worked for me:

sudo ip addr add dev docker0 10.10.0.1/16
sudo ip addr del dev docker0 172.17.0.1/16

@Mithras

Mithras commented Jun 30, 2023

Hot take:
Instead of running Windows with Linux VM in it, you can just run Linux with optional Windows VM in it if necessary. Solved all my problems.

@SandysPappy

SandysPappy commented Sep 20, 2023

I wasn't having this problem until the most recent Windows 11 update (as of this post). I've tried all of the fixes mentioned above.

For context, I'm on a VPN using Cisco AnyConnect. I've tried both the Windows 11 version and the Microsoft Store version. I need to access a server on the intranet, and I do not know the IP of the internal DNS server.

As a hotfix, I added a static hosts file entry with the IP of the server I needed to ssh into. Kind of a gross fix, but it works if you already know the IP address of the server you need.

vi /etc/hosts

and add

<ip address of server> <domain.of.the.server.com>

@rwayan

rwayan commented Jul 30, 2024

Facing the same issue on 18956. Manually setting resolv.conf is a temporary solution. Setting generateResolvConf to false did not change anything. What's with that IP address (the one starting with 172), though? I have never seen it before.

"After I removed the 172 address from the interface, DNS resolution returned to normal."

This issue was closed.