WSL2 freezes with high CPU usage on arm64 #9135

Open
1 of 2 tasks
TArvela opened this issue Nov 10, 2022 · 20 comments

Comments

@TArvela

TArvela commented Nov 10, 2022

Version

Microsoft Windows [Version 10.0.22621.819]

WSL Version

  • [x] WSL 2
  • [ ] WSL 1

Kernel Version

5.15.68.1

Distro Version

Ubuntu 22.04

Other Software

Programs used when freezes occur:

Node.js
Angular.js (ng)
postgresql

(WSL2 Terminals used in Visual Studio on Windows)

Repro Steps

Error occurs regularly, but randomly. Cannot give steps to reproduce ...

Expected Behavior

I expected the WSL terminals to be always responsive.

Actual Behavior

Sometimes the WSL terminals freeze for more than 10 minutes, and there is nothing I can do when this happens. I can't execute any commands or even close running applications. During this time, CPU usage sits between 35% and 99% (normally about 1% when idle).

The command wsl.exe --shutdown also doesn't work and just hangs.

Diagnostic Logs

https://drive.google.com/file/d/1lfqTItgadQkRguVXO0Bn-M7Z5b7v7gSf/view?usp=share_link

@TArvela
Author

TArvela commented Nov 11, 2022

May be related to this: #8824

The problem and symptoms are exactly the same.

@woutervanoorschot

This also happens to me often on my Surface Pro X, always during high CPU/IO tasks after working for a few hours (docker build, npm build, git commit, etc.). Today it happened again when I pressed Ctrl+C to stop 3 running Docker containers. Logs attached:

WslLogs-2022-11-11_12-54-21.zip

@TArvela
Author

TArvela commented Nov 26, 2022

image

In case it helps, I noticed it also happens when I'm not using WSL at all. I don't know anything about this project or how WSL works, but it could be that this problem is caused not by the virtual machine itself, but rather by the process that controls it.

@JesseCheng77

JesseCheng77 commented Jan 15, 2023

On my Matebook E Go (OS: Windows 11 22H2), one of the Node services would suddenly experience high CPU usage after running for about 15 minutes, and when I tried to kill the service, the whole of WSL froze up.
Executing wsl --shutdown does not work; the system must be restarted.

@dboreham

Adding my $0.02: this consistently happens on my Surface Pro X running the latest Insiders build.
Pretty annoying because apart from this it's a great travel Linux development machine.

@ketulm

ketulm commented Feb 6, 2023

This consistently (randomly) happens on Volterra (Windows 11 Pro 22H2, build 22621.1194). VmmemWSL process CPU usage is very high when it freezes on its own and wsl --shutdown also doesn't work and just hangs. I have to kill the "Windows Subsystem for Linux" process and restart WslService to get WSL working again.

>wsl --version
WSL version: 1.0.3.0
Kernel version: 5.15.79.1
WSLg version: 1.0.47
MSRDC version: 1.2.3575
Direct3D version: 1.606.4
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.22621.1194

@2hansen

2hansen commented Feb 22, 2023

Same for me on Surface Pro 9 (SQ3)
PS C:\Users\bhh> wsl --version
WSL version: 1.1.3.0
Kernel version: 5.15.90.1
WSLg version: 1.0.49
MSRDC version: 1.2.3770
Direct3D version: 1.608.2-61064218
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.22623.1325

@dboreham

Has anyone found a workaround for this? Still affecting my SPx, needing reboot every time it wakes from deep sleep.

@2hansen

2hansen commented May 14, 2023

> Has anyone found a workaround for this? Still affecting my SPx, needing reboot every time it wakes from deep sleep.

Not to my knowledge. I have stopped using WSL on my Surface for this reason. But #9454 seems to be about the same problem, and someone there is asking for logs.

@TArvela
Author

TArvela commented May 15, 2023

I don't know. I tried a lot of things recommended in other issue threads, but either they seem to break something else or they just didn't work...

I just installed Linux and now use it daily.

@dboreham

Since I'm leaving for a trip to Japan in a couple of weeks and want to take (only) my SPx with me, I embarked on some debugging of this syndrome today. Interestingly, although WSL2 seems "hung" in that any existing shell does not respond to keystrokes, and any attempt to open a new shell hangs after some initial output, it turns out that I am able to run some commands using wsl.exe from an NT prompt. Specifically, I can run ls and date.

What I noticed is that the Linux kernel's idea of the date is totally wrong. This reminded me of issue #10006 and the tantalizing thought that the hangs may be the result of something in either the WSL2 kernel or the Ubuntu userland getting wedged in a busy CPU loop when the clock is way off. I seem to recall this actually happened a few years ago in the futex implementation during a leap second event. Unlikely to be that exact same problem, but something along the same lines: a) the clock goes adrift, b) something busy-spins using CPU, and c) that thing blocks proper shell operation (but doesn't totally hose the kernel).
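For anyone who wants to check the "clock way off" observation on their own machine, here is a minimal Python sketch. It assumes it runs on the Windows side, that wsl.exe can still execute date in the default distro (as described above), and that comparing against the Windows clock is a fair reference. The function names are made up for illustration.

```python
import subprocess
import time


def parse_epoch(text: str) -> int:
    """Parse the output of `date +%s` (seconds since the Unix epoch)."""
    return int(text.strip())


def clock_drift_seconds(guest_epoch: float, host_epoch: float) -> float:
    """Return guest time minus host time; positive means the guest is ahead."""
    return guest_epoch - host_epoch


def read_guest_epoch(timeout: float = 10.0) -> int:
    """Ask the default WSL distro for its clock via wsl.exe (Windows only).

    The 10 second timeout is an arbitrary assumption so that a wedged
    WSL instance raises an error instead of hanging this script too.
    """
    result = subprocess.run(
        ["wsl.exe", "date", "+%s"],
        capture_output=True, text=True, timeout=timeout, check=True,
    )
    return parse_epoch(result.stdout)
```

Calling clock_drift_seconds(read_guest_epoch(), time.time()) on an affected machine should show how far apart the two clocks are. A large value would only confirm the drift, not that the drift causes the hang.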

@dboreham

Hmm...I was able to change the WSL2 time with wsl --user root date --set "15 MAY 2023 9:48 AM" but that didn't fix the "hanging" syndrome.

@dboreham

Some progress. On the theory that shells don't start because they're waiting on some process executed through .bashrc or the like, I used my new powers to execute commands via wsl.exe and dumped the process list. I saw a bunch of processes running under my uid, recently started:

david       3854    3845  0 09:48 pts/7    00:00:00 /bin/sh /usr/sbin/update-motd --show-only
david       3855    3854  0 09:48 pts/7    00:00:00 run-parts --lsbsysinit /etc/update-motd.d
david       3861    3855  0 09:48 pts/7    00:00:00 /bin/sh /etc/update-motd.d/50-landscape-sysinfo
david       3868    3861  0 09:48 pts/7    00:00:00 /usr/bin/python3 /usr/bin/landscape-sysinfo
david       3871    3868  0 09:48 pts/7    00:00:00 [who] <defunct>

So next I began kill -9'ing the processes, beginning with the last. Lo and behold, when I killed pid 3855 (wsl --user root kill -9 3855), my hung shell reanimated!
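The process hunt above can be sketched in Python. This is only an illustration: it assumes the listing came from ps -ef (the column layout matches), and the marker string "update-motd" and the helper names are assumptions taken from this one listing, not a general rule for what wedges a shell.

```python
from typing import NamedTuple

# The process listing quoted above, reproduced as sample input.
SAMPLE = """\
david       3854    3845  0 09:48 pts/7    00:00:00 /bin/sh /usr/sbin/update-motd --show-only
david       3855    3854  0 09:48 pts/7    00:00:00 run-parts --lsbsysinit /etc/update-motd.d
david       3861    3855  0 09:48 pts/7    00:00:00 /bin/sh /etc/update-motd.d/50-landscape-sysinfo
david       3868    3861  0 09:48 pts/7    00:00:00 /usr/bin/python3 /usr/bin/landscape-sysinfo
david       3871    3868  0 09:48 pts/7    00:00:00 [who] <defunct>
"""


class Proc(NamedTuple):
    pid: int
    ppid: int
    cmd: str


def parse_ps_ef(text: str) -> list[Proc]:
    """Parse `ps -ef` style rows: UID PID PPID C STIME TTY TIME CMD."""
    procs = []
    for line in text.splitlines():
        fields = line.split(None, 7)
        # Skip blank lines and the header row (its PID column is not numeric).
        if len(fields) < 8 or not fields[1].isdigit():
            continue
        procs.append(Proc(int(fields[1]), int(fields[2]), fields[7]))
    return procs


def motd_chain(procs: list[Proc], marker: str = "update-motd") -> list[Proc]:
    """Return processes whose command mentions `marker`, plus all of their
    descendants, ordered highest PID first (the order used above)."""
    selected = {p.pid for p in procs if marker in p.cmd}
    changed = True
    while changed:  # keep adding children until no new ones turn up
        changed = False
        for p in procs:
            if p.ppid in selected and p.pid not in selected:
                selected.add(p.pid)
                changed = True
    chain = [p for p in procs if p.pid in selected]
    return sorted(chain, key=lambda p: p.pid, reverse=True)
```

The resulting PIDs are candidates for wsl --user root kill -9 <pid>, one at a time, checking after each whether the shell has come back.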

@dboreham

After recovering shell access I found that WSL2 was not quite working 100%. e.g. Docker was unresponsive. Rather than trying to diagnose these secondary issues I wondered if I could now force WSL2 to restart (much better than having to reboot Windows). In this state (having fixed the clock and un-wedged a shell), wsl.exe --shutdown worked (it took tens of seconds to complete though), and I could then subsequently open a new WSL2 shell with everything working as normal.

Probably a good enough workaround for me for now since this avoids the need to reboot Windows every time I open up my tablet after a period of sleeping.

@dboreham

Continuing mortal kombat with this bug: I discovered that my "workaround" above was a case of luck. I've never been able to use wsl.exe --shutdown successfully subsequently.

However, I noticed this possible dup : #8529 and a comment therein that did work, at least the one time I've tried it so far.

This, in an admin PowerShell:

taskkill /f /im wslservice.exe

Killed WSL2 and it re-started upon opening a new shell, docker working.
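The two recovery steps mentioned in this thread (try wsl.exe --shutdown, and if that hangs, kill wslservice.exe) could be combined into one helper. The following Python sketch is a rough idea only: the 60 second timeout is a guess based on the "tens of seconds" reported earlier, the function names are hypothetical, and the taskkill step still needs an elevated prompt.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes a command and a timeout and reports success or failure.
Runner = Callable[[Sequence[str], float], bool]


def run_with_timeout(cmd: Sequence[str], timeout: float) -> bool:
    """Run cmd; return True only if it exits with status 0 within `timeout` seconds."""
    try:
        return subprocess.run(list(cmd), timeout=timeout).returncode == 0
    except (subprocess.TimeoutExpired, OSError):
        # Timed out (i.e. hung) or the executable could not be started.
        return False


def restart_wsl(run: Runner = run_with_timeout, timeout: float = 60.0) -> str:
    """Try a clean shutdown first; fall back to killing the service process."""
    if run(["wsl.exe", "--shutdown"], timeout):
        return "shutdown"
    # Fallback quoted in this thread; requires administrator rights.
    if run(["taskkill", "/f", "/im", "wslservice.exe"], timeout):
        return "killed"
    return "failed"
```

The runner is passed in as a parameter so the logic can be exercised without touching a real WSL installation. On an affected machine, restart_wsl() with the default runner would be the actual call.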

@lawndoc

lawndoc commented Oct 16, 2023

Hey all, I have had this problem for well over a year and I finally solved it for myself. For me, the bug occurred when WSL generated the /etc/hosts file inside the Linux subsystem. The hosts file generator appended a bad control character of some kind after the line that contained my hostname. Removing the character wasn't enough because WSL would regenerate the file, so I had to create /etc/wsl.conf and add:

[network]
generateHosts=false

Then I could permanently remove the bad character from /etc/hosts, which fixed it, and WSL now works as expected.

I don't have this issue on my home computer, so part of me wonders if this issue happens only on domain-joined systems when the hostname line includes a domain computer name. I have no evidence to back this up of course, but I'd be curious to hear from anyone else who this helps whether or not the system with the issue was domain-joined.
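To check whether a hosts file contains a stray character like the one described above, something along these lines may help. It is a sketch in Python under a loud assumption: that the "bad control character" falls into the Unicode control (Cc) or format (Cf) categories, which covers a byte order mark and carriage returns but possibly not whatever a given machine actually has. The function names are made up for illustration.

```python
import unicodedata


def suspicious_chars(text: str) -> list[tuple[int, int, str]]:
    """Return (line, column, code point) for characters that are unlikely
    to belong in a hosts file. Lines and columns are 1-based."""
    findings = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in "\t ":
                continue  # ordinary separators between address and names
            # Cc = control characters, Cf = format characters (includes U+FEFF)
            if unicodedata.category(ch) in ("Cc", "Cf"):
                findings.append((line_no, col, f"U+{ord(ch):04X}"))
    return findings


def scan_hosts_file(path: str = "/etc/hosts") -> list[tuple[int, int, str]]:
    """Read the file as raw bytes so that no character is silently dropped."""
    with open(path, "rb") as fh:
        return suspicious_chars(fh.read().decode("utf-8", errors="replace"))
```

Running scan_hosts_file() inside the distro returns an empty list for a clean file. Any hit is worth inspecting by hand before editing the file, and generateHosts=false has to be set first or WSL will regenerate it.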

@hwine

hwine commented Oct 25, 2023

> wonders if this issue happens only on domain-joined systems

FWIW, this character (a Unicode BOM) is also being inserted on my machine, which is not joined to a domain.

I just made the change, so I don't know if this resolves things or not yet, but happy to have something new to try.

@hcokim

hcokim commented Oct 27, 2023

Wow I'm in disbelief.
I've had this WSL2 freezing issue since 2020, so my Surface Pro X just sat there collecting dust.

lawndoc's solution with turning off generateHosts fixed the issue for me.

I didn't have any unexpected characters in my /etc/wsl.conf file, so not sure if that's the root cause. This is also a personal device, so can't comment on domain-joined systems. But WSL2 no longer freezes for me, even after sleeping or staying locked for hours.

@OrangeFender

OrangeFender commented May 27, 2024

I encountered a similar problem when programming with VS Code. My CPU is an 8cx Gen 3. This put me off purchasing the new Surface Pro.

@maxboone

See #11274 (comment)

This issue is fixed in 24H2.
