[amdgpu] GUI freezes and doesn't recover #7648
Comments
Second freeze:
```
Jul 23 10:10:48 dom0 systemd-coredump[21057]: Process 13415 (xss-lock) of user 1000 dumped core.
```
|
It's most likely related to |
Yes, this looks very much like it. |
Also experiencing this on r4.1, on an AMD Ryzen with Radeon graphics. journalctl in dom0 says xss-lock dumped core. The system seems to go into screen saver mode at random points in time. Sometimes the xfce session is ended and I have to log in again; sometimes the screen saver just kicks in for no reason while all processes keep running. Sometimes I can log in again, sometimes not (probably because sys-usb is affected). After unlocking the screen, usually all windows have vanished. Having an inaccessible system makes it hard to recover diagnostic info, naturally. Anyhow, glad to learn it's not my hardware failing. |
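For anyone collecting these xss-lock crashes across several freezes, here is a minimal Python sketch that pulls the fields out of a systemd-coredump journal line. The pattern is modelled only on the single log line quoted in this thread, so the names `COREDUMP_RE`, `CoreDump` and `parse_coredump` are hypothetical helpers, and other journal output formats (for example ISO timestamps) are assumed not to occur.

```python
import re
from typing import NamedTuple, Optional

# Pattern modelled on the one systemd-coredump line quoted in this thread.
# Assumption: the default short journal format ("Jul 23 10:10:48 host unit[pid]: ...").
COREDUMP_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) systemd-coredump\[\d+\]: "
    r"Process (?P<pid>\d+) \((?P<name>[^)]+)\) of user (?P<uid>\d+) dumped core"
)


class CoreDump(NamedTuple):
    """Fields extracted from one coredump journal line (hypothetical helper type)."""
    timestamp: str
    host: str
    pid: int
    name: str
    uid: int


def parse_coredump(line: str) -> Optional[CoreDump]:
    """Return the parsed fields, or None if the line is not a coredump message."""
    match = COREDUMP_RE.match(line.strip())
    if match is None:
        return None
    return CoreDump(
        timestamp=match["timestamp"],
        host=match["host"],
        pid=int(match["pid"]),
        name=match["name"],
        uid=int(match["uid"]),
    )
```

Feeding it the saved output of the journalctl command from the issue description, one line at a time, and keeping the non-None results should give a rough timeline of which processes crashed around each freeze. Lines that do not match, such as the explanatory text added by -x, are simply skipped.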
Here too. I downgraded linux-firmware, and so far it looks good. Details below:
System Data: inxi -Fxxxz:
xl info:
xl dmesg:
|
As suggested in #2982 I tried downgrading the firmware:
Will keep you updated how that goes.
The hardware, xen
|
Works well for me currently. No freezes for a long time. |
No glitches/crashes since the downgrade for me either. |
Unfortunately I now experience the same crashes with the downgraded firmware.
|
Does your system still stay up after it crashes or does it reboot? At the moment, even with the downgraded firmware I am very unstable on a Ryzen 7 4750U with kernel-5.18.9-1 and linux-firmware-20211216. Doing anything graphical like meetings, heavy websites, youtube, etc will result in a lockup and system reboot randomly in 1-30 minutes. |
It stays up the first time and usually the second (I can typically work for a few hours before the first freeze happens). On the next crash it hangs (no reboot; a hard reset is needed). @qufo418 Would you be so kind as to put your logs into a spoiler? You can do it with:
|
I've got a temporary fix for this. YMMV. Short version: I downloaded Qubes-20220820-x86_64.iso from https://qubes.notset.fr/iso/ (currently down), installed it, enabled "Testing updates" for dom0 and updated the whole thing. I also installed KDE with "sudo qubes-dom0-update kde-settings-qubes" and voila - glitch free! But before I got to that point, I had weird issues. Let's dive into it.
During the final Qubes configuration, the system successfully created sys-usb but was unable to launch it; I had to do it manually. sys-net was also down, and when activated it was missing the wired connection, although wifi was operational. I had to kill sys-net as it wouldn't shut down, and managed to bring it up again. Up until that point my hardware problems were the same as OP's: two soft glitches and then a final one that required a hard reset. It would almost always be induced by starting templates. Also, when logging in, the system would hang for an additional minute, whether under XFCE or KDE. With KDE the splash screen would be diminished and after about a minute it would boot up.
The next reboot after updating with testing updates got me a black screen after login. It lasted for a minute, minute and a half, and then the system booted. I started stressing it with various templates and qubes and could see that the glitches became more frequent (lasting up to 10 seconds), but the good news was that they weren't terminal. I got at least 5 or 6 initially, but then the frequency dropped off. I continued stressing the system and eventually got a serious glitch, but unlike before, this time I could Ctrl+Alt+F2 and even startx. I figured: ok, I can somehow live with this until they fix it.
And then the strangest thing happened today: I booted into a perfectly operating system! I mean, even the splash screen loaded in just a couple of seconds. I threw everything at it and it is working as it should.
The subsequent reboot yielded the same results. The next one after it (current session) got me a diminished splash screen again, but after about 30 seconds the system booted and everything is in working order. In the quest for a permanent fix to this matter, I can offer whatever logs and info you need. I am running 5.15.61-1 with no problems and surprisingly great thermals. |
As this seems to be caused by the linux-firmware package, I'd rather wait for a permanent fix from upstream. (linux-firmware is a Fedora package, right?) It would be good if the availability of a fix were posted in this thread, since up to now I have not run any dom0 updates at all, which is of course not something one should do indefinitely. The question is, when one does not want either of
the only other option is to not update dom0 at all? Both
and no linux-firmware was updated, but afterwards
says there are updates available. So at least for me this seems like a viable workaround. |
Just another data point: I have three machines with nearly identical AMD Ryzen hardware:
The issues only occur on the first one. |
@olafklinke you can block package upgrades in |
In the past couple of weeks I have begun experiencing complete freezes (can't even move the mouse) in Qubes 4.1. However, I'm not using AMD; I'm on Intel. I initially encountered it 3 times in 2 days on a Thinkpad X1 Carbon 6th gen (Intel Core i7 8th gen). Believing it to be dying RAM or some other hardware issue (and because I needed to upgrade from 16GB RAM anyway), I went and bought a 9th gen X1 Carbon (Intel Core i7 Evo), but yesterday, whilst running the Qubes Updater (with only the sys-net, sys-firewall and sys-usb qubes running), I suffered the same complete freeze-up. I wasn't even touching the mouse or keyboard at the time. I had to force-power the machine off. The fan was intermittently spinning up and down. I couldn't find anything interesting in the logs, but I think that because of the sudden hard freeze there was no opportunity for dom0 to log anything to disk. I'm interested to see if downgrading linux-firmware will help, but I don't know what the previous version was. Can anyone share how to downgrade from v20220610? |
you can downgrade with: then you can prevent it from being updated, so you can continue updating normally, by adding the line |
Thanks! (Side-topic for the Qubes team, I think https://www.qubes-os.org/doc/how-to-install-software-in-dom0/#how-to-downgrade-a-specific-package should be updated to reflect the above example) |
In case you (or any others reading this) are not already aware, the documentation is a community effort, and everyone is welcome to contribute. (That's how things like this get updated!) So, if you'd like to get involved with the project, this is a great way to do it. You can read more about how to submit documentation changes here: https://www.qubes-os.org/doc/how-to-edit-the-documentation/ You may also be interested in the documentation style guide: https://www.qubes-os.org/doc/documentation-style-guide/ If you are unable to open a doc PR yourself, you might instead consider opening a separate documentation issue for this, as every issue must be about a single, actionable thing, and this issue is already about something else. |
Not turning off the turbo mode of the processor (via |
I can report I had also disabled turbo mode and that seems to have been causing the problem. Disabling turbo mode kept my laptop from overheating and I saw no major performance issues for the things I use on my system. |
I can disable turbo mode after some time (maybe around 30 minutes) without running into those crashes, which is nice since enabled turbo draws a lot of power. |
@Rot127 @redrooter I wonder if this is actually a hardware problem. If your device is overheating then just about anything can go wrong. |
The crashes occur when turbo mode is off, that is, when the machine is not getting hot. |
@Rot127 That is very weird. Seems like a firmware bug or a power management bug. |
This issue is being closed because:
If anyone believes that this issue should be reopened, please leave a comment saying so. |
Qubes OS release
r4.1
CPU:
AMD Ryzen 7 4800H with Radeon Graphics
Brief summary
Since the last update (stable branch) the GUI froze three times in one day. The first two times the GUI recovered (I got logged out after a few seconds and could log in again). The third time it didn't and I had to reset the laptop.
I still could move my mouse pointer (USB mouse via default sys-usb) although everything else was frozen.
The error logs of journalctl -x -p 3 -r:
First freeze
Second one in comment below.
Third and final
Steps to reproduce
I am not aware of any particular things which led to this.
Expected behavior
No freezing.
Actual behavior
It did.