RPi 2b & RPi 3b crash after a while with kernel 5.10.31+ #4319
Comments
The crashes are caused by the new vc4 code. |
What are the symptoms of the crash? Just display going blank? Does an ssh connection still work? Anything in dmesg? |
The system stalls completely. The display goes blank and ssh is not working anymore. There is also nothing in the journald logs that relates to the crash. |
Unfortunately #4313 did not fix the issue. |
I also tried 5.12.1, but it has the same issue. The system crashed when I switched on the TV. |
I'm sorry it didn't fix it, and thanks for testing. So if I understand correctly, it sits idle with a TV connected to it through HDMI. The TV is off most of the time, and the hang occurs when you turn the TV on? |
Yes, the TV is off most of the time while the RPi is on (it is used as an access point as well). The TV is switched on a couple of times a day. |
5.12.1 is weird too, since it doesn't have the content of #4302. I tried all afternoon to reproduce it on my TV with a 3B and 3B+ and couldn't reproduce it. Are you sure of the branch you tested with? |
I am pretty sure this commit causes the crashes: e259821 |
I experienced another crash yesterday, so the issue is not fixed. It only occurs less frequently. |
@mripard, @popcornmix Unfortunately, the issue is re-introduced in 7416691 (drm/vc4: crtc: Fix vc4_get_crtc_encoder logic) |
755b2c8 fixes a similar crash. Did you have it in your tree when you tested? |
Thanks for testing. You seem to have a fairly reliable test now, did you change anything? Anything a bit out of the ordinary in your setup, or is it just a Pi3 connected to a TV? |
No, nothing special. Just a Pi3 connected to a Samsung LE32B650 TV. I just tested a more recent Git snapshot (1c38342) and the issue is still there. |
You seem to have a test reliable enough to allow you to bisect, though. I spent about an hour turning my TV off and back on with a Pi3 connected, without any success in triggering your bug. Can you share exactly what you're doing to trigger it? |
It's nothing special. I just switch on the TV a few times a day. The system displays the console. I get a signal most of the time, but sometimes I do not get a signal and the pi crashes. It seems to occur at random. I have a script that checks the CEC status of the TV and when it's switched on, the script starts Kodi. Kodi switches off the TV again when I shut down Kodi. Maybe it is the combination of my specific TV and the pi that triggers this bug. |
It looks like this little patch fixes the issue:
I'm not sure what the difference is between conn_state->crtc and connector->state->crtc. |
I had an unrelated bug that made me rework that code today, I sent a PR with that fix #4402. It would be great if you could test it and see if it fixes it |
I tried https://github.com/mripard/rpi-linux/tree/rpi/5.10-core-clock-request-fix, but unfortunately this does not fix the issue. |
It turns out that the system crashes when vc4_get_crtc_encoder is called from vc4_crtc_atomic_disable.
|
Thanks for digging into this, it's definitely weird. One can force that function to run by running
And it's running without an issue here with a display connected to HDMI. Do you have any way to access the logs once it failed (like with a UART?) If so, could you add |
It looks like that function does indeed trigger the issue. When I force it, the system crashes and the TV shows 'No Signal' |
I tried a different monitor and installed the default Arch kernel 5.10.44-4-ARCH. But with this kernel and different monitor, I get the same crash when I force the function with |
What is the new monitor you've been using? Also, what display stack is being run when it crashes? Kodi? Xorg? |
I finally know what triggers the crash and how to reproduce it: it crashes when cec-client is called after the HDMI is turned off.
|
Ah, yes, that makes a lot of sense. The changes you pointed out earlier fix the encoder retrieval logic that was broken before, and now the HDMI controller will be completely shut down when not active anymore. Running cec-client will try to access its registers, and will stall the CPU. I'm not entirely sure how libcec uses the CEC controller (it seems to have support for both the kernel CEC API, and the VC4 firmware one), but this is reproducible using the kernel CEC API with:
I'll look into it, thanks for your debugging |
I just pushed a PR that seems to fix this for me. Now, one can do
With everything working as intended |
I compiled mripard@126af69 |
After I reverted #4418, the module loads fine. So the patch breaks something. |
I just pushed a new version of #4418 that should fix both issues you were seeing |
@pelwell: please re-open this issue. The new version only makes it worse. The system now crashes as soon as I modprobe the vc4 module. |
Sure - the close was automatic because of the Fixes tag in the PR. |
I found out what causes the crash: I had video=HDMI-A-1:1920x1080@60e in cmdline.txt. With this option, the system crashes as soon as the vc4 module is loaded. Without this option, the system works fine and the CEC issue is also solved. |
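To make the `video=` entry above easier to read, here is a minimal Python sketch that splits such an option into its parts. It assumes the kernel's documented modedb syntax, `<xres>x<yres>[M][R][-<bpp>][@<refresh>][i][m][eDd]`, prefixed by a connector name; the function name `parse_video_option` is hypothetical, and the sketch ignores extra comma-separated options that the kernel also accepts.

```python
import re

# Simplified pattern for the mode part of a video= option.
# Assumption: covers only <xres>x<yres>[M][R][-<bpp>][@<refresh>][i][m][eDd].
MODE_RE = re.compile(
    r"^(?P<xres>\d+)x(?P<yres>\d+)"
    r"(?P<cvt>M)?(?P<reduced>R)?"
    r"(?:-(?P<bpp>\d+))?"
    r"(?:@(?P<refresh>\d+))?"
    r"(?P<interlaced>i)?(?P<margins>m)?"
    r"(?P<force>[eDd])?$"
)


def parse_video_option(arg):
    """Split 'video=<connector>:<mode>' into a dict (hypothetical helper)."""
    if not arg.startswith("video="):
        raise ValueError("not a video= option: %r" % arg)
    connector, sep, mode = arg[len("video="):].partition(":")
    if not sep:
        raise ValueError("no connector given in %r" % arg)
    match = MODE_RE.match(mode)
    if match is None:
        raise ValueError("unsupported mode string: %r" % mode)
    fields = match.groupdict()
    return {
        "connector": connector,
        "xres": int(fields["xres"]),
        "yres": int(fields["yres"]),
        "refresh": int(fields["refresh"]) if fields["refresh"] else None,
        "cvt": fields["cvt"] is not None,      # 'M': use CVT timings
        "force": fields["force"],              # 'e': force the output on
    }


if __name__ == "__main__":
    print(parse_video_option("video=HDMI-A-1:1920x1080@60e"))
```

For the entry quoted above, the trailing `e` is reported as the force flag and `cvt` is false, so no 'M' (CVT) timings were requested.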
Does your display not give you an EDID, or is 1920x1080@60 missing from it? If it isn't there, then adding your video= entry to cmdline.txt will add the GTF mode for 1920x1080@60 which is
The maximum supported pixel clock for Pi0-3 is 162MHz, therefore that mode is invalid and pruned. Slightly annoyingly, should that happen, DRM doesn't discard the command line mode and just fails to set up the display. (It shouldn't actually crash; it just doesn't enable the display.) If you add the 'M' option to use CVT timings then it still exceeds the limit:
You need to select the CEA or DMT timings to support 1920x1080@60 with a pixel clock under 162MHz (CEA/DMT uses 148.5MHz), but AFAIK that can only be done via an EDID and not via the command line. |
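The arithmetic behind the 148.5MHz figure can be sketched in a few lines of Python. The pixel clock is the total frame size (active pixels plus blanking) multiplied by the refresh rate. The 2200x1125 totals used here are the standard CEA-861 totals for 1920x1080@60; the function names are hypothetical, and the 162MHz limit is the one quoted in this thread.

```python
# Limit for Pi0-3 as quoted in the thread.
PI0_3_MAX_PIXEL_CLOCK_HZ = 162_000_000


def pixel_clock_hz(htotal, vtotal, refresh_hz):
    """Pixel clock = total pixels per frame (including blanking) x refresh."""
    return htotal * vtotal * refresh_hz


def mode_fits(htotal, vtotal, refresh_hz, limit_hz=PI0_3_MAX_PIXEL_CLOCK_HZ):
    """Return True if the mode's pixel clock does not exceed the limit."""
    return pixel_clock_hz(htotal, vtotal, refresh_hz) <= limit_hz


if __name__ == "__main__":
    # CEA-861 1920x1080@60: 2200 total pixels per line, 1125 total lines.
    clock = pixel_clock_hz(2200, 1125, 60)
    print(clock / 1e6, "MHz, fits:", mode_fits(2200, 1125, 60))
```

Timings with wider blanking than the CEA mode give larger totals, which is how a mode with the same 1920x1080 active area can end up above the limit.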
The display gives an EDID, so the video= entry is not necessary. I used it to set the screen offset, but I removed that part some time ago. The entry worked before #4418 was merged and now the system crashes with it, so there must be some regression here. |
The PR #4431 I just sent should address this |
Thank you, it looks like this fixes the issue |
Thanks for testing, and for your persistence, it's been very helpful :) |
Describe the bug
Since the upgrade to linux-raspberrypi version 5.10.31 and also 5.10.32 and 5.10.33, my Raspberry Pi 3b crashes after a while. I do not have this problem with kernel 5.10.27. I've also tried a Raspberry Pi 2b with the same config and it also crashes.
Sometimes the system crashes after an hour, sometimes after a day. It seems that the crashes occur randomly.
System
The system is connected to a TV (1920x1080) using HDMI. The TV is switched off most of the time.
The system is used as a wireless access point and sometimes Kodi is started.
Logs
I do not see anything in the logs. Also, when the system crashes, the TV output gives no signal.