Kernel 6.1.39 -> top -> high wa > 20% #5546
Comments
Not a complete refutation, but at least a counterpoint: I'm running the latest 6.1.39-v8+ ( Where my system differs from yours is that I'm not running bookworm, it's still on bullseye. The Pi 4 vs Pi 400 distinction should be irrelevant. |
I installed 6.1.39-v8+ again with delayacct in cmdline.txt:
top:
iostat:
after stopping the mariadb service:
restarting mariadb shows again:
restoring to 6.1.36 - rpi-update dc41960 |
Interesting. Would you expect mariadb to have been active at that time (and not just running)? |
I only typed systemctl stop mariadb |
No, I wanted to know if mariadb would have had any reason to be busy before you stopped it. Have you tried any of the kernel versions between 6.1.36 and 6.1.39? You will find the other releases here: https://github.com/raspberrypi/rpi-firmware/commits/master |
OK. Thank you for the pointer to the older versions: https://github.com/raspberrypi/rpi-firmware/commits/master
top - 05:50:30 up 8 min, 1 user, load average: 0,10, 0,76, 0,56
uname -a
I always waited some time for the system to really come up. There is nothing for mariadb to do except start and run its initial tasks. Then it's idle vs waiting. |
Installing mariadb-server (and mariadb-client) didn't make any difference here. The server process is running, but perhaps some basic configuration and/or the creation of a database is required. |
The build number of your 6.1.38 kernel (1661) suggests it was installed with rpi-update release 2a89340, and the 6.1.39 kernel is build number 1664. There are two other releases between them that make some significant changes to config settings:
I'm hoping it was one of those two releases that caused the increased iowait values - can you try them? |
with: kernel: configs: Add CHECKPOINT_RESTORE to 64-bit kernel
with: kernel: configs: Add PSI, disabled by default
So for now 6.1.38 #1663 is active and working without wa > 20%. |
That leaves almost 600 commits that could be responsible for the regression. I'm sure that by eye we can weed out a reasonable number of them because the config settings don't cause them to be built (it would be nice to automate this in some way), but that still leaves a lot. If I could reproduce the issue it would probably only take around 10 kernel builds to find the culprit, thanks to |
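The estimate of around 10 builds is what a halving search needs for roughly 600 candidates. Here is a minimal sketch, assuming a linear history in which every commit from the culprit onward shows the regression. The names `find_first_bad` and `is_bad` are hypothetical; `is_bad` stands in for building a kernel at that commit and checking wa in top.

```python
def find_first_bad(n_commits, is_bad):
    """Return (index of first bad commit, number of tests run).

    Assumes commits are ordered oldest to newest, that the state before
    commit 0 is known good, and that the last commit is known bad.
    """
    lo, hi = 0, n_commits - 1
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1  # one kernel build plus one check of wa
        if is_bad(mid):
            hi = mid       # culprit is mid or earlier
        else:
            lo = mid + 1   # culprit is after mid
    return lo, tests


def worst_case_tests(n_commits):
    """Upper bound on tests for the halving search above."""
    return (n_commits - 1).bit_length()


# Illustrative only: pretend commit 437 of 600 introduced the regression.
culprit = 437
index, tests = find_first_bad(600, lambda i: i >= culprit)
print(index, tests, worst_case_tests(600))
```

Commits that the config settings exclude from the build would shrink the candidate count, but since the bound grows only logarithmically, halving the list saves just one build.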
I've got an rpi-updated bullseye image here on Pi4.
I don't see the "wa" issue initially.
But I do after
|
Occurs with 6.1.40 and 6.5-rc3. I can confirm it doesn't occur with 6.1.38. |
I'll go through the bisection once I have a working image |
The top commit of 6.1.39 looked vaguely related to the issue, so I tried reverting it. And the "wa" is gone.
|
Gah - I was just about to start the bisect, having reproduced the issue. |
Feel free to work out why this occurs. I suspect it's an accounting issue, rather than a performance issue - I doubt the increased wa time is actually harmful, except by messing up numbers reported. I seem to recall a similar issue with vchiq with a switch of |
The commit intentionally puts the CPU into a "busy doing I/O" mode while waiting for mumble to prevent it going into deeper sleep states. Unsurprisingly that means the CPU spends longer in I/O mode. Revert it and wait for upstream to sort it out - I still have PTSD from the vchiq scheduling and signalling nightmares. |
I've pushed a revert and will push an rpi-update shortly. |
I've sent a message to the upstream devs: https://lore.kernel.org/lkml/CAMEGJJ2RxopfNQ7GNLhr7X9=bHXKo+G5OOe0LUq=+UgLXsv1Xg@mail.gmail.com/ |
There is apparently a patch in the works. |
kernel: Revert: io_uring: Use io_schedule* in cqring wait See: raspberrypi/linux#5546
rpi-update contains the revert. We'll drop the revert if/when the upstream patch mentioned gets accepted. |
A fix has appeared upstream, so we've dropped the revert. In rpi-update. |
Describe the bug
With kernel 6.1.36 everything works normally for me.
In top, the wa value goes from 0 to X and back to 0.
After upgrading to 6.1.39, the wa value no longer returns to 0.
It is always above 20% (20-25%).
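For anyone who wants to check the wa figure without top: on Linux it is derived from the iowait counter on the aggregate cpu line of /proc/stat. The following is a minimal sketch, not the method top itself uses; the function names are hypothetical and the sample numbers in the comments are made up.

```python
import time


def parse_cpu_line(stat_text):
    """Return the counters of the aggregate 'cpu' line as a list of ints.

    Field order in /proc/stat: user nice system idle iowait irq softirq
    steal guest guest_nice.
    """
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate cpu line found")


def iowait_percent(before_text, after_text):
    """Share of elapsed CPU time spent in iowait between two samples."""
    before = parse_cpu_line(before_text)
    after = parse_cpu_line(after_text)
    deltas = [b - a for a, b in zip(before, after)]
    # Only the first eight fields are summed: guest time is already
    # counted inside user and nice.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is iowait


def sample_iowait(interval=1.0, path="/proc/stat"):
    """Read /proc/stat twice, interval seconds apart (Linux only)."""
    with open(path) as handle:
        first = handle.read()
    time.sleep(interval)
    with open(path) as handle:
        second = handle.read()
    return iowait_percent(first, second)
```

Calling `sample_iowait(5.0)` on an idle system with the affected kernel should show whether the value stays above 20% as described, independent of how top rounds or averages it.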
Steps to reproduce the behaviour
Device(s)
Raspberry Pi 4 Mod. B
System
6.1.36:
Raspberry Pi reference 2020-05-27
Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, 30e2dd32ba47cc3bec15ab1413c16a17e5797775, stage4
Apr 25 2023 18:26:03
Copyright (c) 2012 Broadcom
version d7f9c2b4ef7e4a8c0b04374a879ce89d7a948453 (clean) (release) (start)
Linux nextcloudpi.fritz.box 6.1.36-v8+ #1 SMP PREEMPT Fri Jun 30 13:03:56 UTC 2023 aarch64 GNU/Linux
Logs
No response
Additional context
running rpi4 with bookworm