split lock detection spamming dmesg #8003
Comments
I'm getting the same on my Dell Latitude 5520 laptop (i7 iGPU + Nvidia MX450 dGPU) running Gentoo. |
Same thing on an Asus Vivobook (i5-1035G1 + NVIDIA MX350) on Kubuntu 20.04.
|
I'm getting the same on my Thinkpad X1 Extreme Gen 4 laptop (i7-11850 + RTX 3070 max-q) running Gentoo. Thousands of these per minute. uname -a: Latest version of steam client:
Steam API: v020 Steam Package Versions: 1633666232 CPU: model name : 11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz |
Same logs
on my
Once there was a hard freeze after that. Not good. |
Happens for me too: Once more, Valve programmers caught on not knowing how to code. To stop kernel from killing those suboptimal processes, I added |
I'm seeing this, too. Steam Beta as of 2021-12-08:
This problem wasn't present on my old Ivy Bridge system, but since I upgraded to Alder Lake, I'm seeing it. Split lock detection is thus probably a feature of modern CPUs, while the underlying problem does NOT hit ONLY modern CPUs; it is also present in older CPUs. According to kernel commits, this detection was added to find bad process behavior which negatively affects the performance of the whole system (even unrelated processes). I thus believe Valve should fix this, especially since Steam is about gaming, and gaming is about performance. The linked LWN article (#8003 (comment)) indicates that fixing this may be as easy as recompiling with properly adjusted alignment. However, I don't think processes are killed by the kernel as suggested in the previous comment: in my logs, I see repeating PID patterns, which indicates that the same threads take the trap over and over again; the kernel would not recycle PIDs in a way that would explain this. The LWN article also says that killing offending processes can be one way to address the problem. Maybe that becomes the default in the future, so this should be fixed sooner rather than later.
|
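To illustrate the alignment point made above, here is a minimal sketch of the arithmetic behind a split lock: a locked access becomes one when its bytes span two cache lines. The 64-byte line size is an assumption (typical for x86), and the function names and addresses are hypothetical, chosen only for illustration.

```python
CACHE_LINE_BYTES = 64  # assumption: typical x86 cache line size


def crosses_cache_line(addr: int, size: int, line: int = CACHE_LINE_BYTES) -> bool:
    """Return True if the byte range [addr, addr + size) spans two cache lines."""
    if size <= 0:
        raise ValueError("size must be positive")
    first_line = addr // line
    last_line = (addr + size - 1) // line
    return first_line != last_line


def align_up(addr: int, alignment: int) -> int:
    """Round addr up to the next multiple of alignment (must be a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (addr + alignment - 1) & ~(alignment - 1)
```

With these helpers, an 8-byte value at the hypothetical address `0x103C` straddles the line boundary at `0x1040`, while the same value placed at `align_up(0x103C, 8)` does not. Natural alignment of the operand is enough to keep a power-of-two sized access of up to 64 bytes inside one line.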
I'm seeing this on my Framework laptop as well, using Arch Linux.
dmesg:
|
I'm getting this on Arch with Intel Alder Lake CPU. Using Steam beta. |
Also started happening for me quite recently... super annoying. Spams for almost the entire duration Steam is running. |
@kisak-valve This affects all distros with a kernel that has split lock detection enabled, and with CPUs that can detect and report this situation to the kernel (although earlier CPUs might be affected as well). Future kernels will eventually kill such processes; currently it's a warning only. Most of these logs come from the Steam client processes themselves, please fix it. I'm also seeing this with some Uplay titles, but it totally disappears in the log noise generated by the Steam client. I'm currently using |
Beginning with 5.19 the kernel will "make life miserable for split lockers". |
Getting this on my machine too, and it seems to be periodically causing various other drivers to time out operations sometimes, including, but not limited to:
This is bad enough that it could cause data corruption in various cases, potentially on the Steam Deck too. It's easy to trigger this repeatedly just by doing Steam Remote Play. Which will sometimes even crash as a result of this. In fact, it always crashes on exit (but it's silent outside of dmesg):
Always in the same place too, modulo ASLR |
Can you re-test on the Steam Client Beta? https://steamcommunity.com/groups/SteamClientBeta/announcements/detail/3387287522102609359 |
I can still see it but it seems much less noisy:
Only one occurrence so far. Steam Client Beta 2022-07-21 |
Thanks, likely that some uses were missed. Will keep looking for them. |
Some Assassin's Creed games also throw that message, but I'm not sure if Wine or the Steam client could do something about it. In light of future kernels killing such processes, how could that be handled? I'm not sure if Ubisoft would be interested in fixing such things; it's probably a non-issue under Windows? |
The split lock can come from the game process even if it's caused by Steam (eg. overlay locking primitives), so I was hoping that reports of game instances would go away after this fix. If some pre-existing games indeed rely on split locks in their own code, I think we'll have to discuss the situation with upstream further and alert them to the fact there are pre-existing applications that are not under active maintenance that Linux desktop users still want to run. This might be a case of desktop-oriented distributions having to disable the mitigations by default. |
Okay, so I'll retest the games I've seen logging this in the past - and report back here? Or per game? |
Here are two other occurrences, one with different address, one with different thread name:
|
These seem the only messages left after a reboot, HTH:
No game was started, the client just booted (and probably did its thing with fossilize and maybe spawning some prefix updates or whatever spawns wine processes after reboot). |
In Linux 6.2 the kernel will actively punish split locks and would need |
While the amount has gotten less, these still happen as of today:
System information
|
Actually, since the introduction of the new Big Picture mode into the client (although I don't use it), the spamming has increased again. But this is only one part of the problem: active punishing by the kernel will probably just throttle the Steam client itself, which shouldn't be such a big issue (unless the split locks are also in the API code games are using). The bigger problem is the games themselves, which Steam cannot do much about. Unless Microsoft introduces some similar way of punishing split locks in Windows, game devs won't fix it. And even then, what about the older/legacy games? Gamers and single user desktops probably just should use The punishing is about preventing one user process from slowing down processes of other users. This is a non-issue on single user systems, like when you are gaming on a desktop: the game mostly slows itself down by using split locks, and it's built around this performance characteristic. We should just ensure that other processes running on the system don't introduce additional performance costs, and that's why the Steam client should avoid these as much as possible. |
And one more:
|
Got this one today. When witcher 3 crashed it showed this: [ 1771.344713] x86/split lock detection: #AC: CSystemManager:/13367 took a split_lock trap at address: 0xf15c619f And when I was playing Bendy 1 it showed this (But did not crash, I just did ALT+F4) |
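For anyone wanting to see which threads keep taking the trap, here is a small sketch that parses lines of the form quoted above and tallies them per thread name and PID. The regular expression is inferred only from the sample line in this thread, so treat it as an assumption; the function names are hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from the dmesg line quoted in this thread (an assumption,
# not taken from kernel documentation).
SPLIT_LOCK_RE = re.compile(
    r"x86/split lock detection: #AC: (?P<comm>.+)/(?P<pid>\d+) "
    r"took a split_lock trap at address: (?P<addr>0x[0-9a-fA-F]+)"
)


def parse_split_lock(line):
    """Return (thread_name, pid, address) for a split lock line, else None."""
    match = SPLIT_LOCK_RE.search(line)
    if match is None:
        return None
    return match["comm"], int(match["pid"]), int(match["addr"], 16)


def count_by_thread(lines):
    """Count split lock traps per (thread_name, pid) over an iterable of log lines."""
    counts = Counter()
    for line in lines:
        parsed = parse_split_lock(line)
        if parsed is not None:
            counts[parsed[:2]] += 1
    return counts
```

Feeding it the output of `dmesg` line by line and printing `count_by_thread(...).most_common()` gives a quick view of whether the same PIDs repeat, as reported earlier in the thread.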
And some more right after starting Steam, it downloaded some shaders I guess:
|
I'm currently seeing mostly this address (but a lot of them):
Steam package version: 1671133406 It looks like these are the only ones left currently, at least for an idle client:
|
What worked for me (At least for now) was adding to /etc/default/grub the following: GRUB_CMDLINE_LINUX_DEFAULT="split_lock_detect=off" The split_lock_detect off solved the crashing or closing of the app in a rather abrupt way. I also read the following here for the 6.2 Kernel https://www.phoronix.com/news/Linux-Splitlock-Hurts-Gaming |
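After editing the GRUB configuration and rebooting, it can be worth confirming that the parameter actually reached the running kernel. The sketch below reads the kernel command line from `/proc/cmdline` and extracts the `split_lock_detect` value. The function names are hypothetical, and reporting the last occurrence when the parameter appears more than once is an assumption of this sketch, not a statement about how the kernel resolves duplicates.

```python
from pathlib import Path


def split_lock_detect_mode(cmdline):
    """Return the value of split_lock_detect= in a kernel command line, or None."""
    mode = None
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "split_lock_detect" and sep:
            mode = value  # assumption of this sketch: report the last occurrence
    return mode


def running_kernel_mode(path="/proc/cmdline"):
    """Read the running kernel's command line and report split_lock_detect."""
    return split_lock_detect_mode(Path(path).read_text())
```

A result of `None` means the parameter is not on the command line at all, in which case the kernel's built-in default applies.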
Yes and no: that successfully prevents the kernel from complaining, adding any additional performance penalties, or even killing a process. But it also silently ignores the hardware-based performance hit that comes with that incident. This is not about silencing a kernel message; it's about removing the CPU-wide performance hit by avoiding the situation in the code causing it (Steam in this case). Silencing the message does not prevent the performance hit that the message actually tries to point at. (Kernel 6.2 does add an artificial performance penalty for the process causing the situation, in favor of not slowing down other processes in the system. That is bad for games, which usually cause this situation, and you can avoid that additional artificial cost by adding the parameter, but you cannot avoid the hardware performance hit that comes with it in the first place.) You actually want Steam to not cause bus locks when a lock operation crosses a cache line: this will slow down your game or cause micro stutters if Steam does it in the background. It actually affects all processes running in parallel. If a game does it, this is acceptable (because the game is designed around that performance characteristic and it is the only foreground process you care about), but if background processes do that, it will hurt the performance of processes potentially important to you. |
I understand the silent part (not showing when checking dmesg), but how is it explained that only when I have that, I can play for example CS:GO, Witcher 3, or Cyberpunk for hours, and without it I don't even last 2 to 5 minutes before a crash happens? The only thing I changed was that. As for the penalty, I would not know; all I was able to check was that if I have it, it does not crash, or at least it does not crash for several continuous hours of playing. If I remove it, you can be sure I will never pass 5 minutes. Could there be something else related to kernel 5.19 on Ubuntu 22.10? |
Also for the steam part, I am 200% with you that either the game or steam should handle it and fix it. |
There is one more lock that seems to not be mentioned in here:
This is right when starting Steam.
EDIT:
|
This comment was marked as abuse.
@vitacell I fail to see how this fits here. This is an issue tracker for Steam, and split-locks should not be used since they slow the system down significantly. |
"as far as I can tell"? Valve's Source engine based games are still suffering from that, and they don't care very much about it. And users are left on their own. It's not so hard to add "split_lock_detect=off", yeah. But it's not funny buying new hardware and playing old games at 30fps, then wasting your whole day trying a ton of fixes and workarounds, trying to figure out what is happening. The quick fix is to install Windows, and this is what usually happens. |
This throttling was actually introduced because split locks are expensive for performance. This can be a real problem on cloud machines when someone accidentally or on purpose floods the system with split locks. The "fix" by the kernel devs was to throttle the processes causing the split locks, and it was also introduced so that developers fix their badly behaving software. So this is actually a valid and proper fix and does not break user-space. But the problem is that Windows games in particular have this exact bad behavior. So I suggest desktop- or game-focused distributions should really ship with a kernel that turns the throttling off by default. You shouldn't point at the kernel devs: the kernel never tries to break user-space, and commits that do are usually considered bugs or regressions. But in this case, it prevents a real problem and penalizes the causing processes, so it is a fix for a performance regression caused by badly behaving processes. You should rather ask distributions to turn off split-lock detection by default, at least for game-focused kernels, because old games won't be fixed, and current games won't be fixed either because Windows does no penalization. Such a kernel stops penalizing the games, although the CPU still performs badly in split-lock situations. Only the extra penalty to the causing process would be prevented. And yes, Steam has removed most if not all split-lock uses in the Steam client itself, but that doesn't magically fix games that use split-locks. We can only hope that game devs consider Linux performance (through the Steam Deck probably) and fix their games to not use split-locks. But for this to happen, it's probably better to not ship a kernel with split-lock detection disabled by default. So this is a double-edged sword. |
Is this being worked on? It takes just opening Steam and letting games update. There is no need to start any game, and dmesg already reports the following:
This is on Steam in flatpak, but it does not matter how steam is installed (deb,rpm,flatpak). |
Is this the cause for really slow "Validating" process for my games? I have been trying to debug what Steam is doing while it shows "Validating" because it doesn't seem to be using CPU nor causing IO load but the process is still really slow. Some kind of lock contention would definitely explain the slowness. |
There are split locks in components of the Steam client itself. Newer kernels "punish" processes whose locked operations cross CPU cache line boundaries, which causes a split lock, by pausing those processes for a few ms. The background is that on a server or VM host, non-privileged or isolated processes could cause a major performance slowdown by maliciously spamming the CPU with split locks. If the kernel forces such processes to pause, the performance penalty is mostly gone for other processes; only the causing process will make slow progress. That said, you don't need that on a single user desktop system, or in an otherwise trustworthy environment. Open Then recreate your boot files, e.g. for grub, systemd-boot, or initramfs. Consult your distro documentation on how to do that. If even With this change, the artificial split lock penalty will no longer be used. But your CPU still suffers from the situation and cannot reach full multi-core bandwidth. This setting does not change the underlying problem for which the kernel setting was introduced in the first place. It is a band-aid to make developers fix their programs. |
Your system information
CPU:
model name : 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
GPU:
52:00.0 VGA compatible controller: NVIDIA Corporation GP107 [GeForce GTX 1050 Ti] (rev a1)
Driver:
460.91.03
Please describe your issue in as much detail as possible:
While Steam is open, my syslog is spammed with x86/split lock detection thus:
If I close Steam, it stops.
Steps for reproducing this issue: