split lock detection spamming dmesg #8003

Open
popey opened this issue Aug 20, 2021 · 39 comments

Comments

@popey

popey commented Aug 20, 2021

Your system information

  • Steam client version (build number or date): 2021/07/21 at 22:25:57. API v020. 1626824053
  • Distribution (e.g. Ubuntu): Kubuntu 21.04
  • Opted into Steam client beta?: [Yes/No] No.
  • Have you checked for system updates?: [Yes/No] Yes.

CPU: model name : 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
GPU: 52:00.0 VGA compatible controller: NVIDIA Corporation GP107 [GeForce GTX 1050 Ti] (rev a1)
Driver: 460.91.03

Please describe your issue in as much detail as possible:

While Steam is open, my syslog is spammed with x86/split lock detection thus:

[Wed Aug 25 21:18:32 2021] x86/split lock detection: #AC: CJobMgr::m_Work/1383782 took a split_lock trap at address: 0xf23d6263
[Wed Aug 25 21:18:33 2021] x86/split lock detection: #AC: CJobMgr::m_Work/1383767 took a split_lock trap at address: 0xf23d6263
[Wed Aug 25 21:18:35 2021] x86/split lock detection: #AC: CHTTPClientThre/1383923 took a split_lock trap at address: 0xf23d6263
[Wed Aug 25 21:20:34 2021] x86/split lock detection: #AC: CJobMgr::m_Work/1383766 took a split_lock trap at address: 0xf23d6263

If I close Steam, it stops.

Steps for reproducing this issue:

  1. Install Kubuntu 21.04 on a ThinkPad X1C9 (Carbon)
  2. Add an NVIDIA GPU in a Thunderbolt enclosure
  3. Run Steam.
@JakeMoe

JakeMoe commented Oct 2, 2021

I'm getting the same on my Dell Latitude 5520 laptop (i7 iGPU + Nvidia MX450 dGPU) running Gentoo.

@facundoq

facundoq commented Oct 8, 2021

Same thing on an Asus Vivobook (i5-1035G1 + NVIDIA MX350) running Kubuntu 20.04. Relevant parts of the uname output:

5.11.0-37-generic #41~20.04.2-Ubuntu

@whp199

whp199 commented Oct 9, 2021

I'm getting the same on my ThinkPad X1 Extreme Gen 4 laptop (i7-11850H + RTX 3070 Max-Q) running Gentoo. Thousands of these per minute.
dmesg log sample:
[36016.234614] x86/split lock detection: #AC: CJobMgr::m_Work/29551 took a split_lock trap at address: 0xea549163
[36062.856553] x86/split lock detection: #AC: CJobMgr::m_Work/29578 took a split_lock trap at address: 0xea549163
[36063.053937] x86/split lock detection: #AC: CJobMgr::m_Work/29578 took a split_lock trap at address: 0xea549163
[36063.250999] x86/split lock detection: #AC: CJobMgr::m_Work/29578 took a split_lock trap at address: 0xea549163
[36063.447947] x86/split lock detection: #AC: CJobMgr::m_Work/29578 took a split_lock trap at address: 0xea549163
[36063.646125] x86/split lock detection: #AC: CJobMgr::m_Work/29578 took a split_lock trap at address: 0xea549163
[36063.843318] x86/split lock detection: #AC: CJobMgr::m_Work/29578 took a split_lock trap at address: 0xea549163
[36064.040236] x86/split lock detection: #AC: CJobMgr::m_Work/29578 took a split_lock trap at address: 0xea549163
[36064.237212] x86/split lock detection: #AC: CJobMgr::m_Work/29578 took a split_lock trap at address: 0xea549163
[36064.434420] x86/split lock detection: #AC: CJobMgr::m_Work/29578 took a split_lock trap at address: 0xea549163
[36064.632189] x86/split lock detection: #AC: CJobMgr::m_Work/29578 took a split_lock trap at address: 0xea549163
[36071.445435] split_lock_warn: 1 callbacks suppressed
[36071.445437] x86/split lock detection: #AC: CJobMgr::m_Work/29551 took a split_lock trap at address: 0xea549163
[36109.158387] x86/split lock detection: #AC: CJobMgr::m_Work/29578 took a split_lock trap at address: 0xea549163
[36109.207848] x86/split lock detection: #AC: CJobMgr::m_Work/29550 took a split_lock trap at address: 0xea549163
[36109.306497] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36109.355958] x86/split lock detection: #AC: CJobMgr::m_Work/29578 took a split_lock trap at address: 0xea549163
[36109.405409] x86/split lock detection: #AC: CJobMgr::m_Work/29550 took a split_lock trap at address: 0xea549163
[36109.504461] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36109.553897] x86/split lock detection: #AC: CJobMgr::m_Work/29578 took a split_lock trap at address: 0xea549163
[36109.603366] x86/split lock detection: #AC: CJobMgr::m_Work/29550 took a split_lock trap at address: 0xea549163
[36109.702930] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36160.615124] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36160.812676] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36161.009782] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36161.207078] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36161.404459] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36161.602186] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36161.800162] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36161.998170] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36177.196524] x86/split lock detection: #AC: CJobMgr::m_Work/29551 took a split_lock trap at address: 0xea549163
[36182.346968] x86/split lock detection: #AC: CJobMgr::m_Work/29551 took a split_lock trap at address: 0xea549163
[36207.067028] x86/split lock detection: #AC: CJobMgr::m_Work/29551 took a split_lock trap at address: 0xea549163
[36207.315238] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36207.512238] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36207.710316] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36207.908317] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36208.105540] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36208.303372] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36208.500928] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36208.699036] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163
[36208.896665] x86/split lock detection: #AC: CJobMgr::m_Work/34434 took a split_lock trap at address: 0xea549163

uname -a:
Linux x1e 5.14.10-gentoo-ligma #1 SMP Fri Oct 8 20:38:34 CDT 2021 x86_64 11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz GenuineIntel GNU/Linux

Latest version of steam client:

Steam client version (build number or date): Oct 6th, 2021 at 18:51:29 

Steam API: v020 Steam Package Versions: 1633666232
Distribution (e.g. Ubuntu): Gentoo
Opted into Steam client beta?: [Yes/No] No.
Have you checked for system updates?: [Yes/No] Yes.

CPU: model name : 11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz
GPU: VGA compatible controller: NVIDIA Corporation GA104M [GeForce RTX 3070 Mobile / Max-Q] (rev a1)
Driver: 470.63.01

@Lepr0

Lepr0 commented Nov 9, 2021

Same logs

[ 5656.175277] x86/split lock detection: #AC: CHTTPClientThre/21899 took a split_lock trap at address: 0x566a9e23
[ 5659.223628] x86/split lock detection: #AC: CHTTPClientThre/22014 took a split_lock trap at address: 0xea44b163
[ 5659.245241] x86/split lock detection: #AC: CJobMgr::m_Work/22000 took a split_lock trap at address: 0xea44b163

on my

Host: ZenBook UX325EA
OS: Ubuntu 21.10 x86_64
Kernel: 5.13.0-20-generic
CPU: 11th Gen Intel i7-1165G7 (8) @ 4.700GHz
GPU: Intel TigerLake-LP GT2 [Iris Xe Graphics]
Memory: 5427MiB / 15699MiB

Once, the system hard-froze right after these messages. Not good.

@paboum

paboum commented Nov 11, 2021

Happens for me too:
Nov 11 03:49:25 ___ kernel: x86/split lock detection: #AC: CJobMgr::m_Work/32423 took a split_lock trap at address: 0xf23f7163
Nov 11 03:49:25 ___ kernel: x86/split lock detection: #AC: CJobMgr::m_Work/32402 took a split_lock trap at address: 0xf23f7163
Nov 11 03:49:25 ___ kernel: x86/split lock detection: #AC: CJobMgr::m_Work/32423 took a split_lock trap at address: 0xf23f7163
Nov 11 03:49:25 ___ kernel: x86/split lock detection: #AC: CJobMgr::m_Work/32423 took a split_lock trap at address: 0xf23f7163
Explained here: https://lwn.net/Articles/786239/

Once more, Valve programmers caught not knowing how to code.

To stop the kernel from killing those suboptimal processes, I added clearcpuid=split_lock_detect to my GRUB config.

@kakra

kakra commented Dec 8, 2021

I'm seeing this, too. Steam Beta as of 2021-12-08:

[84130.796009] x86/split lock detection: #AC: CHTTPClientThre/3752 took a split_lock trap at address: 0xf20bb273
[84130.944128] x86/split lock detection: #AC: CJobMgr::m_Work/3261 took a split_lock trap at address: 0xf20bb273
[84133.751362] x86/split lock detection: #AC: CJobMgr::m_Work/3477 took a split_lock trap at address: 0xf20bb273
[84135.721442] x86/split lock detection: #AC: CJobMgr::m_Work/3477 took a split_lock trap at address: 0xf20bb273
[84135.967341] x86/split lock detection: #AC: CJobMgr::m_Work/3477 took a split_lock trap at address: 0xf20bb273
[84136.460236] x86/split lock detection: #AC: CJobMgr::m_Work/3477 took a split_lock trap at address: 0xf20bb273
[84136.706968] x86/split lock detection: #AC: CJobMgr::m_Work/3477 took a split_lock trap at address: 0xf20bb273
[84137.693282] x86/split lock detection: #AC: CJobMgr::m_Work/3477 took a split_lock trap at address: 0xf20bb273
[84138.186812] x86/split lock detection: #AC: CJobMgr::m_Work/3477 took a split_lock trap at address: 0xf20bb273
[84139.765611] x86/split lock detection: #AC: CJobMgr::m_Work/3352 took a split_lock trap at address: 0xf20bb273
[84141.196171] x86/split lock detection: #AC: CJobMgr::m_Work/3683 took a split_lock trap at address: 0xf20bb273
[84142.527785] x86/split lock detection: #AC: CHTTPClientThre/3270 took a split_lock trap at address: 0xf20bb273
[84144.648846] x86/split lock detection: #AC: CJobMgr::m_Work/3683 took a split_lock trap at address: 0xf20bb273
[84148.350382] x86/split lock detection: #AC: CJobMgr::m_Work/3683 took a split_lock trap at address: 0xf20bb273
[84149.483809] x86/split lock detection: #AC: CJobMgr::m_Work/3262 took a split_lock trap at address: 0xf20bb273

This problem wasn't present with my old Ivybridge system but since I upgraded to Alder Lake, I'm seeing this.

Split lock detection is thus probably a feature of modern CPUs, and NOT a problem hitting ONLY modern CPUs: the problem is also present on older CPUs. According to kernel commits, this detection was added to find bad process behavior that negatively affects the performance of the whole system (even unrelated processes). I thus believe Valve should fix this, especially since Steam is about gaming, and gaming is about performance. The linked LWN article (#8003 (comment)) indicates that fixing this may be as easy as recompiling with properly adjusted alignment.

However, I don't think processes are killed by the kernel, as suggested in the previous comment. In my logs, I see repeating PID patterns, which indicates that the same threads take the trap over and over again; the kernel would not recycle PIDs in a way that would explain this. The LWN article also says that killing offending processes can be one way to address the problem. Maybe that becomes the default in the future, so this should be fixed sooner rather than later.
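
The repeating-PID observation can be checked mechanically. The following is only a sketch, assuming the log lines have exactly the format quoted in this thread; the function name `traps_per_thread` is made up for illustration.

```shell
# Count split lock traps per thread (comm/PID) from kernel log text on stdin.
# Example (may need root): dmesg -t | traps_per_thread
traps_per_thread() {
    # Keep only "comm/PID" from each matching line, then count duplicates.
    sed -n 's|.*#AC: \(.*/[0-9][0-9]*\) took a split_lock trap.*|\1|p' |
        sort | uniq -c | sort -rn
}
```

If the same comm/PID pairs keep accumulating counts over a long uptime, those threads kept running after each trap instead of being killed.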

         -/oyddmdhs+:.                kakra@jupiter
     -odNMMMMMMMMNNmhy+-`             -------------
   -yNMMMMMMMMMMMNNNmmdhy+-           OS: Gentoo Base System release 2.7 x86_64
 `omMMMMMMMMMMMMNmdmmmmddhhy/`        Host: Z690 Pro RS
 omMMMMMMMMMMMNhhyyyohmdddhhhdo`      Kernel: 5.15.6-gentoo
.ydMMMMMMMMMMdhs++so/smdddhhhhdm+`    Uptime: 23 hours, 43 mins
 oyhdmNMMMMMMMNdyooydmddddhhhhyhNd.   Packages: 2046 (emerge), 13 (flatpak)
  :oyhhdNNMMMMMMMNNNmmdddhhhhhyymMh   Shell: fish 3.1.2
    .:+sydNMMMMMNNNmmmdddhhhhhhmMmy   Resolution: 1920x1080, 3840x2160, 3840x2160
       /mMMMMMMNNNmmmdddhhhhhmMNhs:   DE: Plasma 5.23.4
    `oNMMMMMMMNNNmmmddddhhdmMNhs+`    WM: KWin
  `sNMMMMMMMMNNNmmmdddddmNMmhs/.      Theme: Breeze Light [Plasma], Breeze [GTK2/3]
 /NMMMMMMMMNNNNmmmdddmNMNdso:`        Icons: [Plasma], breeze [GTK2/3]
+MMMMMMMNNNNNmmmmdmNMNdso/-           Terminal: konsole
yMMNNNNNNNmmmmmNNMmhs+/-`             Terminal Font: Fantasque Sans Mono 14
/hMMNNNNNNNNMNdhs++/-`                CPU: 12th Gen Intel i7-12700K (20) @ 6.300GHz
`/ohdmmddhys+++/:.`                   GPU: NVIDIA GeForce GTX 1660 Ti
  `-//////:--.                        Memory: 8330MiB / 31885MiB

@reanimus

reanimus commented Dec 20, 2021

I'm seeing this on my Framework laptop as well, using Arch Linux.

                   -`                    animus@Xenon 
                  .o+`                   ------------ 
                 `ooo/                   OS: Arch Linux x86_64 
                `+oooo:                  Host: Framework FRANBMCP0C 
               `+oooooo:                 Kernel: 5.16.0-rc4-next-20211210-1-next-git-06579-gea922272cbe5 
               -+oooooo+:                Uptime: 6 days, 13 hours, 11 mins 
             `/:-:++oooo+:               Packages: 1065 (pacman) 
            `/++++/+++++++:              Shell: bash 5.1.12 
           `/++++++++++++++:             Resolution: 2256x1504 
          `/+++ooooooooooooo/`           DE: GNOME 41.2 (Wayland) 
         ./ooosssso++osssssso+`          WM: Mutter 
        .oossssso-````/ossssss+`         WM Theme: Arc-Dark 
       -osssssso.      :ssssssso.        Theme: WhiteSur-dark [GTK2/3] 
      :osssssss/        osssso+++.       Icons: Papirus-Dark [GTK2/3] 
     /ossssssss/        +ssssooo/-       Terminal: gnome-terminal 
   `/ossssso+/:-        -:/+osssso+-     CPU: 11th Gen Intel i7-1185G7 (8) @ 4.800GHz 
  `+sso+:-`                 `.-/+oso:    GPU: Intel TigerLake-LP GT2 [Iris Xe Graphics] 
 `++:.                           `-/+/   Memory: 8931MiB / 64098MiB 
 .`                                 `/

dmesg:

...
[29941.583864] x86/split lock detection: #AC: CHTTPClientThre/119074 took a split_lock trap at address: 0xe6d2f273
[29941.584627] x86/split lock detection: #AC: CHTTPClientThre/119074 took a split_lock trap at address: 0xe6d2f273
[29941.586852] x86/split lock detection: #AC: CHTTPClientThre/119074 took a split_lock trap at address: 0xe6d2f273
[29946.171847] split_lock_warn: 85 callbacks suppressed
[29946.171851] x86/split lock detection: #AC: CHTTPClientThre/119313 took a split_lock trap at address: 0xe6d2f273
[29946.575989] x86/split lock detection: #AC: CHTTPClientThre/119313 took a split_lock trap at address: 0xe6d2f273
[29946.628420] x86/split lock detection: #AC: CHTTPClientThre/119313 took a split_lock trap at address: 0xe6d2f273
...

@Shished

Shished commented Feb 3, 2022

I'm getting this on Arch with Intel Alder Lake CPU. Using Steam beta.

@bxkx

bxkx commented Apr 16, 2022

Also started happening for me quite recently... super annoying. Spams for almost the entire duration Steam is running.

@kakra

kakra commented Apr 16, 2022

@kisak-valve This affects all distros with a kernel that has split lock detection enabled, on CPUs that can detect and report this situation to the kernel (although earlier CPUs might be affected as well). Future kernels will eventually kill such processes; currently it's a warning only. Most of these logs come from the Steam client processes themselves, so please fix it. I'm also seeing this with some Uplay titles, but those messages totally disappear in the log noise generated by the Steam client.

I'm currently using split_lock_detect=off as a kernel parameter to stop the log spamming, but this isn't really helpful: the message is there to point to a situation that degrades the performance of the whole CPU.
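
To confirm which mode a kernel was actually booted with, the command line can be inspected. A small sketch follows; the function name `split_lock_mode` is hypothetical, and `unset` here only means the parameter is absent, so the kernel's built-in default applies.

```shell
# Print the value of split_lock_detect= from a kernel command line.
# $1: command line text; defaults to the running kernel's /proc/cmdline.
split_lock_mode() {
    cmdline=${1:-$(cat /proc/cmdline)}
    # One argument per line; if the parameter is repeated, report the last one.
    mode=$(printf '%s\n' "$cmdline" | tr ' ' '\n' |
        sed -n 's/^split_lock_detect=//p' | tail -n 1)
    echo "${mode:-unset}"
}
```

Calling `split_lock_mode` without arguments reads the running system; passing a string makes it easy to check a boot entry before rebooting.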

@t-8ch

t-8ch commented Jun 1, 2022

Beginning with 5.19 the kernel will "make life miserable for split lockers".

@endrift

endrift commented Jul 18, 2022

Getting this on my machine too, and it seems to periodically cause various other drivers to time out operations, including, but not limited to:

  • SD cards
  • Wi-Fi
  • xHCI

This is bad enough that it could cause data corruption in various cases, potentially on the Steam Deck too.

It's easy to trigger this repeatedly just by using Steam Remote Play, which will sometimes even crash as a result.

In fact, it always crashes on exit (but it's silent outside of dmesg):

[197436.256887] streaming_clien[2629290]: segfault at 55b709e5c ip 000055b70825b744 sp 00007f92605f5d50 error 4 in streaming_client[55b707f38000+b86000]

Always in the same place too, modulo ASLR

@Plagman
Member

Plagman commented Jul 21, 2022

@Plagman Plagman self-assigned this Jul 21, 2022
@kakra

kakra commented Jul 21, 2022

> Can you re-test on the Steam Client Beta?

I can still see it but it seems much less noisy:

[  112.572659] x86/split lock detection: #AC: CHTTPClientThre/5662 took a split_lock trap at address: 0xf21846d3

Only one occurrence so far. Steam Client Beta 2022-07-21

@Plagman
Member

Plagman commented Jul 21, 2022

Thanks, it's likely that some uses were missed. Will keep looking for them.

@kakra

kakra commented Jul 21, 2022

Some Assassin's Creed games also throw that message, but I'm not sure whether Wine or the Steam client could do something about it. In light of future kernels killing such processes, how could that be handled? I'm not sure Ubisoft would be interested in fixing such things; it's probably a non-issue under Windows?

@Plagman
Member

Plagman commented Jul 21, 2022

The split lock can come from the game process even if it's caused by Steam (eg. overlay locking primitives), so I was hoping that reports of game instances would go away after this fix. If some pre-existing games indeed rely on split locks in their own code, I think we'll have to discuss the situation with upstream further and alert them to the fact there are pre-existing applications that are not under active maintenance that Linux desktop users still want to run. This might be a case of desktop-oriented distributions having to disable the mitigations by default.

@kakra

kakra commented Jul 21, 2022

Okay, so I'll retest the games I've seen logging this in the past - and report back here? Or per game?

@kakra

kakra commented Jul 25, 2022

Here are two other occurrences, one with different address, one with different thread name:

[272502.307107] x86/split lock detection: #AC: CIPCServer::Thr/5510 took a split_lock trap at address: 0xf218472d
[272502.330099] x86/split lock detection: #AC: CJobMgr::m_Work/5656 took a split_lock trap at address: 0xf21846d3

@kakra

kakra commented Aug 2, 2022

These seem to be the only messages left after a reboot, HTH:

[   26.485020] x86/split lock detection: #AC: CHTTPClientThre/3024 took a split_lock trap at address: 0x565fc3e3
[   49.462890] x86/split lock detection: #AC: CHTTPClientThre/3576 took a split_lock trap at address: 0xf1e496d3
[   49.462923] x86/split lock detection: #AC: CHTTPClientThre/3575 took a split_lock trap at address: 0xf1e496d3

No game was started, the client just booted (and probably did its thing with fossilize and maybe spawning some prefix updates or whatever spawns wine processes after reboot).

@ljrk0

ljrk0 commented Nov 1, 2022

In Linux 6.2, the kernel will actively punish split locks; disabling this behavior requires setting the sysctl kernel.split_lock_mitigate=0. Steam fixing this would be highly appreciated.

https://lwn.net/Articles/911219/
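
For anyone wanting to see whether the mitigation is active on their system, here is a rough sketch. It assumes the sysctl is exposed at /proc/sys/kernel/split_lock_mitigate (the usual mapping for kernel.* sysctls) and that 1 means the mitigation is on; the function name is invented for this example.

```shell
# Report the state of the split lock mitigation sysctl.
# $1: file to read; defaults to the procfs path of kernel.split_lock_mitigate.
split_lock_mitigate_status() {
    file=${1:-/proc/sys/kernel/split_lock_mitigate}
    if [ -r "$file" ]; then
        case $(cat "$file") in
            0) echo "mitigation off" ;;
            1) echo "mitigation on" ;;
            *) echo "unknown" ;;
        esac
    else
        # Older kernels (or non-x86 systems) do not have this sysctl.
        echo "not available"
    fi
}
```

At runtime, `sysctl -w kernel.split_lock_mitigate=0` (as root) changes the value; a line `kernel.split_lock_mitigate = 0` in a file under /etc/sysctl.d/ is the usual way to make a sysctl setting persistent.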

@HBRJZ

HBRJZ commented Nov 17, 2022

While the amount has gotten less, these still happen as of today:

x86/split lock detection: #AC: vulkandriverque/12320 took a split_lock trap at address: 0xf6cf9c47
x86/split lock detection: #AC: CHTTPClientThre/43404 took a split_lock trap at address: 0xe86df6d3
x86/split lock detection: #AC: CIPCServer::Thr/45705 took a split_lock trap at address: 0xe860b72d
x86/split lock detection: #AC: CJobMgr::m_Work/45708 took a split_lock trap at address: 0xe860b6d3
x86/split lock detection: #AC: ThreadedValidat/46251 took a split_lock trap at address: 0xe860b6d3
x86/split lock detection: #AC: CSystemManager:/11895 took a split_lock trap at address: 0xe872a19f

System information

  • Steam client version (build number or date): 1668654564
  • Distribution (e.g. Ubuntu): EndeavourOS
  • Opted into Steam client beta?: No
  • Have you checked for system updates?: Yes

System information

  • Operating System: EndeavourOS
  • KDE Plasma Version: 5.26.3
  • KDE Frameworks Version: 5.99.0
  • Qt Version: 5.15.7
  • Kernel Version: 6.0.8-arch1-1 (64-bit)
  • Graphics Platform: Wayland
  • Processors: 20 × 12th Gen Intel® Core™ i7-12700K
  • Memory: 31,1 GiB of RAM
  • Graphics Processor: AMD Radeon RX 6800 XT

@kakra

kakra commented Nov 17, 2022

Actually, since the introduction of the new Big Picture mode into the client (although I don't use it), the spamming has increased again. But this is only one part of the problem: active punishing by the kernel will probably just throttle the Steam client itself, which shouldn't be that big of an issue (unless the split locks are also in the API code that games use).

The bigger problem is the games themselves, which Steam cannot do much about. Unless Microsoft introduces a similar way of punishing split locks in Windows, game devs won't fix it. And even if they did, what about the older/legacy games? Gamers and single-user desktops should probably just use kernel.split_lock_mitigate=0.

The punishing is about preventing one user's process from slowing down the processes of other users. This is a non-issue on single-user systems, such as a gaming desktop: the game mostly slows itself down by using split locks, and it's built around this performance characteristic. We should just ensure that other processes running on the system don't introduce additional performance costs, and that's why the Steam client should avoid split locks as much as possible.

@HBRJZ

HBRJZ commented Nov 20, 2022

And one more:

x86/split lock detection: #AC: CNet Encrypt:0/13152 took a split_lock trap at address: 0xe867319f

  • Steam client version (build number or date): 1668654564
  • Distribution (e.g. Ubuntu): EndeavourOS
  • Opted into Steam client beta?: No
  • Have you checked for system updates?: Yes

@luisalvarado

Got this one today. When Witcher 3 crashed, it showed this:

[ 1771.344713] x86/split lock detection: #AC: CSystemManager:/13367 took a split_lock trap at address: 0xf15c619f

And when I was playing Bendy 1, it showed this (but it did not crash, I just pressed ALT+F4):
[ 2309.763555] x86/split lock detection: #AC: Bendy and the I/16798 took a split_lock trap at address: 0x3f40f64

@HBRJZ

HBRJZ commented Nov 29, 2022

And some more right after starting Steam, it downloaded some shaders I guess:

x86/split lock detection: #AC: CContentUpdateC/11219 took a split_lock trap at address: 0xee65019f
x86/split lock detection: #AC: CContentUpdateC/11220 took a split_lock trap at address: 0xee65019f
x86/split lock detection: #AC: CContentUpdateC/11258 took a split_lock trap at address: 0xee65019f
x86/split lock detection: #AC: CContentUpdateC/11259 took a split_lock trap at address: 0xee65019f
x86/split lock detection: #AC: CContentUpdateC/11261 took a split_lock trap at address: 0xee65019f
x86/split lock detection: #AC: CContentUpdateC/11262 took a split_lock trap at address: 0xee65019f
x86/split lock detection: #AC: CContentUpdateC/11263 took a split_lock trap at address: 0xee65019f
x86/split lock detection: #AC: CContentUpdateC/11264 took a split_lock trap at address: 0xee65019f
x86/split lock detection: #AC: CContentUpdateC/11266 took a split_lock trap at address: 0xee65019f

@kakra

kakra commented Dec 17, 2022

I'm currently seeing mostly this address (but a lot of them):

[499682.137546] x86/split lock detection: #AC: CJobMgr::m_Work/1429853 took a split_lock trap at address: 0xf1794f0f
[499682.186676] x86/split lock detection: #AC: CJobMgr::m_Work/1442500 took a split_lock trap at address: 0xf1794f0f
[499692.555101] x86/split lock detection: #AC: CJobMgr::m_Work/1442500 took a split_lock trap at address: 0xf1794f0f
[499692.703426] x86/split lock detection: #AC: CJobMgr::m_Work/1429612 took a split_lock trap at address: 0xf1794f0f

Steam package version: 1671133406

It looks like these are the only ones left currently, at least for an idle client:

# dmesg -t|grep "split lock"|sed 's#/[0-9]\+#/PID#'|sort -u
x86/split lock detection: #AC: CJobMgr::m_Work/PID took a split_lock trap at address: 0xf1794f0f
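
A variant of the one-liner above that also counts how often each thread name and address pair occurs might look like this. It is only a sketch, assuming the log format shown in this thread; the function name `summarize_split_locks` is made up.

```shell
# Summarize split lock traps from kernel log text on stdin:
# one line per (thread name, address) pair, most frequent first.
# Example (may need root): dmesg -t | summarize_split_locks
summarize_split_locks() {
    # Drop the PID so that all threads with the same name are grouped.
    sed -n 's|.*#AC: \(.*\)/[0-9][0-9]* took a split_lock trap at address: \(0x[0-9a-fA-F]*\).*|\1 @ \2|p' |
        sort | uniq -c | sort -rn
}
```

Note that the kernel rate-limits these messages (the "callbacks suppressed" lines), so the counts are a lower bound.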

@luisalvarado

What worked for me (At least for now) was adding to /etc/default/grub the following:

GRUB_CMDLINE_LINUX_DEFAULT="split_lock_detect=off"

Setting split_lock_detect=off stopped the app from crashing or closing abruptly. I also read the following about the 6.2 kernel: https://www.phoronix.com/news/Linux-Splitlock-Hurts-Gaming
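
Note that the line as written replaces whatever options were already in GRUB_CMDLINE_LINUX_DEFAULT (often `quiet splash`). A sketch of appending the parameter instead is shown here; the helper name `append_grub_param` is hypothetical, and it assumes the value is on one line in double quotes.

```shell
# Append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT unless already there.
# Reads the contents of /etc/default/grub on stdin, writes the result to stdout.
append_grub_param() {
    awk -v p="$1" '
        /^GRUB_CMDLINE_LINUX_DEFAULT="/ {
            value = substr($0, index($0, "\"") + 1)   # text after the opening quote
            sub(/"[ \t]*$/, "", value)                 # drop the closing quote
            n = split(value, words, " ")
            present = 0
            for (i = 1; i <= n; i++) if (words[i] == p) present = 1
            if (!present) value = (value == "") ? p : (value " " p)
            print "GRUB_CMDLINE_LINUX_DEFAULT=\"" value "\""
            next
        }
        { print }                                      # all other lines unchanged
    '
}
```

For example, `append_grub_param split_lock_detect=off < /etc/default/grub` prints the edited file for review. After changing the real file, the GRUB configuration still has to be regenerated (`update-grub` on Ubuntu-based systems).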

@kakra

kakra commented Dec 18, 2022

> What worked for me (At least for now) was adding to /etc/default/grub the following:

Yes and no: that successfully prevents the kernel from complaining, adding any additional performance penalties, or even killing a process. But it also silently ignores the hardware-based performance hit that comes with that incident. This is not about silencing a kernel message, it's about removing the CPU-wide performance hit that comes with that situation by avoiding it in the code causing it (Steam in this case).

Silencing the message does not prevent the performance hit that the message actually tries to point at. Kernel 6.2 does add an artificial performance penalty for the process causing the situation, in favor of not slowing down other processes in the system. That is bad for games, which usually cause this situation, and you can avoid that additional artificial cost by adding the parameter. But you cannot avoid the hardware performance hit that comes with a split lock in the first place.

You actually want Steam to not cause bus locks, which happen when a lock operation crosses a cache line: they will slow down your game or cause micro stutters if Steam does it in the background. It actually affects all processes running in parallel. If a game does it, this is acceptable (because the game is designed around that performance characteristic and it is the only foreground process you care about), but if background processes do that, it will hurt the performance of processes potentially important to you.
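
To make "crosses a cache line" concrete: an access is split when its first and last byte fall into different cache lines. A minimal sketch of that arithmetic follows, assuming a 64-byte cache line; the function name and the sample addresses are invented. As far as I can tell from the kernel source, the address printed in dmesg is the instruction address, not the data address, so the logged values cannot be fed into this check.

```shell
# Succeed (exit 0) if an access of SIZE bytes at ADDR spans two cache lines.
# $1: operand address (decimal or 0x-prefixed hex), $2: operand size in bytes,
# $3: cache line size in bytes (assumed to be 64 if omitted).
crosses_cache_line() {
    addr=$1
    size=$2
    line=${3:-64}
    # Offset within the line plus the size exceeds the line: the access is split.
    [ $(( ($addr % $line) + $size )) -gt "$line" ]
}
```

A 4-byte operand at 0x103e (offset 62) would be split, while the same operand at 0x103c or 0x1040 would not; that is why adjusting alignment is enough to avoid the split lock.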

@luisalvarado

I understand the silencing part (nothing showing up in dmesg), but how do you explain that only when I have that parameter set, I can play, for example, CS:GO, Witcher 3, or Cyberpunk for hours, and without it I don't even last 2 to 5 minutes before a crash happens? The only thing I changed was that. As for the penalty, I would not know; all I was able to check was that with the parameter it does not crash, or at least not for several hours of continuous play. If I remove it, you can be sure I will never get past 5 minutes.

Could there be something else related to kernel 5.19 on Ubuntu 22.10?

@luisalvarado

Also, for the Steam part, I am 200% with you that either the game or Steam should handle it and fix it.

@JulianGro

JulianGro commented Dec 18, 2023

There is one more lock that seems to not be mentioned in here:

x86/split lock detection: #AC: CNet Encrypt:0/17936 took a split_lock trap at address: 0xe732616f

This is right when starting Steam.
Not sure if this is related to Steam temporarily freezing my computer when starting.

Steam version: 1702079146
Steam client build date: Fri, Dec 8, 1:33 UTC -08:00
Steam web build date: Sat, Dec 9, 0:30 UTC -08:00

EDIT:
Actually I ran across more

x86/split lock detection: #AC: IPC:CSteamEngin/17834 took a split_lock trap at address: 0xe73261aa

@vitacell

This comment was marked as abuse.

@JulianGro

@vitacell I fail to see how this fits here. This is an issue tracker for Steam, and split-locks should not be used since they slow the system down significantly.
Valve seems to understand this and has removed almost all their uses for split-locks in Steam as far as I can tell.

@vitacell

vitacell commented Dec 25, 2023

> @vitacell I fail to see how this fits here. This is an issue tracker for Steam, and split-locks should not be used since they slow the system down significantly. Valve seems to understand this and has removed almost all their uses for split-locks in Steam as far as I can tell.

"as far as I can tell"? Valve's Source engine based games are still suffering from that, and they don't care very much about it. And users are left on their own. It's not so hard to add "split_lock_detect=off", yeah. But it's no fun buying new hardware and playing old games at 30fps, then wasting your whole day trying a ton of fixes and workarounds, trying to figure out what is happening. The quick fix is to install Windows, and this is what usually happens.

@kakra

kakra commented Dec 26, 2023

This throttling has actually been introduced because split locks are expensive for performance. This can be a real problem on cloud machines when someone accidentally or on purpose floods the system with split locks. The "fix" by the kernel devs was to throttle the processes causing the split locks, and it was also introduced so that developers fix their badly behaving software. So this is actually a valid and proper fix and does not break user-space. But the problem is that Windows games in particular have this exact bad behavior. So I suggest that desktop- or game-focused distributions should really ship a kernel with the throttling turned off by default.

You shouldn't point at the kernel devs: the kernel never tries to break user-space, and commits that do are usually considered bugs or regressions. But in this case, it prevents a real problem and penalizes the processes causing it, so it is a fix for a performance regression caused by badly behaving processes.

You should rather ask distributions to turn off split-lock detection by default, at least for game-focused kernels, because old games won't be fixed, and current games won't be fixed either because Windows does not penalize split locks. Such a kernel stops penalizing the games, although the CPU still performs badly in split-lock situations. Only the extra penalty on the causing process would be prevented.

And yes, Steam has removed most if not all split-lock uses in the Steam client itself but that doesn't magically fix games that use split-locks.

We can only hope that game devs consider Linux performance (through the Steam Deck probably) and fix their games to not use split-locks. But for this to happen, it's probably better to not ship a kernel with split-lock detection disabled by default. So this is a double-edged sword.

@RobusTetus

Is this being worked on? It takes just opening Steam and letting games update. No need to start any game, and dmesg already reports the following:

[ 1353.685982] x86/split lock detection: #AC: CHTTPClientThre/7356 took a split_lock trap at address: 0xe988b1ef
[ 1359.668839] warning: `ThreadPoolForeg' uses wireless extensions which will stop working for Wi-Fi 7 hardware; use nl80211
[ 1375.266097] x86/split lock detection: #AC: CHTTPClientThre/7673 took a split_lock trap at address: 0xe988b1ef
[ 1379.850865] x86/split lock detection: #AC: CHTTPClientThre/7772 took a split_lock trap at address: 0xe988b1ef
[ 1473.812157] x86/split lock detection: #AC: CHTTPClientThre/8044 took a split_lock trap at address: 0x56646d1f
[ 1621.113060] x86/split lock detection: #AC: IPC:CSteamEngin/7341 took a split_lock trap at address: 0xe988b22a

This is with Steam in Flatpak, but it does not matter how Steam is installed (deb, rpm, Flatpak).

@mikkorantalainen

Is this the cause of the really slow "Validating" process for my games? I have been trying to debug what Steam is doing while it shows "Validating", because it doesn't seem to be using CPU or causing IO load, but the process is still really slow. Some kind of lock contention would definitely explain the slowness.

@kakra

kakra commented Dec 11, 2024

There are split locks in components of the Steam client itself. Newer kernels "punish" processes whose locked operations cross CPU cache line boundaries (which is what causes a split lock) by pausing those processes for a few ms. The background is that on a server or VM host, non-privileged or isolated processes could maliciously cause a major performance slowdown by spamming the CPU with split locks. If the kernel forces such processes to pause, the performance penalty is mostly gone for other processes; only the causing process makes slow progress.

That said, you don't need that on a single user desktop system, or an otherwise trustworthy environment.

Open /etc/kernel/cmdline and add split_lock_detect=off to the end. If the file does not exist yet, copy the contents of /proc/cmdline first, remove any systemd.machine_id values, and add the split lock value instead. This makes the change persist across updates.

Then recreate your boot files, e.g. for grub, systemd-boot, or initramfs. Consult your distro documentation on how to do that.

If even /proc/cmdline does not exist, you can also edit your boot menu entry directly. Your distro may also have a different way of modifying the kernel cmdline; ask your distro how to do that.
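
The "copy /proc/cmdline, drop systemd.machine_id, add the split lock value" step could be scripted roughly like this. It is a sketch only: the function name `build_kernel_cmdline` is made up, and other entries copied from /proc/cmdline may still need manual cleanup depending on the boot loader.

```shell
# Build a new kernel command line from an existing one on stdin:
# drop systemd.machine_id= and any old split_lock_detect= entries,
# then append split_lock_detect=off. The result is printed on one line.
build_kernel_cmdline() {
    tr -s ' ' '\n' |
        grep -v -e '^systemd\.machine_id=' -e '^split_lock_detect=' -e '^$' |
        { cat; echo 'split_lock_detect=off'; } |
        paste -s -d ' ' -
}
```

Running `build_kernel_cmdline < /proc/cmdline` prints a candidate line to review before writing it to /etc/kernel/cmdline.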

With this change, the artificial split lock penalty will no longer be used. But your CPU still suffers from the situation and cannot reach full multi-core bandwidth. This setting does not change the underlying problem for which the kernel setting has been introduced in the first place. It is a band-aid to make developers fix their programs.
