
Update 24/05/13: Some tournaments between SFNNv6.4_240509_MC_armv8 and Stockfish_dev.

Pre-release
@Joachim26 Joachim26 released this 12 Mar 12:05
· 426 commits to Main_SFNNv6 since this release
12d3079

Tournaments between SFNNv6.4_240509_MC_armv8 and SF_dev_240505 (aka SFNNv9.3_240509) were performed on Android (Snapdragon 662). They show convincingly that the first guess of Smallnet Threshold = 1750 is good enough to beat the monster-dimension 3072 Stockfish by up to 150 Elo 😂.
The monster-dimensional net seems to be constantly stuck somewhere between accumulator, cache, and RAM...
Seriously again: the current tri-net code probably performs worse on Android than the two-net code, for two reasons:

  1. The L1=256 nn-90xxxxxxxx.nnue v4 net works very well on Android but hardly at all on Windows. In SFNNv6.4 on Android, the v4 net therefore takes over the role that the Mediumnet plays on Windows and is also fast enough to replace the v3 net.
  2. The current code does NOT support Finny tables for the Mediumnet, so the Mediumnet is clearly too slow. The SF devs are still constantly tinkering with this code, so perhaps the problem will resolve itself without my involvement.

For the time being I am therefore leaving the tri-net aside.

Note: SFNNv9.3_240509 with Smallnet Threshold = 0 is SFnps_240509 or, 
in other words, SF_dev_240505. Speed is probably some % higher than 
the official pre-build.
SFNNv6.4_240509 played all five matches with Smallnet Threshold = 1750.
--------------------------------------------------
TC: 10+0.1s  Concurrency: 6
Score of SFNNv6.4_240509 vs SFNNv9.3_240509: 330 - 87 - 183 [] 600
Elo difference: 149.26 +/- 24.26, LOS: 100.00 %, DrawRatio: 30.50 %
Ptnml:        WW     WD  DD/WL     LD     LL
Distr:        83    108     84     19      6
--------------------------------------------------
TC: 10+0.1s  Concurrency: 3
Score of SFNNv6.4_240509 vs SFNNv9.3_240509: 261 - 108 - 231 [] 600
Elo difference: 90.59 +/- 22.05, LOS: 100.00 %, DrawRatio: 38.50 %
Ptnml:        WW     WD  DD/WL     LD     LL
Distr:        45    107    108     36      4
--------------------------------------------------
TC: 25+0.25s  Concurrency: 4
Score of SFNNv6.4_240509 vs SFNNv9.3_240509: 178 - 97 - 225 [] 500
Elo difference: 56.78 +/- 22.64, LOS: 100.00 %, DrawRatio: 45.00 %
Ptnml:        WW     WD  DD/WL     LD     LL
Distr:        18     89    102     38      3
--------------------------------------------------
TC: 60+0.6s  Concurrency: 4
Score of SFNNv6.4_240509 vs SFNNv9.3_240509: 97 - 52 - 151 [] 300
Elo difference: 52.51 +/- 27.72, LOS: 99.99 %, DrawRatio: 50.33 %
Ptnml:        WW     WD  DD/WL     LD     LL
Distr:         9     48     73     19      1
--------------------------------------------------
TC: 180+1s  Concurrency: 4
Score of SFNNv6.4_240509 vs SFNNv9.3_240509: 45 - 32 - 73 [] 150
Elo difference: 30.19 +/- 39.97, LOS: 93.08 %, DrawRatio: 48.67 %
Ptnml:        WW     WD  DD/WL     LD     LL
Distr:         0     27     34     14      0
--------------------------------------------------
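
The Elo differences and draw ratios in the result blocks above follow directly from the win/loss/draw counts. A minimal sketch, assuming the usual logistic Elo model (the function names are made up for illustration and are not taken from any tournament manager):

```python
import math

def elo_from_wld(wins: int, losses: int, draws: int) -> float:
    """Elo difference implied by a match score under the logistic Elo model."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games  # fraction of points scored
    # Note: undefined for a 0 % or 100 % score.
    return -400.0 * math.log10(1.0 / score - 1.0)

def draw_ratio(wins: int, losses: int, draws: int) -> float:
    """Share of drawn games."""
    return draws / (wins + losses + draws)

# First match above: 330 - 87 - 183
print(f"Elo difference: {elo_from_wld(330, 87, 183):.2f}")     # 149.26
print(f"DrawRatio: {100 * draw_ratio(330, 87, 183):.2f} %")    # 30.50 %
```

The same function gives about 30.19 for the 45 - 32 - 73 match at TC 180+1s.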

24/05/09

SFNNv6.4_240509_MC_armv8, SFNNv9.4_240509_MC_armv8, and SFNNv9.3_240509_MC_armv8 are uploaded.

Note that tests will be performed to find good Smallnet Threshold values. While for the first two engines the current default value of 1750 may not be completely off, for the last (v9.3) engine the value should be set to 0! This engine is then just SFnps (or Stockfish). As I have mentioned earlier, the Smallnet Threshold is hardware dependent and corresponds to the Mediumnet Threshold of the tri-net Windows engines.

Benchmarks of the three uploaded modern builds of SFnps, SFNNv9.5.3, and SFNNv9.6.3:

SFDNnps231214v6
The identical "signatures" of the three engines, which are actually the number of nodes searched in the benchmark, show that the two tri-net engines play exactly like SFnps or Stockfish master. Of course, that is only the case because Mediumnet Threshold = 0 is the default value from now on. The measured nps values are likewise nearly identical. From about TC=120+1.2s at the latest, I would play the two tri-nets only with MnTh = 0. This has changed since the Finny tables, because apart from the v9 Bignet no other net size is, officially, Finny-accelerated at the moment. So for shorter TCs up to at most TC=120+1.2s, perhaps set MnTh to 500 ms (best to test this yourself, which I have not yet done in practice); the old value of 1200 is now much too high! I will have to see whether the coming updates support all L1=2^x nets; if so, I will of course update, and if not, I may just drop the whole thing and do something more interesting.
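
Comparing signatures by hand gets tedious with several builds. A minimal sketch of an automated check, assuming the bench summary contains lines of the form `Nodes searched : N` and `Nodes/second : N` as Stockfish's `bench` command prints them; the helper names and the sample numbers are invented for illustration only:

```python
import re

def bench_summary(output: str) -> dict:
    """Extract the signature (nodes searched) and nps from bench output."""
    nodes = re.search(r"Nodes searched\s*:\s*(\d+)", output)
    nps = re.search(r"Nodes/second\s*:\s*(\d+)", output)
    if nodes is None or nps is None:
        raise ValueError("bench summary lines not found")
    return {"signature": int(nodes.group(1)), "nps": int(nps.group(1))}

def same_signature(*outputs: str) -> bool:
    """True if all bench outputs report the same node count."""
    return len({bench_summary(o)["signature"] for o in outputs}) == 1

# Invented sample outputs, NOT real measurements.
sample_a = "Total time (ms) : 2000\nNodes searched  : 1000000\nNodes/second    : 500000\n"
sample_b = "Total time (ms) : 2040\nNodes searched  : 1000000\nNodes/second    : 490196\n"
print(same_signature(sample_a, sample_b))  # True
```

Identical signatures only show that the engines search the same tree in the benchmark; the nps values still have to be compared separately.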
Oh, by the way, I am still relatively optimistic that my proposal of a second, smaller and faster net could work. NNs, too, store information (knowledge / chess knowledge) in bits, and the number of bits is always limited, so one has to leave out irrelevant knowledge and limit less important knowledge as far as possible. That two separate nets can also be packed into one net with a suitable architecture is trivial in principle; the implementation is not quite so trivial. At least for initial testing, two separate nets are therefore better suited.
One final remark: instead of a smaller and faster net, one can also train a net of the same size and speed exclusively on positions with less material (e.g. 16-men positions) and, during the game, switch to this net from the move on which that material count is reached. Again, this can also happen internally by means of a suitable net architecture. This net needs no knowledge of 32- to 17-men positions, nor is it trained on them; it has seen more relevant positions and has more storage available for what it learned from them. The only obvious drawback is the non-optimal TT state after the one-time net switch. But enough of this now 🥴.

24/04/25

Updating SFNNv9.6.3 to Finny tables surprisingly went relatively smoothly; however, at the moment the engine runs only without accumulator caching for NNv5. I will first wait until the SF devs come up with a solution to the problem for the v3 net.

24/04/24

From now on I will write in German. For me this has only advantages; in particular, my time investment scales much better with the quality of the writing. Maybe I will even open the Discussions tab, where German and Denglish will then be mandatory. After all, there is Google Translate and, above all, Aria, which translates fantastically. Want an example?

Of course I can help you translate the text into English. Here is the translation:

"Can you please translate this question and the following text into English?

I will now (better: "from now on", but without context Aria cannot know that) write in German. It has only advantages for me, especially because my time investment scales much better with the quality of the writing. Maybe I will even open the Discussion tab, where German and Denglish are then mandatory. There is Google translate and above all Aria, which translates fantastically. Need an example?"

Please note that I am an AI-based language-model bot and my translations are based on machine learning. Although I do my best to provide accurate translations, they may sometimes differ from human translations.

If anyone does not know Aria, Google will surely help.

I have just noticed that the text above is not entirely clear, so: I will write only in German from now on, and anyone who does not understand it (but wants to) has to take care of the translation themselves. With ChatGPT, Aria, or whatever 😉.

Overall and in the long run, the switch of language will probably reduce CO2 emissions, and I am thereby contributing to saving the world. That is most, most certainly not meant ironically 😉🤔. But that is enough for today 😁.

24/04/20

Both Windows versions of SFNNv9.5.3_240420 are now released with a new default Mediumnet threshold of 1200. The old value of 300 was just a (dumb and wrong 😂) guess and completely off... For both Windows engines I got warnings from the stupid MS Defender. Just ignore them (as I have done since the 10th false positive) or test the files on Virus Total.
Use 1200 in SFNNv9.5.3_240410, or even run self-play tests with different values to find a good MnTh parameter (finding the optimal one takes tens of thousands of games). In any case, the optimal value of the Mediumnet threshold (MnTh) depends on the hardware (not that much within the same OS) and also on the TC. Both dependencies could be minimized by empirical formulas, but hundreds of thousands of games would be needed 🤔. Thus MnTh is, at the moment, a UCI parameter.
However, with one(!) good MnTh value Stockfish dev can easily be beaten at STC and LTC. Here are the three tests with MnTh=1200 on my mini PC with a Celeron N5095 (modern builds):

Important note: In the following Windows tests SFNNv9.5.3_240414_0 
(identical to SFNN...18/20_0) is playing with Mediumnet threshold=0, 
is thus playing exclusively with the SFNNv9 Bignet and is thus 
perfectly simulating Stockfish dev of that date (same bench!). The 
only difference may be a maximum slowdown of 2% corresponding to about  
only 2 Elo which is much smaller than the following Elo differences.
--------------------------------------------------
TC: 10+0.1s  
Score of SFNNv9.5.3_240418_1200 vs SFNNv9.5.3_240414_0: 283 - 220 - 497 [] 1000
Elo difference: 21.92 +/- 15.26, LOS: 99.75 %, DrawRatio: 49.70 %
Ptnml:        WW     WD  DD/WL     LD     LL
Distr:        15    138    251     87      9
--------------------------------------------------
Finished tournament.

--------------------------------------------------
TC: 25+0.25s
Score of SFNNv9.5.3_240418_1200 vs SFNNv9.5.3_240414_0: 168 - 120 - 312 [] 600
Elo difference: 27.85 +/- 19.25, LOS: 99.77 %, DrawRatio: 52.00 %
Ptnml:        WW     WD  DD/WL     LD     LL
Distr:         5     96    145     50      4
--------------------------------------------------
Finished tournament.

--------------------------------------------------
TC: 60+0.6s
Score of SFNNv9.5.3_240418_1200 vs SFNNv9.5.3_240414_0: 141 - 106 - 353 [] 600
Elo difference: 20.29 +/- 17.81, LOS: 98.70 %, DrawRatio: 58.83 %
Ptnml:        WW     WD  DD/WL     LD     LL
Distr:         4     84    155     57      0
--------------------------------------------------
Started game 603 of 1000 (SFNNv9.5.3_240418_1200 vs SFNNv9.5.3_240414_0)
Finished tournament.
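
The Ptnml/Distr rows count game pairs rather than single games, so they can be used to cross-check the W - L - D line. A minimal sketch, assuming the columns are ordered WW, WD, DD/WL, LD, LL as printed above and that each pair is worth 2, 1.5, 1, 0.5 or 0 points (the function name is made up for illustration):

```python
def score_from_pentanomial(distr):
    """Score fraction from game-pair counts ordered WW, WD, DD/WL, LD, LL."""
    points_per_pair = (2.0, 1.5, 1.0, 0.5, 0.0)
    pairs = sum(distr)
    points = sum(p * n for p, n in zip(points_per_pair, distr))
    return points / (2 * pairs)  # two games per pair

def score_from_wld(wins, losses, draws):
    """Score fraction from single-game results."""
    return (wins + 0.5 * draws) / (wins + losses + draws)

# TC 10+0.1s match above: 283 - 220 - 497, Distr 15 138 251 87 9
print(f"{score_from_pentanomial([15, 138, 251, 87, 9]):.4f}")  # 0.5315
print(f"{score_from_wld(283, 220, 497):.4f}")                  # 0.5315
```

If the two numbers disagree, the pair counts and the game counts do not belong to the same set of games (for example because an unfinished pair was counted in one but not the other).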

Fishtest with avx2 builds should be passed easily. Sooner or later triple net (or one net with an arch with e.g. improved TC scaling) engine tests will be performed there. But it will definitely not be me.
The game-pgns and other files of the three matches can be downloaded here or down below.

24/04/14

Back to Android engines: SFNNv9.5.3_240410A_MC_armv8 was added. The only difference from the non-A version is the changed Mediumnet Threshold, from 300 to 1750. This much higher value is essential for good performance on Android at low TCs like STC. SFNNv9.5.3_240410A_MC_armv8 has beaten SFS16.1 at 10+0.1s, wow:

Important note: in the following 4 matches SFNNv9.5.3_240410 
was playing with Mediumnet Threshold=1750, i.e. 
is a SFNNv9.5.3_240410A with default uci-settings.
And: Stockfish_SFNNv9 is SFNNv9.5.3_240410 with Mediumnet Threshold=0, in other words, a perfect Stockfish dev simulation.
--------------------------------------------------
TC: 10+0.1s  Concurrency: 4
Score of SFNNv9.5.3_240410 vs SFS16.1: 295 - 267 - 438 [] 1000
Elo difference: 9.73 +/- 16.13, LOS: 88.12 %, DrawRatio: 43.80 %
Ptnml:        WW     WD  DD/WL     LD     LL
Distr:        28    127    210    115     20
--------------------------------------------------
TC: 10+0.1s  Concurrency: 6
Score of SFNNv9.5.3_240410 vs SFS16.1: 403 - 457 - 640 [] 1500
Elo difference: -12.51 +/- 13.30, LOS: 3.28 %, DrawRatio: 42.67 %
Ptnml:        WW     WD  DD/WL     LD     LL
Distr:        38    153    316    203     40
--------------------------------------------------
TC: 10+0.1s  Concurrency: 4
Score of Stockfish_SFNNv9 vs SFS16.1: 83 - 293 - 224 [] 600
Elo difference: -126.97 +/- 22.51, LOS: 0.00 %, DrawRatio: 37.33 %
Ptnml:        WW     WD  DD/WL     LD     LL
Distr:         3     26     87    126     58
--------------------------------------------------
TC: 10+0.1s  Concurrency: 4
Score of SFnps16.1 vs SFS16.1: 112 - 253 - 235 [] 600
Elo difference: -83.20 +/- 21.89, LOS: 0.00 %, DrawRatio: 39.17 %
Ptnml:        WW     WD  DD/WL     LD     LL
Distr:         5     44     99    109     43
--------------------------------------------------
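
The LOS (likelihood of superiority) values in these blocks depend only on wins and losses; draws drop out. A minimal sketch using the common normal approximation, which reproduces the figures above to the printed precision (the function name is made up for illustration):

```python
import math

def los(wins: int, losses: int) -> float:
    """Likelihood of superiority under a normal approximation.

    Draws are ignored because they carry no information about
    which of the two engines is the stronger one.
    """
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))

# Concurrency 4 match above: 295 - 267 - 438
print(f"LOS: {100 * los(295, 267):.2f} %")  # 88.12 %
# Concurrency 6 match above: 403 - 457 - 640
print(f"LOS: {100 * los(403, 457):.2f} %")  # 3.28 %
```

An LOS of 88 % is short of the usual 95 % level, which fits the fact that the first match's Elo difference (9.73 +/- 16.13) still includes zero.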

Not many engines are able to beat SFS16.1 at STC; I didn't know a single one until SFNNv9.5.3 appeared.
I repeated this match with concurrency=6, which I normally use to save time. As expected, the smaller and faster L1=256 "v4" net loses less Elo than the L1=1024 SFNNv5 net, and the Elo gap is not only reduced but even changes sign! The weaker play of both engines is mainly due to more throttling at higher concurrency and the need to add 2 economy cores to the 4 power cores of the pool. This results in lower (mean) nps for both engines and favours the faster engine.
For comparison, two more tournaments are presented. These four tournaments, all against SFS16.1, demonstrate that at STC SFNNv9.5.3 plays at least at the level of the "STC king" SFS16.1.
Note that for longer and longer TCs >> ms+ms/100 s (where ms is the Mediumnet Threshold value), SFNNv9.5.3 plays more and more exactly like Stockfish dev (with the new v9 net, but a tiny bit slower, around -1 Elo). At such very long TCs Stockfish dev's strength comes back, and SF with the SFNNv9 net is, even on weaker Android phones, (one of) the strongest engines.
In summary, the first triple-net engine for Android, SFNNv6.5.3, is well on its way to beating anything else on most (not ridiculously high-priced) Android devices over a large TC range. That is one goal of this repo.

24/04/12

@Triton6654 has performed a Windows tour of 22 games at TC=5 min and 4 threads per engine between SFNNv9.5.3_240410 and another 11 recent SF clones. Thanks Triton.
In the meantime Triton has run two more Windows gauntlets with the LittleBlitzer 2.74 GUI:

Games Completed = 22 of 22 (Avg game length = 194.093 sec)
Settings = Gauntlet/256MB/60000ms+600ms/M 1000cp for 5 moves, D 100 moves/
Time = 1195 sec elapsed, 0 sec remaining
 1.  SF-NNv9 240409	11.0/22	1-1-20  (L: m=0 t=0 i=0 a=1)	(D: r=10i=3 f=0 s=1 a=6)	(tpm=1357.9 d=21.37 nps=249426)
 2.  Big-SF 210324	1.0/2	0-0-2  	(L: m=0 t=0 i=0 a=0)	(D: r=1 i=0 f=0 s=0 a=1)	(tpm=1283.3 d=19.79 nps=216050)
 3.  HypnoS 030424	1.0/2	0-0-2  	(L: m=0 t=0 i=0 a=0)	(D: r=1 i=0 f=0 s=0 a=1)	(tpm=1339.2 d=24.40 nps=256178)
 4.  Incognito 8.2	0.5/2	0-1-1  	(L: m=0 t=0 i=0 a=1)	(D: r=0 i=0 f=0 s=0 a=1)	(tpm=1300.9 d=21.01 nps=275749)
 5.  Jigsaw 5.2		1.0/2	0-0-2  	(L: m=0 t=0 i=0 a=0)	(D: r=0 i=1 f=0 s=1 a=0)	(tpm=1282.0 d=24.42 nps=387213)
 6.  Marauders 3.1	1.0/2	0-0-2  	(L: m=0 t=0 i=0 a=0)	(D: r=1 i=1 f=0 s=0 a=0)	(tpm=1400.5 d=20.17 nps=210541)
 7.  Raid 3.5		1.0/2	0-0-2  	(L: m=0 t=0 i=0 a=0)	(D: r=2 i=0 f=0 s=0 a=0)	(tpm=1651.3 d=24.32 nps=146485)
 8.  SF-Cor 1Mar24	1.0/2	0-0-2  	(L: m=0 t=0 i=0 a=0)	(D: r=1 i=0 f=0 s=0 a=1)	(tpm=1310.9 d=24.86 nps=249372)
 9.  SF-120324-3072	1.0/2	0-0-2  	(L: m=0 t=0 i=0 a=0)	(D: r=1 i=0 f=0 s=0 a=1)	(tpm=1219.5 d=22.62 nps=279029)
10.  SF-Dev 20240402	1.0/2	0-0-2  	(L: m=0 t=0 i=0 a=0)	(D: r=1 i=1 f=0 s=0 a=0)	(tpm=1959.0 d=21.97 nps=273184)
11.  SF-NPS 20240408	1.5/2	1-0-1  	(L: m=0 t=0 i=0 a=0)	(D: r=0 i=0 f=0 s=0 a=1)	(tpm=1185.0 d=19.51 nps=355808)
12.  SF-Solista 290324	1.0/2	0-0-2  	(L: m=0 t=0 i=0 a=0)	(D: r=2 i=0 f=0 s=0 a=0)	(tpm=1319.0 d=21.68 nps=333976)

Games Completed = 22 of 22 (Avg game length = 900.875 sec)
Settings = Gauntlet/256MB/600000ms+0ms/M 1000cp for 5 moves, D 100 moves/
Time = 5360 sec elapsed, 0 sec remaining
 1.  SF-NNv9 240409	11.0/22	0-0-22  (L: m=0 t=0 i=0 a=0)	(D: r=17i=1 f=0 s=1 a=3)	(tpm=7810.7  d=31.41 nps=239214)
 2.  Big-SF 210324 	1.0/2	0-0-2  	(L: m=0 t=0 i=0 a=0)	(D: r=1 i=0 f=0 s=0 a=1)	(tpm=6652.0  d=30.47 nps=260106)
 3.  HypnoS 030424	1.0/2	0-0-2  	(L: m=0 t=0 i=0 a=0)	(D: r=2 i=0 f=0 s=0 a=0)	(tpm=10919.9 d=32.20 nps=188501)
 4.  Incognito 8.2   	1.0/2	0-0-2  	(L: m=0 t=0 i=0 a=0)	(D: r=2 i=0 f=0 s=0 a=0)	(tpm=9601.0  d=27.86 nps=209503)
 5.  Jigsaw 5.2      	1.0/2	0-0-2  	(L: m=0 t=0 i=0 a=0)	(D: r=2 i=0 f=0 s=0 a=0)	(tpm=9682.0  d=29.16 nps=236259)
 6.  Marauders 3.1   	1.0/2	0-0-2  	(L: m=0 t=0 i=0 a=0)	(D: r=2 i=0 f=0 s=0 a=0)	(tpm=7998.9  d=31.17 nps=215756)
 7.  Raid 3.5       	1.0/2	0-0-2  	(L: m=0 t=0 i=0 a=0)	(D: r=2 i=0 f=0 s=0 a=0)	(tpm=6319.8  d=30.76 nps=78188)
 8.  SF-Cor 1Mar24  	1.0/2	0-0-2  	(L: m=0 t=0 i=0 a=0)	(D: r=1 i=1 f=0 s=0 a=0)	(tpm=6495.7  d=31.35 nps=299629)
 9.  SF-120324-3072   	1.0/2	0-0-2  	(L: m=0 t=0 i=0 a=0)	(D: r=1 i=0 f=0 s=1 a=0)	(tpm=9431.2  d=31.54 nps=198878)
10.  SF-Dev 20240402  	1.0/2	0-0-2  	(L: m=0 t=0 i=0 a=0)	(D: r=1 i=0 f=0 s=0 a=1)	(tpm=7221.3  d=30.33 nps=306887)
11.  SF-NPS 20240408 	1.0/2	0-0-2  	(L: m=0 t=0 i=0 a=0)	(D: r=2 i=0 f=0 s=0 a=0)	(tpm=7828.6  d=29.22 nps=258303)
12.  SF-Solista 290324 	1.0/2	0-0-2  	(L: m=0 t=0 i=0 a=0)	(D: r=1 i=0 f=0 s=0 a=1)	(tpm=6821.2  d=33.09 nps=532419)

The pgn-files of the three gauntlets can be downloaded here or down below.
Thanks again Triton.
Although Mediumnet threshold=300 is only a first (more or less) intelligent guess, and the ca. 40 search parameters as well as the eval parameters (e.g. net scaling) were optimized only for the larger main net, SFNNv9.5.3 already performs astonishingly well.

A final remark: with a specially trained Mediumnet, Stockfish can also be improved for Windows/Linux, even on TCEC hardware and with TCEC TCs. Train this net only on, e.g., 16-men positions, and switch from the Bignet to the Mediumnet as soon as the root position of the next move has 16 or fewer men. The Mediumnet then doesn't need any knowledge about earlier game phases, can thus be smaller and faster than the Bignet, and is therefore stronger than the Bignet.
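
The switching rule from this remark can be sketched in a few lines. This is only an illustration of the idea, not code from the engine: the helper names and the net labels "bignet"/"mediumnet" are invented, and the position is assumed to be given as a FEN string.

```python
def count_men(fen: str) -> int:
    """Count all men (kings and pawns included) in the board field of a FEN."""
    board = fen.split()[0]
    # Letters are pieces; digits are runs of empty squares; '/' separates ranks.
    return sum(ch.isalpha() for ch in board)

def select_net(root_fen: str, max_men: int = 16) -> str:
    """Pick the net once per move, based on the root position only."""
    return "mediumnet" if count_men(root_fen) <= max_men else "bignet"

start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
endgame = "8/5pk1/6p1/8/8/6P1/5PK1/8 w - - 0 1"
print(count_men(start), select_net(start))      # 32 bignet
print(count_men(endgame), select_net(endgame))  # 6 mediumnet
```

Deciding at the root, rather than at every node, means the net never changes within one search; over a whole game it changes only once, when the material count first drops to the limit.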

24/04/10

Solution for Windows found: A 3rd medium size net.
The engines now use 3 nets, hence the 3 numbers in the engine name: SFNNv9.5.3_240410_xyz
Prereleases of SFNNv9.5.3 with 3 nets for Windows and Android are now available for download below. The two main nets of these engines belong to the net architectures SFNNv9 and SFNNv5; the L1=128 smallnet (which has no official name, hence the .3 in the engine name) is the one used in Stockfish master.

For short TCs the whole game is played with the SFNNv5 main net, for longer TCs more and more moves of the game are played with the new SFNNv9 net.
With "Mediumnet threshold" in the uci-options the point (move) where the medium net takes over the evaluation of the positions from the big net can be chosen:
0: only SFNNv9 is used in the whole game
10000: Most of the game is played with the SFNNv5 net
During analyzing:
0: only the big SFNNv9 is used
1: (or any other value) only the medium SFNNv5 is used
The default value of "Mediumnet threshold" at the MOMENT is 300. More tests have to be performed to find the value that gives the best play over a large TC range. This best value also depends on the hardware used. I think this can, in principle, be solved by measuring the nps of the two main nets; from this nps ratio and the absolute nps values the best Mediumnet threshold could somehow be calculated/estimated.
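
For testing several values in a row, the option can be set with the standard UCI `setoption` command. A minimal sketch that only builds the command strings; the helper name is made up, and the exact option spelling should be taken from the engine's own `uci` output:

```python
def setoption(name: str, value) -> str:
    """Build a standard UCI 'setoption' command line."""
    return f"setoption name {name} value {value}"

# Candidate values mentioned in these notes: 0 (Bignet only), 300, 1200.
for threshold in (0, 300, 1200):
    print(setoption("Mediumnet threshold", threshold))
# setoption name Mediumnet threshold value 0
# setoption name Mediumnet threshold value 300
# setoption name Mediumnet threshold value 1200
```

Most GUIs and tournament managers send this line for you once the option is set in the engine configuration; typing it by hand is only needed when talking to the engine directly on the console.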
A more comprehensive explanation including also the Android SFNNv6/5[.4] engines with only one or two numbers and thus two nets and also some more tests will follow soon[er or later]. 😊

Older comments

Only for Android armv8; on Windows this engine doesn't work very well at lower TCs.
More tests, which will show that it really is the best engine on Android, will be published in the future as the engine is developed further. In particular, I expect SFNNv6_dev to end up (much) stronger than SFMX, SF16.1, SFS16.1, and SFNNv6_16.1 on Android. Or, to be more exact: on my phone.
The first results presented are STC and LTC tourneys against Stockfish 16.1 (== SFnps16.1 with default settings):

--------------------------------------------------
TC=STC=10+0.1s ###
Score of SFNNv6_240312 vs SFnps16.1: 169 - 68 - 113 [] 350
Elo difference: 103.19 +/- 30.65, LOS: 100.00 %, DrawRatio: 32.29 %
Ptnml:        WW     WD  DD/WL     LD     LL
Distr:        35     58     61     15      6

TC=LTC=20+0.2s ###
Score of SFNNv6_240320 vs SFnps16.1: 409 - 173 - 418 [] 1000
Elo difference: 83.57 +/- 16.53, LOS: 100.00 %, DrawRatio: 41.80 %
Ptnml:        WW     WD  DD/WL     LD     LL
Distr:        61    192    176     64      7

TC=LTC=60+0.6s ###
Score of SFNNv6_240312 vs SFnps16.1: 94 - 56 - 150 [] 300
Elo difference: 44.25 +/- 27.82, LOS: 99.90 %, DrawRatio: 50.00 %
Ptnml:        WW     WD  DD/WL     LD     LL
Distr:         3     54     73     18      2

TC=LTC=180+1s ###
Score of SFNNv6_240320 vs SFnps16.1: 82 - 75 - 191 [] 350
Elo difference: 6.99 +/- 24.52, LOS: 71.18 %, DrawRatio: 54.89 %
Ptnml:        WW     WD  DD/WL     LD     LL
Distr:         1     45     90     36      2
--------------------------------------------------
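
The "+/-" values shrink as the number of games grows, which is why the 1000-game match has a much tighter margin than the 300-game ones. A minimal sketch of one common way to estimate such a margin, assuming a normal approximation on the per-game score; the tournament manager's exact method may differ slightly, so the result only roughly matches the printed values:

```python
import math
from statistics import NormalDist

def elo(score: float) -> float:
    """Logistic Elo difference for a score fraction strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_margin(wins, losses, draws, confidence=0.95):
    """Return (Elo difference, approximate error margin) for a W-L-D result."""
    n = wins + losses + draws
    mu = (wins + 0.5 * draws) / n
    # Variance of a single game's result (1, 0.5 or 0 points).
    var = (wins * (1.0 - mu) ** 2 + draws * (0.5 - mu) ** 2 + losses * mu ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    # Assumes the interval stays strictly inside (0, 1).
    low, high = mu - z * se, mu + z * se
    return elo(mu), (elo(high) - elo(low)) / 2.0

# STC match above: 169 - 68 - 113 (reported 103.19 +/- 30.65)
diff, margin = elo_with_margin(169, 68, 113)
print(f"{diff:.2f} +/- {margin:.1f}")
```

Since the margin falls only with the square root of the number of games, halving it takes about four times as many games.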

I have never ever seen Stockfish losing at STC so badly, by more than 100 Elo... At LTC the gap is still huge, too.
Note: SFnps16.1_MC_armv8 and SFnps16.1 are identical engines, manually compiled, and thus several percent faster than official Stockfish16.1_armv8 from the SF homepage (for which the gap would have been probably more than 110 Elo 😂).