SFS256/128nps tournaments
All of the following Android tournaments were played on a Xiaomi Poco M3 (Android 12, Snapdragon 662, 4(+2) GB RAM) using Termux and FastChess for Android. Concurrency is set to 6, with 1 thread per engine. The opening suite is UHO_2022_8mvs_+110_+119.epd. Note that openings of this kind enlarge Elo differences and reduce draw rates significantly.
SFS256nps231206 for Android with L1-256 net nn-9067e33176e8.nnue
------------------------------------------------------------------
TC: 1+0.01s
Score of SFS256nps231206 vs SFSnps16: 277 - 162 - 161 [] 600
Elo difference: 67.43 +/- 24.07, LOS: 100.00 %, DrawRatio: 26.83 %
Ptnml: WW WD DD/WL LD LL
Distr: 52 90 96 45 17
------------------------------------------------------------------
TC: 2.5+0.025s
Score of SFS256nps231206 vs SFSnps16: 237 - 151 - 212 [] 600
Elo difference: 50.14 +/- 22.46, LOS: 100.00 %, DrawRatio: 35.33 %
Ptnml: WW WD DD/WL LD LL
Distr: 29 98 117 42 14
------------------------------------------------------------------
TC: 5+0.05s
Score of SFS256nps231206 vs SFSnps16: 210 - 147 - 243 [] 600
Elo difference: 36.62 +/- 21.48, LOS: 99.96 %, DrawRatio: 40.50 %
Ptnml: WW WD DD/WL LD LL
Distr: 31 77 129 50 13
------------------------------------------------------------------
TC: 10+0.1s
Score of SFS256nps231206 vs SFSnps16: 175 - 132 - 293 [] 600
Elo difference: 24.94 +/- 19.88, LOS: 99.29 %, DrawRatio: 48.83 %
Ptnml: WW WD DD/WL LD LL
Distr: 14 90 129 59 8
------------------------------------------------------------------
TC: 30+0.3s
Score of SFS256nps231206 vs SFSnps16: 112 - 84 - 204 [] 400
Elo difference: 24.36 +/- 23.83, LOS: 97.72 %, DrawRatio: 51.00 %
Ptnml: WW WD DD/WL LD LL
Distr: 2 65 95 35 3
------------------------------------------------------------------
TC: 60+0.6s
Score of SFS256nps231206 vs SFSnps16: 82 - 57 - 161 [] 300
Elo difference: 29.02 +/- 26.76, LOS: 98.30 %, DrawRatio: 53.67 %
Ptnml: WW WD DD/WL LD LL
Distr: 0 51 76 20 3
------------------------------------------------------------------
The new L1-256 net is much better than the old net used in SFSnps16, which still uses hybrid eval but, on the other hand, is many patches behind SFS256nps. However, these differences can't explain the large Elo gap.
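The Elo, LOS and DrawRatio figures in the blocks above can be re-derived from the win/loss/draw counts alone. The following Python sketch uses the usual logistic Elo formula and a normal approximation. The function names are made up for illustration, and it is an assumption on my side that the printed +/- values are 95 % margins based on per-game variance, although that assumption seems to reproduce them closely.

```python
import math
from statistics import NormalDist


def elo_from_score(score):
    """Logistic Elo difference for a score fraction strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def wld_stats(wins, losses, draws, confidence=0.95):
    """Return Elo, error margin, LOS (%) and draw ratio (%) for one match.

    Assumes independent games and a normal approximation of the mean score.
    """
    games = wins + losses + draws
    mu = (wins + 0.5 * draws) / games
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    var = (wins * (1.0 - mu) ** 2
           + draws * (0.5 - mu) ** 2
           + losses * (0.0 - mu) ** 2) / games
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * math.sqrt(var / games)
    # Half the width of the score interval, converted to Elo.
    margin = (elo_from_score(mu + half_width)
              - elo_from_score(mu - half_width)) / 2.0
    # Likelihood of superiority: only decisive games carry information.
    los = 0.5 * (1.0 + math.erf((wins - losses)
                                / math.sqrt(2.0 * (wins + losses))))
    return {
        "elo": elo_from_score(mu),
        "margin": margin,
        "los": 100.0 * los,
        "draw_ratio": 100.0 * draws / games,
    }


# W - L - D of the first block above (TC 1+0.01s).
stats = wld_stats(277, 162, 161)
print("Elo {elo:.2f} +/- {margin:.2f}, LOS {los:.2f} %, "
      "DrawRatio {draw_ratio:.2f} %".format(**stats))
```

For the first block this should land close to the reported 67.43 Elo and 26.83 % draw ratio. Small deviations in the margin are to be expected, since tools differ in how they approximate the normal quantile.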
SFS128nps231207 for Android with L1-128 net nn-a378c9c91bb0.nnue
------------------------------------------------------------------
TC: 1+0.01s
Score of SFS128nps231207 vs SFSnps16: 236 - 208 - 156 [] 600
Elo difference: 16.23 +/- 23.94, LOS: 90.80 %, DrawRatio: 26.00 %
Ptnml: WW WD DD/WL LD LL
Distr: 43 52 126 48 31
------------------------------------------------------------------
TC: 1+0.01s
Score of SFS128nps231207 vs SFSnps16: 218 - 208 - 174 [] 600
Elo difference: 5.79 +/- 23.43, LOS: 68.60 %, DrawRatio: 29.00 %
Ptnml: WW WD DD/WL LD LL
Distr: 35 66 108 56 35
------------------------------------------------------------------
TC: 1+0.01s
Score of SFS128nps231207 vs SFSnps16: 750 - 693 - 557 [] 2000
Elo difference: 9.90 +/- 12.92, LOS: 93.33 %, DrawRatio: 27.85 %
Ptnml: WW WD DD/WL LD LL
Distr: 128 211 357 198 106
------------------------------------------------------------------
TC: 1+0.01s
Score of SFS128nps231207 vs SFSnps16: 713 - 653 - 634 [] 2000
Elo difference: 10.43 +/- 12.57, LOS: 94.77 %, DrawRatio: 31.70 %
Ptnml: WW WD DD/WL LD LL
Distr: 112 233 362 189 104
------------------------------------------------------------------
TC: 10+0.1s
Score of SFS128nps231207 vs SFSnps16: 130 - 213 - 257 [] 600
Elo difference: -48.37 +/- 21.06, LOS: 0.00 %, DrawRatio: 42.83 %
Ptnml: WW WD DD/WL LD LL
Distr: 4 54 123 93 26
------------------------------------------------------------------
TC: 10+0.1s concurrency=3
Score of SFS128nps231207 vs SFSnps16: 110 - 206 - 284 [] 600
Elo difference: -56.07 +/- 20.19, LOS: 0.00 %, DrawRatio: 47.33 %
Ptnml: WW WD DD/WL LD LL
Distr: 4 40 130 108 18
------------------------------------------------------------------
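The Ptnml/Distr lines count game pairs (both colours of one opening) rather than single games. Below is a minimal Python sketch, assuming the five Distr values follow the column order of the Ptnml line (WW, WD, DD/WL, LD, LL) and are worth 2, 1.5, 1, 0.5 and 0 points per pair. It ties a Distr line back to its Score line and derives a pair-based error margin. The helper names are hypothetical, and the pair-based margin is an alternative view, not a reproduction of the +/- values printed in the blocks.

```python
import math

# Points per game pair in Ptnml column order: WW, WD, DD/WL, LD, LL.
PAIR_POINTS = (2.0, 1.5, 1.0, 0.5, 0.0)


def elo_from_score(score):
    """Logistic Elo difference for a score fraction strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def pentanomial_stats(distr, z=1.96):
    """Return (total points, Elo, pair-based Elo margin) for a Distr line."""
    pairs = sum(distr)
    points = sum(count * pts for count, pts in zip(distr, PAIR_POINTS))
    mu = points / (2.0 * pairs)  # mean score per game, between 0 and 1
    # Variance of the per-pair score after scaling pair points to 0..1.
    var = sum(count * (pts / 2.0 - mu) ** 2
              for count, pts in zip(distr, PAIR_POINTS)) / pairs
    half_width = z * math.sqrt(var / pairs)
    margin = (elo_from_score(mu + half_width)
              - elo_from_score(mu - half_width)) / 2.0
    return points, elo_from_score(mu), margin


# Last block above (TC 10+0.1s, concurrency=3): Score 110 - 206 - 284.
points, elo, margin = pentanomial_stats((4, 40, 130, 108, 18))
print(f"{points} points (W + D/2 = {110 + 284 / 2}), "
      f"Elo {elo:.2f} +/- {margin:.2f} (pair-based)")
```

With biased openings such as UHO, the two games of a pair are correlated, so a margin computed per pair can come out smaller than one computed per game. For this block it does.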
These are the results so far; the statistical behaviour looks good:
At 1+0.01s the new half-size net is about 10 +/- 7 Elo better; at 10+0.1s, however, the old L1=256 net is more than 50 Elo stronger.
BTW, there was not a single time forfeit in the 5200 games at 1+0.01s and only 3 in the 1200 games at TC 10+0.1s.
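The summary figures can be checked by pooling the SFS128nps runs per time control. A short Python sketch follows, assuming that simply adding the win/loss/draw counts across runs is acceptable (same engines, opening suite and time control; note that the 10+0.1s pool mixes concurrency 6 and 3). The function name is hypothetical.

```python
import math


def pooled_elo(runs):
    """Pool (wins, losses, draws) tuples and return (games, Elo difference)."""
    wins = sum(w for w, _, _ in runs)
    losses = sum(l for _, l, _ in runs)
    draws = sum(d for _, _, d in runs)
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    # Logistic Elo formula; requires 0 < score < 1.
    return games, -400.0 * math.log10(1.0 / score - 1.0)


# SFS128nps231207 vs SFSnps16, W - L - D per run as listed above.
runs_1s = [(236, 208, 156), (218, 208, 174), (750, 693, 557), (713, 653, 634)]
runs_10s = [(130, 213, 257), (110, 206, 284)]

games_1s, elo_1s = pooled_elo(runs_1s)
games_10s, elo_10s = pooled_elo(runs_10s)
print(f"1+0.01s:  {games_1s} games, Elo {elo_1s:+.1f}")
print(f"10+0.1s: {games_10s} games, Elo {elo_10s:+.1f}")
```

Pooling raw counts weights each run by its number of games, which seems the natural choice here because the runs differ in length (600 vs 2000 games).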
In the meantime, Linrock has uploaded two more L1-128 nets. However, when used as solo nets, these small nets are too weak, even on Android, to be very interesting. They might still be usable where a small download size matters and a strength below 2800 Elo is sufficient (e.g. human play on Lichess against a local SF).