Introduce Optimism #3797

snicolet · 2021-11-19T20:17:20Z

Current master implements a scaling of the raw NNUE output value with a formula
equivalent to 'eval = alpha * NNUE_output', where the scale factor alpha varies
between 1.8 (for early middle game) and 0.9 (for pure endgames). This feature
allows Stockfish to keep material on the board when she thinks she has the advantage,
and to seek exchanges and simplifications when she thinks she has to defend.

This patch slightly offsets the turning point between these two strategies, by adding
to Stockfish's evaluation a small "optimism" value before actually doing the scaling.
The effect is that SF will play a little more riskily, trying to keep the tension a
little bit longer when she is defending, and keeping even more material on the board
when she has an advantage.
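
As a rough illustration of the order of operations described above (offset first, scaling second), here is a minimal C++ sketch. The function name `scaled_eval` and the use of `double` are assumptions made for readability; the way alpha is derived from the position is not shown, and this is not the code from evaluate.cpp.

```cpp
#include <cassert>

// Sketch only: the optimism offset is added to the raw network output first,
// and the sum is then multiplied by the scale factor alpha.
// alpha is supplied by the caller (somewhere between about 0.9 and 1.8
// according to the description); how it is computed is outside this sketch.
inline double scaled_eval(double nnue_output, double optimism, double alpha) {
    assert(alpha > 0.0);
    return alpha * (nnue_output + optimism);
}
```

With a positive optimism, a slightly negative raw value can end up positive after the offset, which is the sense in which the turning point between "keep material" and "seek exchanges" moves.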

We note that this patch is similar in spirit to the old "Contempt" idea we used to have
in classical Stockfish, but this implementation differs in two key points:

a) it has been tested as an Elo-gainer against master;

b) the values output by the search are not changed on average by the implementation
(in other words, the optimism value changes the tension/exchange strategy, but a
displayed value of 1.0 pawn has the same meaning before and after the patch).

See the old comment in #1361 for some images illustrating the ideas.

finished yellow at STC:
LLR: -2.94 (-2.94,2.94) <0.00,2.50>
Total: 165048 W: 41705 L: 41611 D: 81732
Ptnml(0-2): 565, 18959, 43245, 19327, 428
https://tests.stockfishchess.org/tests/view/61942a3dcd645dc8291c876b

passed LTC:
LLR: 2.95 (-2.94,2.94) <0.50,3.00>
Total: 121656 W: 30762 L: 30287 D: 60607
Ptnml(0-2): 87, 12558, 35032, 13095, 56
https://tests.stockfishchess.org/tests/view/61962c58cd645dc8291c8877
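
For readers who want a feel for the size of these results, the win/loss/draw totals can be turned into a naive Elo point estimate with the standard logistic formula. The sketch below is only that simple estimate; it ignores the pentanomial statistics and is not claimed to be how fishtest computes its numbers. The function names are made up for this example.

```cpp
#include <cassert>
#include <cmath>

// Average score per game (win = 1, draw = 0.5, loss = 0).
inline double score_fraction(long wins, long losses, long draws) {
    const long total = wins + losses + draws;
    assert(total > 0);
    return (wins + 0.5 * draws) / static_cast<double>(total);
}

// Logistic Elo difference implied by a score fraction strictly between 0 and 1.
inline double elo_from_score(double s) {
    assert(s > 0.0 && s < 1.0);
    return -400.0 * std::log10(1.0 / s - 1.0);
}
```

Calling `elo_from_score(score_fraction(30762, 30287, 60607))` on the LTC totals gives roughly 1.4 Elo, and the STC totals give roughly 0.2 Elo.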

How to continue from here?

a) the shape (slope and amplitude) of the sigmoid used to compute the optimism value
could be tweaked to try to gain more Elo, so the parameters of the sigmoid function
in line 391 of search.cpp could be tuned with SPSA. Manual tweaking is also possible
using this Desmos page: https://www.desmos.com/calculator/jhh83sqq92

b) in a similar vein, with two recent patches affecting the scaling of the NNUE
evaluation in evaluate.cpp, now could be a good time to try a round of SPSA tuning
of the NNUE network;

c) this patch will tend to keep tension in middlegame a little bit longer, so any
patch improving the defensive aspect of play via search extensions in risky,
tactical positions would be welcome.
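
To make point a) more concrete, here is a minimal sketch of an integer-only sigmoid-like curve with a tunable center, slope and amplitude. It is an assumption for illustration, not a copy of the function at line 391 of search.cpp: the name `sigmoid_like`, the parameter `A` and the exact algebraic form are hypothetical, and `x0` and `C` are only named after the parameters mentioned in this thread without any guarantee that they play the same role there.

```cpp
#include <cassert>
#include <cstdint>

// Sketch of a sigmoid-like curve using only integer arithmetic:
//   - centered at (x0, y0): the result is exactly y0 when t == x0
//   - bounded: the result stays strictly between y0 - A and y0 + A (for A > 0)
//   - C > 0 controls the slope: a smaller C gives a steeper curve near x0
// Integer division truncates toward zero, so results are slightly rounded.
inline std::int64_t sigmoid_like(std::int64_t t, std::int64_t x0, std::int64_t y0,
                                 std::int64_t C, std::int64_t A) {
    assert(C > 0);
    const std::int64_t d = t - x0;
    const std::int64_t abs_d = d < 0 ? -d : d;
    return y0 + A * d / (abs_d + C);
}
```

Parameters of this kind (center, slope, amplitude) are the sort of values an SPSA run could tune, since each one changes the curve smoothly.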

closes #3797

Bench: 6184852

vondele · 2021-11-20T06:54:22Z

CI is failing, I believe because the variables are not initialized for the 'eval' command. We probably need to add something near line 1135 in evaluate.cpp. Maybe it makes sense to have these variables (currently optimism, trend, bestValue) initialized in a function called from both the search and eval paths?
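
A minimal sketch of what such a shared initialization could look like, under stated assumptions: the struct `EvalContext`, the function names, and the choice of zero as the neutral value are all hypothetical and are not taken from the Stockfish sources.

```cpp
#include <cassert>

// Hypothetical container for the values named in the comment above.
struct EvalContext {
    int optimism[2]; // one entry per side
    int trend;
    int bestValue;
};

// Single reset point, meant to be called from both the search path and the
// 'eval' command path before any evaluation is computed.
// Assumption: zero is a sensible neutral value for all three fields.
inline void reset_eval_context(EvalContext& ctx) {
    ctx.optimism[0] = 0;
    ctx.optimism[1] = 0;
    ctx.trend = 0;
    ctx.bestValue = 0;
}

// Hypothetical 'eval' command path: reset first, then use the values.
inline int eval_command_path(EvalContext& ctx, int nnue_output) {
    reset_eval_context(ctx);
    assert(ctx.optimism[0] == 0 && ctx.optimism[1] == 0);
    return nnue_output + ctx.optimism[0];
}
```

The point of the design is that no code path can read these fields before they are set, which is the kind of uninitialized read a CI run with sanitizers tends to catch.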

snicolet · 2021-11-20T12:56:50Z

Thanks, I will have a look tonight!

snicolet · 2021-11-21T00:04:20Z

I have pushed a fix (the initialization function can probably wait for a separate patch).

Now AppVeyor fails; I don't understand why, nor whether it matters here.

vondele · 2021-11-21T16:50:25Z

the appveyor error is strange, seems like the binary crashes. Maybe somebody with msvc could test this natively?

However, I think we should eventually retire the appveyor test. I don't think we're interested in testing the build environment, especially now that windows is part of CI via mingw, which is our recommended way to build anyway.

vondele · 2021-11-21T20:07:57Z

I've disabled appveyor CI, and have a pull request to remove the corresponding CI file #3800

snicolet · 2021-11-21T20:13:23Z

OK, I had tried five minutes ago to rewrite the sigmoid as a normal, non-inline function to test if the inlining was the cause of the MSVC crash.
I will revert that change, coming back to the inlined version of sigmoid, and commit into master.

snicolet · 2021-11-21T20:55:09Z

Merged via a5a89b2

vondele · 2021-11-22T09:58:01Z

For reference, this patch led to a 15 Elo jump on NCM (https://nextchessmove.com/dev-builds#a5a89b27c8e3225fb453d603bc4515d32bb351c3). This is somewhat contrary to our tests against SF7 (https://tests.stockfishchess.org/tests/view/619808ffcd645dc8291c8978 and https://tests.stockfishchess.org/tests/view/6197faebcd645dc8291c8974), which might be due to the difference in book between the two tests.

snicolet · 2021-11-22T12:39:19Z

That's nice :-)

snicolet · 2021-11-22T12:53:33Z

Ex post, maybe it is normal that the effect of the patch is greater with a more drawish book.

Indeed, we can open the Desmos sheet at https://www.desmos.com/calculator/jhh83sqq92
and graph the following curves:

(1) y = x
(2) y = x + s

Curve (1) is the nnue eval before the patch, which is then scaled by alpha as described in the PR.
Curve (2) is the nnue + optimism eval, which is then scaled by alpha.

The critical part of the graph is the region just to the left of zero. We see that the two curves cross at nnue=-115 (where the patch has no effect), but stay very close to each other up to nnue=-50.

I am currently testing a version with C=100, x0=-30 to get about the same asymptotic behaviour, the same crossing point but a faster separation in the -100 < nnue < 0 interval:
https://tests.stockfishchess.org/tests/view/619b7bc824ba4c325afc005e

mstembera · 2021-11-24T23:42:11Z

I do all my development in MSVC so I debugged the appveyor error as suggested above.
The actual msvc error is:
Unhandled exception at 0x00007FF62E163BC8 in SF5.exe: 0xC00000FD: Stack overflow (parameters: 0x0000000000000001, 0x00000053D2B03000).
which means we ran out of stack memory. The default msvc stack size is 1MB reserved and 4KB committed. Doubling the reserved size to 2MB indeed fixes the issue. Here is a nice link showing how to do this.
https://stackoverflow.com/questions/14080982/visual-studio-c-c-array-size-unhandled-exception-stack-overflow
Alternatively, one can pass the command line option /STACK:2097152,4096 to the linker.

@vondele @snicolet
I am indifferent to removing appveyor, which was done here #3800, but I really would like to plead, beg, or whatever to at least keep the code compilable under msvc. There are lots of benefits from compatibility with multiple major compilers including portability, broader warning and error checking, as well as C++ standard compliance. For good or bad, msvc is the most widely adopted development environment on windows. From time to time I see people asking how to compile SF under msvc. For many windows developers this path allows a lower barrier to entry to start contributing. I wouldn't have been able to start w/o it myself. To be clear, I am not asking the maintainers or anyone to do anything. All I ask is that if msvc compiles break in the future we are allowed to submit patches to fix them.

vondele · 2021-11-25T06:02:20Z

@mstembera I have no problem with fixes that are needed to compile with msvc, as long as they are reasonable and related to mistakes made in SF code (e.g. C++ compliance). However, it is just not a 'supported' (tested) platform any longer.

The stacksize will need to be larger (probably 4 or 8MB) to be sure we can search to MAX_PLY.
There have been reports that the code compiled with msvc is giving wrong results if compiled for avx2, which could be stack size related or not.
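
As a back-of-the-envelope check on the stack sizes discussed here, the reserved stack can be divided by the maximum recursion depth to get an average budget per ply. This is only a rough sketch: the helper name is made up, the depth of 246 used in the note below is an assumption about MAX_PLY, and real usage also includes frames outside the main recursion.

```cpp
#include <cassert>
#include <cstddef>

// Average number of stack bytes available per recursion level if the whole
// reserved stack were shared evenly across max_ply nested calls.
inline std::size_t stack_bytes_per_ply(std::size_t reserve_bytes, std::size_t max_ply) {
    assert(max_ply > 0);
    return reserve_bytes / max_ply;
}
```

Assuming a maximum depth of 246, a 1MB reserve leaves about 4262 bytes per ply and a 2MB reserve about 8525 bytes, which gives a sense of why a larger reserve is suggested for searching all the way to MAX_PLY.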

mstembera · 2021-11-25T08:32:21Z

@vondele Great. Thanks!

I did compile avx2 w/ msvc 2017 and get the correct bench.

snicolet force-pushed the optimism10_PR branch from 333832f to 200cd29 Compare November 19, 2021 20:17

snicolet force-pushed the optimism10_PR branch from 200cd29 to d450f9c Compare November 19, 2021 20:26

snicolet force-pushed the optimism10_PR branch from d450f9c to 4e116d4 Compare November 20, 2021 05:39

snicolet force-pushed the optimism10_PR branch from 4e116d4 to e2372a9 Compare November 20, 2021 05:40

snicolet force-pushed the optimism10_PR branch from e2372a9 to d97b5d9 Compare November 20, 2021 23:21

snicolet force-pushed the optimism10_PR branch from d97b5d9 to 6ead927 Compare November 21, 2021 08:17

snicolet force-pushed the optimism10_PR branch from 6ead927 to c204661 Compare November 21, 2021 20:05

snicolet force-pushed the optimism10_PR branch from c204661 to a5a89b2 Compare November 21, 2021 20:18

snicolet closed this in a5a89b2 Nov 21, 2021

snicolet merged commit a5a89b2 into official-stockfish:master Nov 21, 2021

snicolet added the "to be merged" label Nov 21, 2021

mstembera mentioned this pull request Jan 4, 2022

Possible divide by zero error #3880

Closed

mstembera mentioned this pull request Mar 12, 2023

MSVC native application code had char vs wchar_t type mismatch and didn't compile #4438

Closed
