-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce Optimism #3797
Introduce Optimism #3797
Conversation
Current master implements a scaling of the raw NNUE output value with a formula equivalent to 'eval = alpha * NNUE_output', where the scale factor alpha varies between 1.8 (for early middle game) and 0.9 (for pure endgames). This feature allows Stockfish to keep material on the board when she thinks she has the advantage, and to seek exchanges and simplifications when she thinks she has to defend. This patch slightly offsets the turning point between these two strategies, by adding to Stockfish's evaluation a small "optimism" value before actually doing the scaling. The effect is that SF will play a little bit more risky, trying to keep the tension a little bit longer when she is defending, and keeping even more material on the board when she has an advantage. We note that this patch is similar in spirit to the old "Contempt" idea we used to have in classical Stockfish, but this implementation differs in two key points: a) it has been tested as an Elo-gainer against master; b) the values output by the search are not changed on average by the implementation (in other words, the optimism value changes the tension/exchange strategy, but a displayed value of 1.0 pawn has the same signification before and after the patch). See the old comment official-stockfish/Stockfish#1361 (comment) for some images illustrating the ideas. ------- finished yellow at STC: LLR: -2.94 (-2.94,2.94) <0.00,2.50> Total: 165048 W: 41705 L: 41611 D: 81732 Ptnml(0-2): 565, 18959, 43245, 19327, 428 https://tests.stockfishchess.org/tests/view/61942a3dcd645dc8291c876b passed LTC: LLR: 2.95 (-2.94,2.94) <0.50,3.00> Total: 121656 W: 30762 L: 30287 D: 60607 Ptnml(0-2): 87, 12558, 35032, 13095, 56 https://tests.stockfishchess.org/tests/view/61962c58cd645dc8291c8877 ------- How to continue from there? a) the shape (slope and amplitude) of the sigmoid used to compute the optimism value could be tweaked to try to gain more Elo, so the parameters of the sigmoid function in line 391 of search.cpp could be tuned with SPSA. Manual tweaking is also possible using this Desmos page: https://www.desmos.com/calculator/jhh83sqq92 b) in a similar vein, with two recents patches affecting the scaling of the NNUE evaluation in evaluate.cpp, now could be a good time to try a round of SPSA tuning of the NNUE network; c) this patch will tend to keep tension in middlegame a little bit longer, so any patch improving the defensive aspect of play via search extensions in risky, tactical positions would be welcome. ------- closes official-stockfish/Stockfish#3797 Bench: 6184852
333832f
to
200cd29
Compare
Current master implements a scaling of the raw NNUE output value with a formula equivalent to 'eval = alpha * NNUE_output', where the scale factor alpha varies between 1.8 (for early middle game) and 0.9 (for pure endgames). This feature allows Stockfish to keep material on the board when she thinks she has the advantage, and to seek exchanges and simplifications when she thinks she has to defend. This patch slightly offsets the turning point between these two strategies, by adding to Stockfish's evaluation a small "optimism" value before actually doing the scaling. The effect is that SF will play a little bit more risky, trying to keep the tension a little bit longer when she is defending, and keeping even more material on the board when she has an advantage. We note that this patch is similar in spirit to the old "Contempt" idea we used to have in classical Stockfish, but this implementation differs in two key points: a) it has been tested as an Elo-gainer against master; b) the values output by the search are not changed on average by the implementation (in other words, the optimism value changes the tension/exchange strategy, but a displayed value of 1.0 pawn has the same signification before and after the patch). See the old comment official-stockfish/Stockfish#1361 (comment) for some images illustrating the ideas. ------- finished yellow at STC: LLR: -2.94 (-2.94,2.94) <0.00,2.50> Total: 165048 W: 41705 L: 41611 D: 81732 Ptnml(0-2): 565, 18959, 43245, 19327, 428 https://tests.stockfishchess.org/tests/view/61942a3dcd645dc8291c876b passed LTC: LLR: 2.95 (-2.94,2.94) <0.50,3.00> Total: 121656 W: 30762 L: 30287 D: 60607 Ptnml(0-2): 87, 12558, 35032, 13095, 56 https://tests.stockfishchess.org/tests/view/61962c58cd645dc8291c8877 ------- How to continue from there? a) the shape (slope and amplitude) of the sigmoid used to compute the optimism value could be tweaked to try to gain more Elo, so the parameters of the sigmoid function in line 391 of search.cpp could be tuned with SPSA. Manual tweaking is also possible using this Desmos page: https://www.desmos.com/calculator/jhh83sqq92 b) in a similar vein, with two recents patches affecting the scaling of the NNUE evaluation in evaluate.cpp, now could be a good time to try a round of SPSA tuning of the NNUE network; c) this patch will tend to keep tension in middlegame a little bit longer, so any patch improving the defensive aspect of play via search extensions in risky, tactical positions would be welcome. ------- closes official-stockfish/Stockfish#3797 Bench: 6184852
200cd29
to
d450f9c
Compare
Current master implements a scaling of the raw NNUE output value with a formula equivalent to 'eval = alpha * NNUE_output', where the scale factor alpha varies between 1.8 (for early middle game) and 0.9 (for pure endgames). This feature allows Stockfish to keep material on the board when she thinks she has the advantage, and to seek exchanges and simplifications when she thinks she has to defend. This patch slightly offsets the turning point between these two strategies, by adding to Stockfish's evaluation a small "optimism" value before actually doing the scaling. The effect is that SF will play a little bit more risky, trying to keep the tension a little bit longer when she is defending, and keeping even more material on the board when she has an advantage. We note that this patch is similar in spirit to the old "Contempt" idea we used to have in classical Stockfish, but this implementation differs in two key points: a) it has been tested as an Elo-gainer against master; b) the values output by the search are not changed on average by the implementation (in other words, the optimism value changes the tension/exchange strategy, but a displayed value of 1.0 pawn has the same signification before and after the patch). See the old comment official-stockfish/Stockfish#1361 (comment) for some images illustrating the ideas. ------- finished yellow at STC: LLR: -2.94 (-2.94,2.94) <0.00,2.50> Total: 165048 W: 41705 L: 41611 D: 81732 Ptnml(0-2): 565, 18959, 43245, 19327, 428 https://tests.stockfishchess.org/tests/view/61942a3dcd645dc8291c876b passed LTC: LLR: 2.95 (-2.94,2.94) <0.50,3.00> Total: 121656 W: 30762 L: 30287 D: 60607 Ptnml(0-2): 87, 12558, 35032, 13095, 56 https://tests.stockfishchess.org/tests/view/61962c58cd645dc8291c8877 ------- How to continue from there? a) the shape (slope and amplitude) of the sigmoid used to compute the optimism value could be tweaked to try to gain more Elo, so the parameters of the sigmoid function in line 391 of search.cpp could be tuned with SPSA. Manual tweaking is also possible using this Desmos page: https://www.desmos.com/calculator/jhh83sqq92 b) in a similar vein, with two recents patches affecting the scaling of the NNUE evaluation in evaluate.cpp, now could be a good time to try a round of SPSA tuning of the NNUE network; c) this patch will tend to keep tension in middlegame a little bit longer, so any patch improving the defensive aspect of play via search extensions in risky, tactical positions would be welcome. ------- closes official-stockfish/Stockfish#3797 Bench: 6184852
d450f9c
to
4e116d4
Compare
Current master implements a scaling of the raw NNUE output value with a formula equivalent to 'eval = alpha * NNUE_output', where the scale factor alpha varies between 1.8 (for early middle game) and 0.9 (for pure endgames). This feature allows Stockfish to keep material on the board when she thinks she has the advantage, and to seek exchanges and simplifications when she thinks she has to defend. This patch slightly offsets the turning point between these two strategies, by adding to Stockfish's evaluation a small "optimism" value before actually doing the scaling. The effect is that SF will play a little bit more risky, trying to keep the tension a little bit longer when she is defending, and keeping even more material on the board when she has an advantage. We note that this patch is similar in spirit to the old "Contempt" idea we used to have in classical Stockfish, but this implementation differs in two key points: a) it has been tested as an Elo-gainer against master; b) the values output by the search are not changed on average by the implementation (in other words, the optimism value changes the tension/exchange strategy, but a displayed value of 1.0 pawn has the same signification before and after the patch). See the old comment official-stockfish/Stockfish#1361 (comment) for some images illustrating the ideas. ------- finished yellow at STC: LLR: -2.94 (-2.94,2.94) <0.00,2.50> Total: 165048 W: 41705 L: 41611 D: 81732 Ptnml(0-2): 565, 18959, 43245, 19327, 428 https://tests.stockfishchess.org/tests/view/61942a3dcd645dc8291c876b passed LTC: LLR: 2.95 (-2.94,2.94) <0.50,3.00> Total: 121656 W: 30762 L: 30287 D: 60607 Ptnml(0-2): 87, 12558, 35032, 13095, 56 https://tests.stockfishchess.org/tests/view/61962c58cd645dc8291c8877 ------- How to continue from there? a) the shape (slope and amplitude) of the sigmoid used to compute the optimism value could be tweaked to try to gain more Elo, so the parameters of the sigmoid function in line 391 of search.cpp could be tuned with SPSA. Manual tweaking is also possible using this Desmos page: https://www.desmos.com/calculator/jhh83sqq92 b) in a similar vein, with two recents patches affecting the scaling of the NNUE evaluation in evaluate.cpp, now could be a good time to try a round of SPSA tuning of the NNUE network; c) this patch will tend to keep tension in middlegame a little bit longer, so any patch improving the defensive aspect of play via search extensions in risky, tactical positions would be welcome. ------- closes official-stockfish/Stockfish#3797 Bench: 6184852
4e116d4
to
e2372a9
Compare
CI is failing, I believe because the variables are not initialized for the 'eval' command. Probably needs to add something near line 1135 in evaluate.cpp. Maybe it makes sense to have these variables (optimism, trend, bestValue currently) initialized in a function called from both search and eval paths ? |
Thanks, I will have a look tonight! |
Current master implements a scaling of the raw NNUE output value with a formula equivalent to 'eval = alpha * NNUE_output', where the scale factor alpha varies between 1.8 (for early middle game) and 0.9 (for pure endgames). This feature allows Stockfish to keep material on the board when she thinks she has the advantage, and to seek exchanges and simplifications when she thinks she has to defend. This patch slightly offsets the turning point between these two strategies, by adding to Stockfish's evaluation a small "optimism" value before actually doing the scaling. The effect is that SF will play a little bit more risky, trying to keep the tension a little bit longer when she is defending, and keeping even more material on the board when she has an advantage. We note that this patch is similar in spirit to the old "Contempt" idea we used to have in classical Stockfish, but this implementation differs in two key points: a) it has been tested as an Elo-gainer against master; b) the values output by the search are not changed on average by the implementation (in other words, the optimism value changes the tension/exchange strategy, but a displayed value of 1.0 pawn has the same signification before and after the patch). See the old comment official-stockfish/Stockfish#1361 (comment) for some images illustrating the ideas. ------- finished yellow at STC: LLR: -2.94 (-2.94,2.94) <0.00,2.50> Total: 165048 W: 41705 L: 41611 D: 81732 Ptnml(0-2): 565, 18959, 43245, 19327, 428 https://tests.stockfishchess.org/tests/view/61942a3dcd645dc8291c876b passed LTC: LLR: 2.95 (-2.94,2.94) <0.50,3.00> Total: 121656 W: 30762 L: 30287 D: 60607 Ptnml(0-2): 87, 12558, 35032, 13095, 56 https://tests.stockfishchess.org/tests/view/61962c58cd645dc8291c8877 ------- How to continue from there? a) the shape (slope and amplitude) of the sigmoid used to compute the optimism value could be tweaked to try to gain more Elo, so the parameters of the sigmoid function in line 391 of search.cpp could be tuned with SPSA. Manual tweaking is also possible using this Desmos page: https://www.desmos.com/calculator/jhh83sqq92 b) in a similar vein, with two recents patches affecting the scaling of the NNUE evaluation in evaluate.cpp, now could be a good time to try a round of SPSA tuning of the NNUE network; c) this patch will tend to keep tension in middlegame a little bit longer, so any patch improving the defensive aspect of play via search extensions in risky, tactical positions would be welcome. ------- closes official-stockfish/Stockfish#3797 Bench: 6184852
e2372a9
to
d97b5d9
Compare
I have pushed a fix (probably the initilization function could wait a distinct patch). Now AppVeyor fails, don't understand why nor if it is important there. |
Current master implements a scaling of the raw NNUE output value with a formula equivalent to 'eval = alpha * NNUE_output', where the scale factor alpha varies between 1.8 (for early middle game) and 0.9 (for pure endgames). This feature allows Stockfish to keep material on the board when she thinks she has the advantage, and to seek exchanges and simplifications when she thinks she has to defend. This patch slightly offsets the turning point between these two strategies, by adding to Stockfish's evaluation a small "optimism" value before actually doing the scaling. The effect is that SF will play a little bit more risky, trying to keep the tension a little bit longer when she is defending, and keeping even more material on the board when she has an advantage. We note that this patch is similar in spirit to the old "Contempt" idea we used to have in classical Stockfish, but this implementation differs in two key points: a) it has been tested as an Elo-gainer against master; b) the values output by the search are not changed on average by the implementation (in other words, the optimism value changes the tension/exchange strategy, but a displayed value of 1.0 pawn has the same signification before and after the patch). See the old comment official-stockfish/Stockfish#1361 (comment) for some images illustrating the ideas. ------- finished yellow at STC: LLR: -2.94 (-2.94,2.94) <0.00,2.50> Total: 165048 W: 41705 L: 41611 D: 81732 Ptnml(0-2): 565, 18959, 43245, 19327, 428 https://tests.stockfishchess.org/tests/view/61942a3dcd645dc8291c876b passed LTC: LLR: 2.95 (-2.94,2.94) <0.50,3.00> Total: 121656 W: 30762 L: 30287 D: 60607 Ptnml(0-2): 87, 12558, 35032, 13095, 56 https://tests.stockfishchess.org/tests/view/61962c58cd645dc8291c8877 ------- How to continue from there? a) the shape (slope and amplitude) of the sigmoid used to compute the optimism value could be tweaked to try to gain more Elo, so the parameters of the sigmoid function in line 391 of search.cpp could be tuned with SPSA. Manual tweaking is also possible using this Desmos page: https://www.desmos.com/calculator/jhh83sqq92 b) in a similar vein, with two recents patches affecting the scaling of the NNUE evaluation in evaluate.cpp, now could be a good time to try a round of SPSA tuning of the NNUE network; c) this patch will tend to keep tension in middlegame a little bit longer, so any patch improving the defensive aspect of play via search extensions in risky, tactical positions would be welcome. ------- closes official-stockfish/Stockfish#3797 Bench: 6184852
d97b5d9
to
6ead927
Compare
the appveyor error is strange, seems like the binary crashes. Maybe somebody with msvc could test this natively? However, I think we should eventually retire the appveyor test, I don't think we're interested in testing the build environment, especially now where windows is part of CI via mingw, which is our recommended way to build anyway. |
Current master implements a scaling of the raw NNUE output value with a formula equivalent to 'eval = alpha * NNUE_output', where the scale factor alpha varies between 1.8 (for early middle game) and 0.9 (for pure endgames). This feature allows Stockfish to keep material on the board when she thinks she has the advantage, and to seek exchanges and simplifications when she thinks she has to defend. This patch slightly offsets the turning point between these two strategies, by adding to Stockfish's evaluation a small "optimism" value before actually doing the scaling. The effect is that SF will play a little bit more risky, trying to keep the tension a little bit longer when she is defending, and keeping even more material on the board when she has an advantage. We note that this patch is similar in spirit to the old "Contempt" idea we used to have in classical Stockfish, but this implementation differs in two key points: a) it has been tested as an Elo-gainer against master; b) the values output by the search are not changed on average by the implementation (in other words, the optimism value changes the tension/exchange strategy, but a displayed value of 1.0 pawn has the same signification before and after the patch). See the old comment official-stockfish/Stockfish#1361 (comment) for some images illustrating the ideas. ------- finished yellow at STC: LLR: -2.94 (-2.94,2.94) <0.00,2.50> Total: 165048 W: 41705 L: 41611 D: 81732 Ptnml(0-2): 565, 18959, 43245, 19327, 428 https://tests.stockfishchess.org/tests/view/61942a3dcd645dc8291c876b passed LTC: LLR: 2.95 (-2.94,2.94) <0.50,3.00> Total: 121656 W: 30762 L: 30287 D: 60607 Ptnml(0-2): 87, 12558, 35032, 13095, 56 https://tests.stockfishchess.org/tests/view/61962c58cd645dc8291c8877 ------- How to continue from there? a) the shape (slope and amplitude) of the sigmoid used to compute the optimism value could be tweaked to try to gain more Elo, so the parameters of the sigmoid function in line 391 of search.cpp could be tuned with SPSA. Manual tweaking is also possible using this Desmos page: https://www.desmos.com/calculator/jhh83sqq92 b) in a similar vein, with two recents patches affecting the scaling of the NNUE evaluation in evaluate.cpp, now could be a good time to try a round of SPSA tuning of the NNUE network; c) this patch will tend to keep tension in middlegame a little bit longer, so any patch improving the defensive aspect of play via search extensions in risky, tactical positions would be welcome. ------- closes official-stockfish/Stockfish#3797 Bench: 6184852
6ead927
to
c204661
Compare
I've disabled appveyor CI, and have a pull request to remove the corresponding CI file #3800 |
OK, I had tried five minutes ago to rewrite the sigmoid as a normal, non-inline function to test if the inlining was the cause of the MSVC crash. |
Current master implements a scaling of the raw NNUE output value with a formula equivalent to 'eval = alpha * NNUE_output', where the scale factor alpha varies between 1.8 (for early middle game) and 0.9 (for pure endgames). This feature allows Stockfish to keep material on the board when she thinks she has the advantage, and to seek exchanges and simplifications when she thinks she has to defend. This patch slightly offsets the turning point between these two strategies, by adding to Stockfish's evaluation a small "optimism" value before actually doing the scaling. The effect is that SF will play a little bit more risky, trying to keep the tension a little bit longer when she is defending, and keeping even more material on the board when she has an advantage. We note that this patch is similar in spirit to the old "Contempt" idea we used to have in classical Stockfish, but this implementation differs in two key points: a) it has been tested as an Elo-gainer against master; b) the values output by the search are not changed on average by the implementation (in other words, the optimism value changes the tension/exchange strategy, but a displayed value of 1.0 pawn has the same signification before and after the patch). See the old comment official-stockfish/Stockfish#1361 (comment) for some images illustrating the ideas. ------- finished yellow at STC: LLR: -2.94 (-2.94,2.94) <0.00,2.50> Total: 165048 W: 41705 L: 41611 D: 81732 Ptnml(0-2): 565, 18959, 43245, 19327, 428 https://tests.stockfishchess.org/tests/view/61942a3dcd645dc8291c876b passed LTC: LLR: 2.95 (-2.94,2.94) <0.50,3.00> Total: 121656 W: 30762 L: 30287 D: 60607 Ptnml(0-2): 87, 12558, 35032, 13095, 56 https://tests.stockfishchess.org/tests/view/61962c58cd645dc8291c8877 ------- How to continue from there? a) the shape (slope and amplitude) of the sigmoid used to compute the optimism value could be tweaked to try to gain more Elo, so the parameters of the sigmoid function in line 391 of search.cpp could be tuned with SPSA. Manual tweaking is also possible using this Desmos page: https://www.desmos.com/calculator/jhh83sqq92 b) in a similar vein, with two recents patches affecting the scaling of the NNUE evaluation in evaluate.cpp, now could be a good time to try a round of SPSA tuning of the NNUE network; c) this patch will tend to keep tension in middlegame a little bit longer, so any patch improving the defensive aspect of play via search extensions in risky, tactical positions would be welcome. ------- closes official-stockfish/Stockfish#3797 Bench: 6184852
c204661
to
a5a89b2
Compare
Merged via a5a89b2 |
for reference, this patch lead to a 15Elo jump on NCM (https://nextchessmove.com/dev-builds#a5a89b27c8e3225fb453d603bc4515d32bb351c3). Somewhat contrary to our tests against SF7 (https://tests.stockfishchess.org/tests/view/619808ffcd645dc8291c8978 and https://tests.stockfishchess.org/tests/view/6197faebcd645dc8291c8974), which might be due to the difference in book between the two tests. |
That's nice :-) |
Ex post, maybe it is normal that the effect of the patch is greater with a more drawish book. Indeed, we can open the Dermos sheet at https://www.desmos.com/calculator/jhh83sqq92 (1) y = x Curve (1) is the nnue eval before the patch, with is then scaled by alpha as described in the PR. The critical part of the graph is the region just on the left of zero. We see that the two curves cross at nnue=-115 (where the patch at no effect), but stay very close to each other up to nnue=-50. I am currently testing a version with C=100, x0=-30 to get about the same asymptotic behaviour, the same crossing point but a faster separation in the -100 < nnue < 0 interval: |
I do all my development in MSVC so I debugged the appveyor error as suggested above. @vondele @snicolet |
@mstembera I have no problem with fixes that are needed to compile with msvc, as long as they are reasonable and related to mistakes made in SF code (e.g. C++ compliance). However, it is just not a 'supported' (tested) platform any longer. The stacksize will need to be larger (probably 4 or 8MB) to be sure we can search to MAX_PLY. |
@vondele Great. Thanks! I did compile avx2 w/ msvc 2017 and get the correct bench. |
Current master implements a scaling of the raw NNUE output value with a formula
equivalent to 'eval = alpha * NNUE_output', where the scale factor alpha varies
between 1.8 (for early middle game) and 0.9 (for pure endgames). This feature
allows Stockfish to keep material on the board when she thinks she has the advantage,
and to seek exchanges and simplifications when she thinks she has to defend.
This patch slightly offsets the turning point between these two strategies, by adding
to Stockfish's evaluation a small "optimism" value before actually doing the scaling.
The effect is that SF will play a little bit more risky, trying to keep the tension a
little bit longer when she is defending, and keeping even more material on the board
when she has an advantage.
We note that this patch is similar in spirit to the old "Contempt" idea we used to have
in classical Stockfish, but this implementation differs in two key points:
a) it has been tested as an Elo-gainer against master;
b) the values output by the search are not changed on average by the implementation
(in other words, the optimism value changes the tension/exchange strategy, but a
displayed value of 1.0 pawn has the same signification before and after the patch).
See the old comment #1361 (comment)
for some images illustrating the ideas.
finished yellow at STC:
LLR: -2.94 (-2.94,2.94) <0.00,2.50>
Total: 165048 W: 41705 L: 41611 D: 81732
Ptnml(0-2): 565, 18959, 43245, 19327, 428
https://tests.stockfishchess.org/tests/view/61942a3dcd645dc8291c876b
passed LTC:
LLR: 2.95 (-2.94,2.94) <0.50,3.00>
Total: 121656 W: 30762 L: 30287 D: 60607
Ptnml(0-2): 87, 12558, 35032, 13095, 56
https://tests.stockfishchess.org/tests/view/61962c58cd645dc8291c8877
How to continue from there?
a) the shape (slope and amplitude) of the sigmoid used to compute the optimism value
could be tweaked to try to gain more Elo, so the parameters of the sigmoid function
in line 391 of search.cpp could be tuned with SPSA. Manual tweaking is also possible
using this Desmos page: https://www.desmos.com/calculator/jhh83sqq92
b) in a similar vein, with two recents patches affecting the scaling of the NNUE
evaluation in evaluate.cpp, now could be a good time to try a round of SPSA tuning
of the NNUE network;
c) this patch will tend to keep tension in middlegame a little bit longer, so any
patch improving the defensive aspect of play via search extensions in risky,
tactical positions would be welcome.
closes #3797
Bench: 6184852