Use per-thread dynamic contempt #1515
Closed
TL;DR: this patch has the following effects:

* for Threads=1: **non-functional**
* for Threads>1:
  * with MultiPV=1: **no regression, little to no ELO gain**
  * with MultiPV>1: **clear improvement over master**

First, I tried testing at standard MultiPV=1 play with [0,5] bounds. This yielded 2 yellow tests and 1 red test:

5+0.05, Threads=5:
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 82689 W: 16439 L: 16190 D: 50060
http://tests.stockfishchess.org/tests/view/5aa93a5a0ebc5902952892e6

5+0.05, Threads=8:
LLR: -2.96 (-2.94,2.94) [0.00,5.00]
Total: 27164 W: 4974 L: 4983 D: 17207
http://tests.stockfishchess.org/tests/view/5ab2639b0ebc5902a6fbefd5

5+0.5, Threads=16:
LLR: -2.97 (-2.94,2.94) [0.00,5.00]
Total: 41396 W: 7127 L: 7082 D: 27187
http://tests.stockfishchess.org/tests/view/5ab124220ebc59029516cb62

Then, I tested with Skill Level=17 (implicitly MultiPV=4), showing a clear improvement:

5+0.05, Threads=5:
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 3498 W: 1316 L: 1135 D: 1047
http://tests.stockfishchess.org/tests/view/5ab4b6580ebc5902932aeca2

Next, I tested the patch with MultiPV=1 again, this time checking for non-regression ([-3, 1]):

5+0.5, Threads=5:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 65575 W: 12786 L: 12745 D: 40044
http://tests.stockfishchess.org/tests/view/5ab4e8500ebc5902932aecb3

Finally, I ran some tests with a fixed number of games, checking whether reverting dynamic contempt gains more ELO with Skill Level=17 (i.e. MultiPV) than applying the "prevScore" fix and this patch. These tests showed that this patch gains 15 ELO when playing with Skill Level=17:

5+0.05, Threads=3, "revert dynamic contempt" vs. "WITHOUT this patch":
ELO: -11.43 +-4.1 (95%) LOS: 0.0%
Total: 20000 W: 7085 L: 7743 D: 5172
http://tests.stockfishchess.org/tests/view/5ab636450ebc590295d88536

5+0.05, Threads=3, "revert dynamic contempt" vs. "WITH this patch":
ELO: -26.42 +-4.1 (95%) LOS: 0.0%
Total: 20000 W: 6661 L: 8179 D: 5160
http://tests.stockfishchess.org/tests/view/5ab62e680ebc590295d88524

=== FAQ ===

**Why should this be committed?**

I believe that the gain for multi-thread MultiPV search is a sufficient justification for this otherwise neutral change. I also believe this implementation of dynamic contempt is more logical, although this may be just my opinion.

**Why is per-thread contempt better at MultiPV?**

A likely explanation for the gain in MultiPV mode is that during search each thread independently switches between rootMoves, and through the shared contempt score the threads skew each other's evaluation.

**Why were the tests done with Skill Level=17?**

This was originally suggested by @Hanamuke. The idea is that with Skill Level, Stockfish sometimes also plays moves it thinks are slightly sub-optimal, so the test checks the quality of all moves offered by the MultiPV search.

**Why are the ELO differences so huge?**

This is most likely because of the nature of Skill Level mode -- since it is slower and weaker than normal mode, bugs in evaluation have a much greater effect.
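The 15 ELO figure quoted for the fixed-game tests can be rechecked from the W/L/D counts alone. The sketch below is not part of the patch or of the testing framework; it assumes the standard logistic Elo model, and the helper name `elo_from_wld` is made up for illustration. Subtracting the two results treats "revert dynamic contempt" as a common opponent, which is only an approximation.

```python
import math


def elo_from_wld(wins: int, losses: int, draws: int) -> float:
    """Elo difference implied by a match score under the logistic model.

    Hypothetical helper for illustration; not taken from Stockfish or fishtest.
    """
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games  # draws count as half a point
    return -400.0 * math.log10(1.0 / score - 1.0)


# W/L/D counts copied from the two fixed-game tests in the description.
without_patch = elo_from_wld(7085, 7743, 5172)  # revert vs. WITHOUT this patch
with_patch = elo_from_wld(6661, 8179, 5160)     # revert vs. WITH this patch

# Both values are from the point of view of "revert dynamic contempt",
# so a more negative number means the opponent (master or patch) is stronger.
gain = without_patch - with_patch

print(f"revert vs. WITHOUT patch: {without_patch:+.2f}")
print(f"revert vs. WITH patch:    {with_patch:+.2f}")
print(f"implied gain of the patch: {gain:.2f}")
```

The two computed values should land close to the reported -11.43 and -26.42, and their difference close to the 15 ELO quoted above.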
I think this is simply more logical and simpler than the original one, quite apart from the Elo gain.
Even though contempt is reset to zero on thread creation, there might be some leftover contempt from a previous search.
It turns out this is not actually needed.
Merged via c8ef80f. Thank you for the patch, and for the great, informative commit message -- a pleasure to merge!