This is a respin of #1439 to keep things simple.
Add a logarithmic term to the contempt computation, increase the maximal contempt, and lower the contempt offset.
This makes the contempt more dynamic, giving a boost in balanced positions without skewing too much in unbalanced positions. This helps because dynamic contempt is in general a good thing, for instance at LTC, whereas too high a contempt rapidly contaminates play.
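To illustrate why a logarithmic term boosts balanced positions more than unbalanced ones, here is a minimal sketch in Python. It is not the code of this patch: the function name `dynamic_contempt` and all constants (`base`, `linear_div`, `log_scale`, `cap`) are hypothetical values chosen only to show the shape of such a formula.

```python
import math


def dynamic_contempt(best_value, base=10, linear_div=10, log_scale=3.0, cap=70):
    """Illustrative contempt as a function of the current search score.

    All parameters are hypothetical and are NOT taken from the patch:
      base       -- static contempt offset
      linear_div -- divisor of the linear term
      log_scale  -- weight of the logarithmic term
      cap        -- maximal magnitude of the dynamic part
    """
    sign = (best_value > 0) - (best_value < 0)
    linear = best_value / linear_div
    # log1p(|v|) grows quickly near 0 and flattens for large |v|, so its
    # relative contribution is largest in roughly balanced positions.
    log_term = sign * log_scale * math.log1p(abs(best_value))
    dynamic = max(-cap, min(cap, linear + log_term))
    return base + dynamic


def log_share(best_value, **kwargs):
    """Fraction of the uncapped dynamic part that comes from the log term."""
    linear_div = kwargs.get("linear_div", 10)
    log_scale = kwargs.get("log_scale", 3.0)
    linear = abs(best_value) / linear_div
    log_term = log_scale * math.log1p(abs(best_value))
    return log_term / (linear + log_term)
```

With these assumed constants, `log_share(20)` is much larger than `log_share(400)`: the logarithm dominates near equality and the linear term takes over as the score grows, while the clamp bounds the total.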
There has been extensive work on this patch, with two major versions tested: this one, with a contempt of 10 (PR in the following), and a version with a slightly higher contempt of 12 (HC in the following).
The original HC passed STC and LTC and was found to be even with master in separate matches against SF7 and SF8.
The PR passed a single LTC test [0,4] against HC. Attempts to raise the contempt from 12 to 15, 18, and 20 did not pass [-3,1] tests. An attempt with contempt 22 is still running.
The PR raises the draw rate in self-play at STC from 56% to 59%, higher than expected from the Elo gain.
There have been several attempts to simplify the patch by removing the logarithmic term; the most promising one (and the only one that has not failed) is still running (but see this discussion there). Other attempts have included compensating for the logarithm with a static contempt.
It must be mentioned that a version of the PR with contempt 0 did not pass STC [0,5].
Further work
References
HC, STC
LLR: 2.96 (-2.94,2.94) [0.00,5.00]
Total: 159343 W: 34489 L: 33588 D: 91266
HC, LTC
LLR: 2.97 (-2.94,2.94) [0.00,5.00]
Total: 47491 W: 7825 L: 7517 D: 32149
master vs SF7, STC: +165 Elo
HC vs SF7, STC: +164 Elo
master vs SF8, STC: +66 Elo
HC vs SF8, STC: +68 Elo
master vs SF8, LTC: +76 Elo
HC vs SF8, LTC: +75 Elo
PR vs HC
LLR: 2.96 (-2.94,2.94) [0.00,4.00]
Total: 48385 W: 7437 L: 7143 D: 33805
PR, contempt 15 vs PR
PR, contempt 18 vs PR
PR, contempt 20 vs PR
PR, contempt 22 vs PR
Linear + sign vs PR
master draw rate
PR draw rate
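
The W/L/D totals listed above can be turned into a draw rate and a rough Elo estimate with a few lines of Python. This is a sketch using the standard logistic Elo formula; the helper names `summarize` and `sprt_bounds` are made up for illustration. The error rates `alpha = beta = 0.05` are an assumption, chosen because they reproduce the (-2.94, 2.94) bounds shown in the LLR lines.

```python
import math


def summarize(wins, losses, draws):
    """Draw rate, score and logistic Elo estimate from a W/L/D record."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    # Logistic Elo model: score = 1 / (1 + 10 ** (-elo / 400))
    elo = 400 * math.log10(score / (1 - score))
    return {"games": games, "draw_rate": draws / games, "score": score, "elo": elo}


def sprt_bounds(alpha=0.05, beta=0.05):
    """Lower and upper LLR stopping bounds of a Wald SPRT (assumed error rates)."""
    return math.log(beta / (1 - alpha)), math.log((1 - beta) / alpha)


if __name__ == "__main__":
    records = {
        "HC, STC": (34489, 33588, 91266),
        "HC, LTC": (7825, 7517, 32149),
        "PR vs HC": (7437, 7143, 33805),
    }
    for name, wld in records.items():
        s = summarize(*wld)
        print(f"{name}: games={s['games']} "
              f"draw rate={s['draw_rate']:.1%} Elo={s['elo']:+.2f}")
    lower, upper = sprt_bounds()
    print(f"SPRT bounds: ({lower:.2f}, {upper:.2f})")
```

The Elo figure is a point estimate only. The SPRT result itself says whether the test crossed the upper or lower LLR bound for the stated Elo interval, not how large the gain is.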