Contempt (#26)
* Contempt

* Regular compile no longer contains search param settings used for tuning

* Updated README.md

* Updated version number
rosenthj authored Sep 21, 2019
1 parent 9f84595 commit 0c46fa2
Showing 7 changed files with 165 additions and 39 deletions.
28 changes: 24 additions & 4 deletions README.md
@@ -9,23 +9,43 @@ Winter has relied on many machine learning algorithms and techniques over the co
As of Winter 0.6.2, the evaluation function relies on a small neural network for more precise evaluations.

## Installation
In order to run it on Linux, just compile it via "make" in the root directory and then run it from the root directory. Tested with clang (recommended) and gcc (default).
In order to run it on Linux, just compile it via "make" in the root directory and then run it from the root directory.

When running Winter from the command line, be sure to call "uci" to get information about the current version, including its number and the detected architecture.
The makefile will assume you are making a native build, but if you are making a build for a different system, it should be reasonably straightforward to modify yourself.

Winter does not rely on any external libraries aside from the Standard Template Library. All algorithms have been implemented from scratch. As of Winter 0.6.2 I have started to build an external codebase for neural network training.

## Contempt
Winter versions 0.7 and later have support for contempt settings. In most engines contempt is used to reduce the number of draws and thus increase performance against weaker engines, often at the cost of performance in self play or against stronger opposition.

Winter uses a novel contempt implementation that utilizes the fact that Winter calculates win, draw and loss probabilities. Increasing contempt in Winter reduces how much it values draws for itself and increases how much it believes the opponent values draws.
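
The weighting can be sketched as follows. This is a minimal sketch based on this commit's change to `NetForward`; `contempt_score` and the logit arguments are illustrative names, not Winter's API. A contempt weight `c` of 0.5 is neutral; larger values weight the pure win probability more heavily, i.e. the side to move values draws less.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Logistic function, as used for the win / win-or-draw network heads.
double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Contempt-weighted internal score (illustrative helper, not Winter's API).
// c = 0.5 is neutral; c > 0.5 weights the win probability more heavily,
// so draws are worth less to the side to move.
long contempt_score(double win_logit, double win_draw_logit, double c) {
  const double kEpsilon = 0.000001;
  double wpct = sigmoid(win_logit) * c + sigmoid(win_draw_logit) * (1.0 - c);
  wpct = std::max(std::min(wpct, 1.0 - kEpsilon), kEpsilon);
  // Map the probability back to a logit-style score, scaled by 1024.
  return std::lround(std::log(wpct / (1.0 - wpct)) * 1024);
}
```

With symmetric heads (`win_logit = -1`, `win_draw_logit = 1`), the neutral weight `c = 0.5` yields a score of 0, while `c > 0.5` drags the internal score negative, which is exactly the bias the recalibration described in the next subsection compensates for.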

#### Centipawn output recalibration

Internally, Winter tends to hold a negative score under positive contempt and vice versa. This is natural: positive contempt reduces the value of a draw for the side to move, so the internal score is negatively biased.

In order to report more realistic scores, Winter applies a bias adjustment to non-mate scores. The formula assumes the draw probability is at its maximum and readjusts the score based on that, which results in an overcorrection: positive contempt values produce a reported upper-bound score, and negative contempt values produce a reported lower-bound score.
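
The adjustment can be sketched as follows, mirroring this commit's `GetUnbiasedScore` (the function name `unbiased_score` and explicit weight arguments are illustrative). The contempt-weighted probability `f` is decomposed into a win weight `w` and a win-or-draw weight `wd` under the maximum-draw assumption, then rescored with the neutral weighting `(w + wd) / 2`.

```cpp
#include <cassert>
#include <cmath>

double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Illustrative version of the bias adjustment. c_us is the side-to-move
// contempt weight and c_them = 1 - c_us is the opponent's. With neutral
// weights (0.5 / 0.5) the score passes through unchanged.
long unbiased_score(long score, double c_us, double c_them) {
  double f = sigmoid(score / 1024.0);
  double w, wd;
  if (f == c_them) return 0;
  if (f > c_them) {   // enough probability mass to imply some wins
    w = (f - c_them) / c_us;
    wd = 1.0;
  } else {            // assume no wins: only draws and losses
    w = 0.0;
    wd = f / c_them;
  }
  double x = (w + wd) / 2.0;  // re-evaluate with the neutral weighting
  return std::lround(std::log(x / (1.0 - x)) * 1024);
}
```

Under positive contempt (e.g. weights 0.8 / 0.2) an internal score of 0 maps to a positive reported score, illustrating the upper-bound overcorrection; negative contempt mirrors this on the lower side.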

For high contempt values it is recommended to adjust adjudication settings.

#### Armageddon
An increasingly popular format in human chess is Armageddon. To the author's knowledge, Winter is the first engine to natively support Armageddon play via a UCI option. Internally this works by setting contempt to a high positive value when playing White and a high negative value when playing Black.
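
The mapping from a UCI-style contempt value to the internal draw-valuation weights can be sketched as below, mirroring this commit's `SetContempt` (`weights_for` is an illustrative name). A value of 0 gives the neutral 0.5/0.5 split; the Armageddon White setting of 60 gives White a weight of 0.8 and Black the complementary 0.2.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Map a UCI-style contempt value in [-100, 100] to the pair of internal
// draw-valuation weights (illustrative mirror of the commit's SetContempt).
std::array<double, 2> weights_for(int value, int color) {
  std::array<double, 2> contempt;
  double f = (value + 100) * 0.005;  // -100 -> 0.0, 0 -> 0.5, 100 -> 1.0
  contempt[color] = f;               // side with contempt values draws less
  contempt[color ^ 1] = 1.0 - f;     // opponent gets the complement
  return contempt;
}
```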

At the moment, contempt is not set to the maximum in Armageddon mode; in the limited testing done so far, this proved to perform more consistently. This may change in the future.

In Armageddon mode, score recalibration is not performed. The recalibration formula for regular contempt assumes that contempt pushes the score away from a true evaluation that is symmetric between the two sides; in Armageddon, the true evaluation is not symmetric.

## Training Your Own Winter Flavor

At the moment, training a neural network for use in Winter is only supported in a very limited way. I intend to shortly release the script that was used to train the initial 0.6.2 net.

In the following I describe the steps to get from a pgn game database to a network for Winter.

1. Get and compile the latest [pgn-extract](https://www.cs.kent.ac.uk/people/staff/djb/pgn-extract/) by David J. Barnes.
2. Use pgn-extract on your .pgn file with the arguments `-Wuci` and `--notags`. This will create a file readable by Winter.
3. Run Winter from the command line. Call `gen_eval_csv filename out_filename`, where filename is the name of the file generated in step 2 and out_filename is the name Winter should give the generated file. This will create a .csv dataset file (described below) based on pseudo-quiescent positions from the input games.
4. Train a neural network on the dataset. It is recommended to try to train something simple for now. Keep in mind I would like to refrain from making Winter rely on any external libraries.
5. Integrate the network into Winter. In the future I will probably support loading external weight files, but for now you need to replace the appropriate entries in `src/net_weights.h`.
6. `make clean` and `make` (or `make no_bmi`)
6. `make clean` and `make`

The structure of the .csv dataset generated in step 3 is as follows. The first column is a boolean value indicating whether the player to move won. The second column is a boolean value indicating whether the player to move scored at least a draw. The remaining columns are features, which are somewhat sparse. An overview of these features can be found in `src/net_evaluation.h`.
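
A row of that dataset could be consumed as sketched below (a hypothetical reader, assuming only the layout just described: two boolean label columns followed by feature columns; `Sample` and `parse_row` are not part of Winter).

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// One training sample: two game-result labels plus the feature columns.
struct Sample {
  bool stm_won = false;            // column 0: side to move won
  bool stm_at_least_drew = false;  // column 1: side to move drew or won
  std::vector<double> features;    // remaining, mostly sparse columns
};

Sample parse_row(const std::string &line) {
  Sample s;
  std::stringstream ss(line);
  std::string cell;
  int col = 0;
  while (std::getline(ss, cell, ',')) {
    double v = std::stod(cell);
    if (col == 0)      s.stm_won = (v != 0.0);
    else if (col == 1) s.stm_at_least_drew = (v != 0.0);
    else               s.features.push_back(v);
    ++col;
  }
  return s;
}
```
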
2 changes: 1 addition & 1 deletion src/general/settings.h
@@ -33,7 +33,7 @@
namespace settings {

const std::string engine_name = "Winter";
const std::string engine_version = "0.6.7a";
const std::string engine_version = "0.7";
const std::string engine_author = "Jonathan Rosenthal";

#if defined(__BMI2__)
52 changes: 45 additions & 7 deletions src/net_evaluation.cc
@@ -14,11 +14,13 @@

using namespace net_features;

constexpr float kEpsilon = 0.000001;
constexpr size_t block_size = 16;
// The (post-) activation block size is only needed if dimension is different from preactivation
//constexpr size_t act_block_size = 2 * block_size;
using NetLayerType = Vec<float, block_size>;
//using CReLULayerType = Vec<float, act_block_size>;
std::array<float, 2> contempt = { 0.5, 0.5 };

namespace {
const int net_version = 19081700;
@@ -32,7 +34,7 @@ float sigmoid(float x) {
std::vector<NetLayerType> net_input_weights(kTotalNumFeatures, 0);
NetLayerType bias_layer_one(0);

std::vector<NetLayerType> second_layer_weights(32 * 16, 0);
std::vector<NetLayerType> second_layer_weights(16 * 16, 0);
NetLayerType bias_layer_two(0);

//NetLayerType output_weights(0);
@@ -614,9 +616,7 @@ T ScoreBoard(const Board &board) {
return score;
}

Score NetForward(NetLayerType &layer_one) {
constexpr float epsilon = 0.000001;

Score NetForward(NetLayerType &layer_one, float c = 0.5) {
layer_one += bias_layer_one;
layer_one.relu();
// layer_one.ns_prelu(net_hardcode::l1_activation_weights);
@@ -635,8 +635,9 @@ Score NetForward(NetLayerType &layer_one) {
float win = layer_two.dot(win_weights) + win_bias;
float win_draw = layer_two.dot(win_draw_weights) + win_draw_bias;

float wpct = (sigmoid(win) + sigmoid(win_draw)) / 2;
wpct = std::max(std::min(wpct, 1-epsilon), epsilon);
float wpct = sigmoid(win) * c + sigmoid(win_draw) * (1 - c);
// wpct = wpct * (1-kEpsilon) + 0.5 * kEpsilon;
wpct = std::max(std::min(wpct, 1-kEpsilon), kEpsilon);
float output = std::log(wpct / (1-wpct));

return std::round(output * 1024);
@@ -650,7 +651,7 @@ Score ScoreBoard(const Board &board) {
else {
layer_one = ScoreBoard<NetLayerType, kBlack>(board);
}
return NetForward(layer_one);
return NetForward(layer_one, contempt[board.get_turn()]);
}

void init_weights() {
@@ -958,5 +959,42 @@ void EstimateFeatureImpact() {
}
}

void SetContempt(int value, Color color) {
float f = (value + 100) * 0.005;
contempt[color] = f;
contempt[color ^ 0x1] = 1-f;
}

std::array<Score, 2> GetDrawArray() {
// float f = contempt[0] * (1-kEpsilon) + 0.5 * kEpsilon;
float f = std::max(std::min(contempt[0], 1-kEpsilon), kEpsilon);
f = std::log(f / (1-f));
Score res = std::round(f * 1024);
std::array<Score, 2> result = { -res, res };
return result;
}

Score GetUnbiasedScore(Score score, Color color) {
Color not_color = color ^ 0x1;
float f = sigmoid(score / 1024.0);
float w, wd;
if (f == contempt[not_color]) {
return 0;
}
else if (f > contempt[not_color]) {
w = (f - contempt[not_color]) / contempt[color];
wd = 1.0;
}
else {
w = 0.0;
wd = f / contempt[not_color];
}
float x = (w + wd) / 2;
x = std::log(x / (1-x));
return std::round(x * 1024);
// w * vw + wd * vwd = f
// if f > vwd: wd = 1, w = (f - vwd) / vw
// else w = 0, wd = f / vwd
}

}
4 changes: 4 additions & 0 deletions src/net_evaluation.h
@@ -47,6 +47,10 @@ void EstimateFeatureImpact();
void GenerateDatasetFromUCIGames(std::string filename, std::string out_name = "eval_dataset.csv",
size_t reroll_pct = 0);

void SetContempt(int value, Color color);
std::array<Score, 2> GetDrawArray();
Score GetUnbiasedScore(Score score, Color color);

}

// TODO: Move to external file
Expand Down
68 changes: 52 additions & 16 deletions src/search.cc
@@ -62,6 +62,10 @@ int kNodeCountSampleAt = 1000;
//int kNodeCountSampleEvalAt = 5000;
const int kMaxDepthSampled = 32;

std::array<Score, 2> draw_score = { 0, 0 };
int contempt = 0;
bool armageddon = false;

int rsearch_mode;
Milliseconds rsearch_duration;
Depth rsearch_depth;
@@ -91,10 +95,18 @@
return kFutilityMargins;
}

#ifdef TUNE
Score kSNMPMargin = 588;// 587
Array2d<Depth, 64, 64> lmr_reductions = init_lmr_reductions(1.34);//135
Vec<Score, 4> kFutileMargin = init_futility_margins(1274);//900
std::array<size_t, 5> kLMP = {0, 6, 9, 13, 18};

#else
constexpr Score kSNMPMargin = 588;// 587
Array2d<Depth, 64, 64> lmr_reductions = init_lmr_reductions(1.34);//135
const Vec<Score, 4> kFutileMargin = init_futility_margins(1274);//900
const std::array<size_t, 5> kLMP = {0, 6, 9, 13, 18};
#endif

template<NodeType node_type>
const Depth get_lmr_reduction(const Depth depth, const size_t move_number) {
@@ -611,7 +623,7 @@ Score QuiescentSearch(Thread &t, Score alpha, Score beta) {

//End search immediately if trivial draw is reached
if (t.board.IsTriviallyDrawnEnding()) {
return 0;
return draw_score[t.board.get_turn()];
}

//TT probe
@@ -802,6 +814,7 @@ Score AlphaBeta(Thread &t, Score alpha, Score beta, Depth depth, bool expected_c
assert(node_type != NodeType::kPV || !expected_cut_node);

const Score original_alpha = alpha;
const Score score_draw = draw_score[t.board.get_turn()];
Score lower_bound_score = kMinScore+t.board.get_num_made_moves();

//Immediately return 0 if we detect a draw.
@@ -810,7 +823,7 @@
if (t.board.IsFiftyMoveDraw() && t.board.InCheck() && t.board.GetMoves<kNonQuiescent>().empty()) {
return kMinScore+t.board.get_num_made_moves();
}
return 0;
return score_draw;
}

//We drop to QSearch if we run out of depth.
@@ -893,7 +906,7 @@ Score AlphaBeta(Thread &t, Score alpha, Score beta, Depth depth, bool expected_c
if (in_check) {
return kMinScore+t.board.get_num_made_moves();
}
return 0;
return score_draw;
}

// if (Mode == kSamplingSearchMode && node_type == NodeType::kNW && depth <= kMaxDepthSampled) {
@@ -1101,21 +1114,22 @@ Score RootSearchLoop(Thread &t, Score original_alpha, Score beta, Depth current_

Score alpha = original_alpha;
Score lower_bound_score = kMinScore;
const Score score_draw = draw_score[t.board.get_turn()];
//const bool in_check = board.InCheck();
if (settings::kRepsForDraw == 3 && alpha < -1 && t.board.MoveInListCanRepeat(moves)) {
if (beta <= 0) {
return 0;
if (settings::kRepsForDraw == 3 && alpha < score_draw-1 && t.board.MoveInListCanRepeat(moves)) {
if (beta <= score_draw) {
return score_draw;
}
alpha = -1;
alpha = score_draw-1;
}
const bool in_check = t.board.InCheck();
for (size_t i = 0; i < moves.size(); ++i) {
t.set_move(moves[i]);
t.board.Make(moves[i]);
if (i == 0) {
Score score = -AlphaBeta<NodeType::kPV, Mode>(t, -beta, -alpha, current_depth - 1);
if (settings::kRepsForDraw == 3 && score < 0 && t.board.CountRepetitions() >= 2) {
score = 0;
if (settings::kRepsForDraw == 3 && score < score_draw && t.board.CountRepetitions() >= 2) {
score = score_draw;
}
t.board.UnMake();
if (score >= beta) {
@@ -1137,8 +1151,8 @@ Score RootSearchLoop(Thread &t, Score original_alpha, Score beta, Depth current_
if (score > alpha) {
score = -AlphaBeta<NodeType::kPV, Mode>(t, -beta, -alpha, current_depth - 1);
}
if (settings::kRepsForDraw == 3 && score < 0 && t.board.CountRepetitions() >= 2) {
score = 0;
if (settings::kRepsForDraw == 3 && score < score_draw && t.board.CountRepetitions() >= 2) {
score = score_draw;
}
lower_bound_score = std::max(score, lower_bound_score);
t.board.UnMake();
@@ -1305,8 +1319,13 @@ void Thread::search() {
<< " time " << time_used.count()
<< " nodes " << node_count << " nps " << ((1000*node_count) / (time_used.count()+1));
if (!is_mate_score(score)) {
std::cout << " score cp "
<< (score / 8);
std::cout << " score cp ";
if (armageddon) {
std::cout << (score / 8);
}
else {
std::cout << (net_evaluation::GetUnbiasedScore(score, board.get_turn()) / 8);
}
}
else {
Score m_score = board.get_num_made_moves();
@@ -1345,6 +1364,13 @@
template<int Mode>
Move RootSearch(Board &board, Depth depth, Milliseconds duration = Milliseconds(24 * 60 * 60 * 1000)) {
table::UpdateGeneration();
if (armageddon) {
net_evaluation::SetContempt(60, kWhite);
}
else {
net_evaluation::SetContempt(contempt, board.get_turn());
}
draw_score = net_evaluation::GetDrawArray();
min_ply = board.get_num_made_moves();
Threads.reset_node_count();
Threads.reset_depths();
@@ -2120,16 +2146,26 @@ std::vector<Board> GenerateEvalSampleSet(std::string filename) {
return boards;
}

void SetContempt(int contempt_) {
contempt = contempt_;
}

void SetArmageddon(bool armageddon_) {
armageddon = armageddon_;
}

#ifdef TUNE
void SetFutilityMargin(Score score) {
//kFutileMargin = init_futility_margins(score);
kFutileMargin = init_futility_margins(score);
}

void SetSNMPMargin(Score score) {
//kSNMPMargin = score;
kSNMPMargin = score;
}

void SetLMRDiv(double div) {
// lmr_reductions = init_lmr_reductions(div);
lmr_reductions = init_lmr_reductions(div);
}
#endif

}
5 changes: 5 additions & 0 deletions src/search.h
@@ -65,9 +65,14 @@ void LoadSearchVariablesHardCoded();
void EvaluateCaptureMoveValue(int n);
void EvaluateScoreDistributions(const int focus);

void SetContempt(int contempt);
void SetArmageddon(bool armageddon);

#ifdef TUNE
void SetFutilityMargin(Score score);
void SetSNMPMargin(Score score);
void SetLMRDiv(double div);
#endif

}

