Here you can find the scripts and instructions for reproducing all the figures in my thesis. I use `matplotlib.pyplot` for everything described here. The plots require input data, based on the CSV files generated by the scripts described in the top-level README.
All plot scripts are found in the `python/plots` directory. To make sure that imports work, either run the scripts from within the `python/plots` directory or `export PYTHONPATH=python/plots` first. Some scripts also require `python/model` to be in the path. Cover all cases with
export PYTHONPATH="$PYTHONPATH:python:python/plots:python/model"
This plot shows the distribution of player ratings, with different colors for the DDK (double-digit kyu), SDK (single-digit kyu) and DAN categories.
The ratings data must be provided in a plain text file, one rating per line. The script `analyzedataset.py` with the `rating` command writes out this data from a dataset list CSV. Extract the data, then create the plot with `ratings.py` as follows:
python3 python/analyzedataset.py rating csv/games_labels.csv featurecache csv/rating.csv
python3 python/plots/ratings.py csv/rating.csv
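For orientation, the core of such a plot can be sketched in a few lines. The category thresholds and sample ratings below are illustrative assumptions, not the boundaries or data used by the actual `ratings.py`:

```python
# Minimal sketch of a ratings histogram split into DDK/SDK/DAN categories.
# The thresholds (1500 and 1950) are made-up illustration values.
import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is needed
import matplotlib.pyplot as plt

def categorize(rating, ddk_max=1500.0, sdk_max=1950.0):
    """Assign a rating to one of three illustrative categories."""
    if rating < ddk_max:
        return "DDK"
    if rating < sdk_max:
        return "SDK"
    return "DAN"

# In the real workflow the ratings come from csv/rating.csv, one per line.
ratings = [1234.5, 1480.0, 1600.2, 1890.1, 2010.7, 2150.3]
by_cat = {"DDK": [], "SDK": [], "DAN": []}
for r in ratings:
    by_cat[categorize(r)].append(r)

fig, ax = plt.subplots()
ax.hist(list(by_cat.values()), bins=10, stacked=True, label=list(by_cat))
ax.set_xlabel("Rating")
ax.set_ylabel("Players")
ax.legend()
fig.savefig("ratings_sketch.png")
```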
Somewhat related, the command `analyzedataset.py basics path/to/games.csv x y` (with two dummy arguments) prints the total number of games and the number of white wins (Score=0).
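The same count can be reproduced in a few lines of Python. The `Score` column and the Score=0 convention for a white win come from the description above; the sample rows are made up:

```python
# Count total games and white wins (Score=0) in a games list CSV,
# mirroring what `analyzedataset.py basics` prints. The sample data
# is made up; in practice the rows come from path/to/games.csv.
import csv
import io

sample = """File,Score
game1.sgf,0
game2.sgf,1
game3.sgf,0
game4.sgf,1
"""

total = 0
white_wins = 0
for row in csv.DictReader(io.StringIO(sample)):
    total += 1
    if float(row["Score"]) == 0.0:
        white_wins += 1

print(f"{total} games, {white_wins} white wins")
```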
This plot shows the a-priori distribution of points loss per move across the different rating categories.
First, get the rating labels and points loss for every move in the training dataset using `analyzedataset.py` with the `loss_rating` command, then pass the file name to the script `ploss_rating.py`:
python3 python/analyzedataset.py loss_rating csv/games_labels.csv featurecache csv/loss_rating.csv
python3 python/plots/ploss_rating.py csv/loss_rating.csv
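A minimal sketch of this kind of plot follows. The `Rating,PointsLoss` CSV layout and the rows are assumptions for illustration; check `csv/loss_rating.csv` for the real format:

```python
# Group per-move points loss by rating category and draw one overlaid
# histogram per category, roughly the shape of what ploss_rating.py shows.
import csv
import io

import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt

sample = """Rating,PointsLoss
DDK,4.5
DDK,7.2
SDK,2.1
SDK,3.3
DAN,0.8
DAN,1.4
"""

loss_by_cat = {}
for row in csv.DictReader(io.StringIO(sample)):
    loss_by_cat.setdefault(row["Rating"], []).append(float(row["PointsLoss"]))

fig, ax = plt.subplots()
for cat, losses in loss_by_cat.items():
    ax.hist(losses, bins=20, density=True, alpha=0.5, label=cat)
ax.set_xlabel("Points loss per move")
ax.set_ylabel("Density")
ax.legend()
fig.savefig("ploss_sketch.png")
```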
Like the points loss plot, this shows the a-priori distribution of winrate loss per move across the different rating categories. The steps are the same. First, get the rating labels and winrate loss for every move in the training dataset using `analyzedataset.py` with the `loss_rating` command (if not already extracted for the previous plot), then pass the file name to the script `wrloss_rating.py`:
python3 python/analyzedataset.py loss_rating csv/games_labels.csv featurecache csv/loss_rating.csv
python3 python/plots/wrloss_rating.py csv/loss_rating.csv
This plot shows the distribution of game lengths for all games that ended in counting.
The game length data must be provided in a plain text file, one length per line. The script `analyzedataset.py` with the `gamelength` command writes out this data from a dataset list CSV. (The dummy argument is ignored, but required.) The script can easily be modified to consider the entire dataset, or just training games. Extract the data, then create the plot with `gamelength.py` as follows:
python3 python/analyzedataset.py gamelength csv/games_7M.csv dummy csv/gamelength.csv
python3 python/plots/gamelength.py csv/gamelength.csv
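A minimal sketch of the histogram step, assuming the one-number-per-line format described above (the sample lengths are made up):

```python
# Read game lengths (one integer per line, as written by the gamelength
# command) and plot their histogram.
import io

import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt

# Stand-in for open("csv/gamelength.csv"); the values are made up.
sample = io.StringIO("211\n189\n240\n265\n198\n")
lengths = [int(line) for line in sample if line.strip()]

fig, ax = plt.subplots()
ax.hist(lengths, bins=30)
ax.set_xlabel("Game length (moves)")
ax.set_ylabel("Games")
fig.savefig("gamelength_sketch.png")
```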
This script creates two plots to show the progression of ratings under Glicko-2 after a certain number of games. It sources its data from the results of the process described in the main README, Section “Glicko-2 Calculation”.
The first plot shows how the rating deviation develops over games. The second plot shows histograms of the rating distribution over games.
python3 python/plots/deviation.py csv/games_glicko.csv
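The idea behind the first plot can be sketched as averaging the rating deviation by games played. The record layout and values below are made-up assumptions; the real data lives in `csv/games_glicko.csv`:

```python
# Average rating deviation (RD) as a function of games completed,
# roughly the shape of the first deviation plot.
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt

# (games played, rating deviation) pairs, made up for illustration
records = [(1, 300.0), (1, 290.0), (5, 180.0), (5, 170.0), (10, 110.0)]

rd_by_games = {}
for games, rd in records:
    rd_by_games.setdefault(games, []).append(rd)

xs = sorted(rd_by_games)
ys = [sum(rd_by_games[g]) / len(rd_by_games[g]) for g in xs]

fig, ax = plt.subplots()
ax.plot(xs, ys, marker="o")
ax.set_xlabel("Games played")
ax.set_ylabel("Mean rating deviation")
fig.savefig("deviation_sketch.png")
```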
This script creates plots which compare the estimates produced by our models with the labels on the games. It sources its data from the results described in the main README. The input CSV file needs the columns `BlackRating`, `PredictedBlackRating`, `WhiteRating`, `PredictedWhiteRating`, `Score` and `PredictedScore`.
The second parameter to the script just gives it a model name to use in the figure title.
python3 python/plots/estimate_vs_label.py --setmarker E csv/games_glicko.csv Glicko-2
With the optional `--scoredist` switch, the script produces a different, two-subfigure plot that shows the distribution of games whose outcome was predicted correctly or incorrectly by the model.
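As a sketch of the basic estimate-vs-label comparison (not the script itself), using the column names listed above with made-up rows:

```python
# Scatter predicted ratings against label ratings, with a dashed
# perfect-prediction diagonal for reference.
import csv
import io

import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt

sample = """BlackRating,PredictedBlackRating,WhiteRating,PredictedWhiteRating,Score,PredictedScore
1500,1550,1700,1640,1,0.62
1900,1870,1850,1905,0,0.48
"""

labels, estimates = [], []
for row in csv.DictReader(io.StringIO(sample)):
    for side in ("Black", "White"):
        labels.append(float(row[f"{side}Rating"]))
        estimates.append(float(row[f"Predicted{side}Rating"]))

fig, ax = plt.subplots()
ax.scatter(labels, estimates)
lo, hi = min(labels), max(labels)
ax.plot([lo, hi], [lo, hi], linestyle="--")  # perfect-prediction diagonal
ax.set_xlabel("Label rating")
ax.set_ylabel("Predicted rating")
ax.set_title("Glicko-2")  # model name from the second script parameter
fig.savefig("estimate_sketch.png")
```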
These are plots of the strength model network's internal values: layer outputs, activations and gradients. They are followed by a plot of the network's output distribution, compared to the training label distribution.
python3 python/plots/netvis.py csv/games_labels.csv featurecache --net nets/model.pth --index 10 --featurename pick
If the `--net` parameter is unspecified, the net will be randomly initialized. The optional `--index` parameter indicates which item from the training set should be passed through the network (always black recent moves). If unspecified, the first item is used by default. The optional `--featurename` parameter indicates which feature type from the dataset should be used (`trunk`, `pick` or `head`).
During training, the progress is recorded as training loss (once every step) and validation loss (once every epoch).
This data is usually (if following the commands as laid out in README.txt) stored under `logs/`, in files with `trainloss` and `validationloss` in their names.
The plot script `trainingprogress.py` reads a `trainloss` and a `validationloss` file and draws them. An optional third parameter determines the prefix in the figure title.
python3 python/plots/trainingprogress.py logs/trainloss.txt logs/validationloss.txt "Experiment Model"
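A minimal sketch of what such a script does, assuming one loss value per line in each file (the values are made up):

```python
# Read one loss value per line from a trainloss and a validationloss
# file and plot both on a shared axis. Training loss is recorded once
# per step, validation loss once per epoch, so the validation points
# are spread out over the steps.
import io

import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt

trainloss = io.StringIO("0.9\n0.7\n0.6\n0.55\n0.52\n0.5\n")
validationloss = io.StringIO("0.8\n0.6\n")

train = [float(line) for line in trainloss if line.strip()]
valid = [float(line) for line in validationloss if line.strip()]

fig, ax = plt.subplots()
ax.plot(range(1, len(train) + 1), train, label="training loss (per step)")
steps_per_epoch = len(train) // len(valid)
ax.plot([steps_per_epoch * (i + 1) for i in range(len(valid))], valid,
        marker="o", label="validation loss (per epoch)")
ax.set_xlabel("Step")
ax.set_ylabel("Loss")
ax.set_title("Experiment Model")  # prefix from the optional third parameter
ax.legend()
fig.savefig("trainingprogress_sketch.png")
```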
The hyperparameter search runs multiple trainings, looking for the best hyperparameter set.
The script `plots/search.py` shows the performance as a series of plots, one for every hyperparameter.
python3 python/plots/search.py --zoom path/to/search/logs
The optional `--zoom` parameter restricts the y-axis to the range 0.578-0.59, showing only the more promising training runs. Especially in iteration 1, some results can be useless outliers that compress the better models at the bottom of the plot.
To visualize the ratings which the strength model assigns to the various trick play failure lines and refutations, the script `trickratings.py` creates a plot of their distribution.
python3 python/plots/trickratings.py csv/trickratings.csv
The input file `trickratings.csv` must either be used as it is in this repository, or manually constructed from the output of `run.py` over extracted trick line features. The process for this is described in the main README.