Merge pull request #110 from kobanium/develop

Develop
kobanium · Dec 15, 2024 · af99376
2 parents a1e2003 + 4540372
commit af99376
Showing 19 changed files with 756 additions and 106 deletions.
diff --git a/.gitignore b/.gitignore
@@ -3,4 +3,5 @@
 **/*.npz
 **/*.bin
 **/*.ckpt
-archive/*
+archive/*
+.coverage
diff --git a/README.md b/README.md
@@ -22,14 +22,18 @@ TamaGo runs on Python 3.6 or higher.
 - [GoGui analyze commands](#gogui-analyze-commands)
 - [Analyze commands](#analyze-commands)
 - [CGOS analyze mode](#cgos-analyze-mode)
+- [GTP extension command](#gtp-extension-command)
+- [Tree visualization](#tree-visualization)
 - [License](#license)
 
 # Requirements
-| Package name | Purpose |
-| --- | --- |
-| click | Implementation of command line options |
-| numpy | Fast calculation |
-| pytorch | Implementation of neural network construction and learning |
+| Package name | Purpose | Note |
+| --- | --- | --- |
+| click | Implementation of command line options | |
+| numpy | Fast calculation | |
+| pytorch | Implementation of neural network construction and learning | |
+| graphviz | Visualization for MCTS tree | Optional |
+| matplotlib | Visualization for MCTS tree | Optional |
 
 # Installation
 You can install TamaGo by executing the following command in a Python-installed computer.
@@ -54,7 +58,8 @@ TamaGo's command line options are as follows,
 | `--use-gpu` | Flag to use a GPU | true or false | true | false | |
 | `--policy-move` | Flag to move according to Policy distribution | true or false | true | false | |
 | `--sequential-halving` | Flag to use SHOT (Sequential Halving applied to trees) for searching | true or false | true | false | It's for debugging |
-| `--visits` | The number of visits per move | Integer number more than 0 | 1000 | 1000 | When you use '--const-time' or '--time' options, this option is ignored. |
+| `--visits` | The number of visits per move | Integer number more than 0 | 1000 | 1000 | When you use '--strict-visits', '--const-time' or '--time' options, this option is ignored. |
+| `--strict-visits` | Same as `--visits`, but never stop early even if the best move is clear | Integer number more than 0 | 1000 | None | When you use '--const-time' or '--time' options, this option is ignored. |
 | `--const-time` | Thinking time per move | Real number more than 0 | 10.0 | None | When you use the '--time' option, this option is ignored. |
 | `--time` | Total remaining time for a game | Real number more than 0 | 600.0 | None |
 | `--batch-size` | Mini-batch size for MCTS | Integer number more than 0 | 13 | NN_BATCH_SIZE | NN_BATCH_SIZE is defined in mcts/constant.py. |
@@ -124,5 +129,15 @@ TamaGo version 0.7.0 supports cgos-analyze, cgos-genmove_analyze commands. When
 
 ![cgos-analyze-pv](img/cgos-analyze-pv.png)
 
+# GTP extension command
+TamaGo supports the `tamago-readsgf` command as its own extension to GTP. It works like the standard GTP command `loadsgf`, but takes a literal SGF string instead of an SGF file path, as in the following example. Unlike `loadsgf`, it does not support `move_number`, and the SGF string must not contain any newlines.
+
+```
+tamago-readsgf (;SZ[9]KM[7];B[fe];W[de])
+```
+
+# Tree visualization
+TamaGo supports visualization of the search tree as of version 0.10.0. For details, see [tree visualization](doc/en/tree_visualization.md).
+
 # License
 You can use TamaGo under [Apache License 2.0](LICENSE).
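
The README text added above requires the SGF argument of `tamago-readsgf` to be a single line. A minimal sketch of preparing such a command on the client side; `to_readsgf_command` is a hypothetical helper and not part of TamaGo, and it assumes the record has no multi-line comment text that would be damaged by joining lines:

```python
def to_readsgf_command(sgf_text: str) -> str:
    """Build a single-line tamago-readsgf command (hypothetical helper)."""
    # Join all lines so the SGF string contains no newlines.
    # Assumption: the SGF holds only simple properties, as in the README example.
    one_line = "".join(sgf_text.splitlines()).strip()
    return f"tamago-readsgf {one_line}"


sgf = "(;SZ[9]KM[7]\n;B[fe]\n;W[de])\n"
print(to_readsgf_command(sgf))
```

With the sample record this prints the same command as the README example.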
diff --git a/animation/animation.py b/animation/animation.py
@@ -0,0 +1,68 @@
+import sys
+import select
+import time
+
+
+def animate_mcts(mcts, board, to_move, pv_wait_sec, move_wait_sec):
+    previous_pv = []
+    def callback(path):
+        _animate_path(path, mcts, board, pv_wait_sec, move_wait_sec, previous_pv)
+        finished = _stdin_has_data()
+        return finished
+    mcts.search_with_callback(board, to_move, callback)
+
+
+def _stdin_has_data():
+    rlist, _, _ = select.select([sys.stdin], [], [], 0)
+    return bool(rlist)
+
+
+def _animate_path(path, mcts, board, pv_wait_sec, move_wait_sec, previous_pv):
+    # Attributes of the sequence searched this time
+    root_index, i = path[0]
+    root = mcts.node[root_index]
+    if root.children_visits[i] == 0:
+        return
+    coordinate = board.coordinate
+    move = coordinate.convert_to_gtp_format(root.action[i])
+    pv = [coordinate.convert_to_gtp_format(mcts.node[index].action[child_index]) for (index, child_index) in path]
+    pv_visits = [str(mcts.node[index].children_visits[child_index]) for (index, child_index) in path]
+    pv_winrate = [str(int(10000 * _get_winrate(mcts, index, child_index, depth))) for depth, (index, child_index) in enumerate(path)]
+
+    # Rework the original lz-analyze output
+    children_status_list = root.get_analysis_status_list(board, mcts.get_pv_lists)
+    fake_status_list = [status.copy() for status in children_status_list]
+    target = next((status for status in fake_status_list if status["move"] == move), None)
+    if target is None:
+        return  # can't happen
+    # Pretend the first move of this sequence is the best move and renumber the order
+    target["order"] = -1
+    fake_status_list.sort(key=lambda status: status["order"])
+    for order, status in enumerate(fake_status_list):
+        status["order"] = order
+
+    # Animate move by move by printing repeatedly while swapping the PV field
+    for k in range(1, len(pv) + 1):
+        # Skip moves shared with the previous sequence
+        if pv[:k] == previous_pv[:k]:
+            continue
+
+        target["pv"] = " ".join(pv[:k])
+        target["pvVisits"] = " ".join(pv_visits[:k])
+        target["pvWinrate"] = " ".join(pv_winrate[:k])
+
+        sys.stdout.write(root.get_analysis_from_status_list("lz", fake_status_list))
+        sys.stdout.flush()
+        time.sleep(max(move_wait_sec, 0.0))
+
+    previous_pv[:] = pv
+    time.sleep(max(pv_wait_sec, 0.0))
+
+
+def _get_winrate(mcts, index, child_index, depth):
+    node = mcts.node[index]
+    i = child_index
+    visits = node.children_visits[i]
+    value = node.children_value_sum[i] / visits if visits > 0 else node.children_value[i]
+    winrate = value if depth % 2 == 0 else 1.0 - value
+    return winrate
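
The prefix-skipping loop in `_animate_path` decides how many frames each search path produces, and `_get_winrate` flips values at odd depth, presumably so that every winrate is reported from the same side. A minimal standalone sketch of both rules; the helper names `new_prefixes` and `winrate_for_root_player` and the sample moves are made up for illustration:

```python
def new_prefixes(pv, previous_pv):
    """Return the PV prefixes that would be printed for this search path."""
    shown = []
    for k in range(1, len(pv) + 1):
        # Same rule as the diff: skip prefixes already shown for the previous path.
        if pv[:k] == previous_pv[:k]:
            continue
        shown.append(" ".join(pv[:k]))
    return shown


def winrate_for_root_player(value, depth):
    """Flip the value at odd depth, mirroring the expression in _get_winrate."""
    return value if depth % 2 == 0 else 1.0 - value


previous = ["D4", "Q16", "C3"]
current = ["D4", "Q16", "R4", "E5"]
print(new_prefixes(current, previous))
print(winrate_for_root_player(0.25, 1))
```

Only the prefixes that diverge from the previous path are emitted, so a path that shares its first two moves with the last one produces two frames here instead of four.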
diff --git a/board/go_board.py b/board/go_board.py
@@ -411,6 +411,11 @@ def get_all_legal_pos(self, color: Stone) -> List[int]:
     def display(self, sym: int=0) -> NoReturn:
         """Display the board.
         """
+        print_err(self.get_board_string(sym=sym))
+
+    def get_board_string(self, sym: int=0) -> str:
+        """Return a string representing the board.
+        """
         board_string = f"Move : {self.moves}\n"
         board_string += f"Prisoner(Black) : {self.prisoner[0]}\n"
         board_string += f"Prisoner(White) : {self.prisoner[1]}\n"
@@ -432,7 +437,7 @@ def display(self, sym: int=0) -> NoReturn:
 
         board_string += "  +" + "-" * (self.board_size * 2 + 1) + "+\n"
 
-        print_err(board_string)
+        return board_string
 
 
     def display_self_atari(self, color: Stone) -> NoReturn:
@@ -546,6 +551,13 @@ def get_handicap_history(self) -> List[int]:
         """
         return self.record.handicap_pos[:]
 
+    def set_history(self, move_history, handicap_history):
+        self.clear()
+        for handicap in handicap_history:
+            self.board.put_handicap_stone(handicap, Stone.BLACK)
+        for (color, pos, _) in move_history:
+            self.put_stone(pos, color)
+
     def count_score(self) -> int: # pylint: disable=R0912
         """Count territory in a simplified way.
 

diff --git a/board/pattern.py b/board/pattern.py
@@ -18,6 +18,17 @@
     [0xfffc, 0x00000001, 0x00000002],
 ], dtype=np.uint32)
 
+nb4_empty = [0] * 65536
+for i, _ in enumerate(nb4_empty):
+    if ((i >> 2) & 0x3) == 0:
+        nb4_empty[i] += 1
+    if ((i >> 6) & 0x3) == 0:
+        nb4_empty[i] += 1
+    if ((i >> 8) & 0x3) == 0:
+        nb4_empty[i] += 1
+    if ((i >> 12) & 0x3) == 0:
+        nb4_empty[i] += 1
+
 
 class Pattern:
     """Stone pattern class.
@@ -38,17 +49,6 @@ def __init__(self, board_size: int, pos_func: Callable[[int], int]):
             -1, 1, board_size_with_ob - 1, board_size_with_ob, board_size_with_ob + 1
         ]
 
-        self.nb4_empty = [0] * 65536
-        for i, _ in enumerate(self.nb4_empty):
-            if ((i >> 2) & 0x3) == 0:
-                self.nb4_empty[i] += 1
-            if ((i >> 6) & 0x3) == 0:
-                self.nb4_empty[i] += 1
-            if ((i >> 8) & 0x3) == 0:
-                self.nb4_empty[i] += 1
-            if ((i >> 12) & 0x3) == 0:
-                self.nb4_empty[i] += 1
-
         # Eye patterns
         eye_pat3 = [
             # +OO     XOO     +O+     XO+
@@ -148,7 +148,7 @@ def get_n_neighbors_empty(self, pos: int) -> int:
         Returns:
             int: Number of empty points among the four orthogonal neighbors (max 4)
         """
-        return self.nb4_empty[self.pat3[pos]]
+        return nb4_empty[self.pat3[pos]]
 
     def get_eye_color(self, pos: int) -> Stone:
         """Get the eye color at the given coordinate.

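The `board/pattern.py` change moves the `nb4_empty` lookup table from `Pattern.__init__` to module level, so it is built once at import time instead of once per `Pattern` instance. A minimal sketch of the same table, assuming (as the shifts in the diff suggest) that each neighbor occupies 2 bits of a 16-bit pattern and that a field value of 0 means "empty"; the names `NB4_SHIFTS` and `build_nb4_empty` are illustrative only:

```python
# Bit offsets of the four orthogonal neighbors, taken from the shifts in the diff.
# Which offset corresponds to which direction is not stated in the source.
NB4_SHIFTS = (2, 6, 8, 12)


def build_nb4_empty():
    """Count, for every 16-bit pattern, how many orthogonal fields are 0."""
    table = [0] * 65536
    for pat in range(65536):
        table[pat] = sum(1 for shift in NB4_SHIFTS if ((pat >> shift) & 0x3) == 0)
    return table


nb4_empty = build_nb4_empty()
print(nb4_empty[0x0000])  # every 2-bit field is 0
print(nb4_empty[0xFFFF])  # every 2-bit field is non-zero
print(nb4_empty[0x0004])  # only the field at bit offset 2 is non-zero
```

A lookup such as `nb4_empty[pattern]` then replaces four shift-and-mask tests per query, which is the point of precomputing the table.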
diff --git a/doc/en/tree_visualization.md b/doc/en/tree_visualization.md
@@ -0,0 +1,29 @@
+# About tree visualization
+TamaGo supports visualization of a search tree.
+
+## Example
+```
+(echo 'tamago-readsgf (;SZ[9]KM[7];B[fe];W[de];B[ec])';
+ echo 'lz-genmove_analyze 7777777';
+ echo 'undo';
+ echo 'tamago-dump_tree') \
+| python3 main.py --model model/model.bin --strict-visits 100 \
+| grep dump_version | gzip > tree.json.gz
+python3 graph/plot_tree.py tree.json.gz tree_graph
+display tree_graph.svg
+```
+
+![Result of search tree visualization](../../img/tree_graph.png)
+
+## Command line arguments for graph/plot_tree.py
+
+| Argument | Description | Value | Example of value | Note |
+|---|---|---|---|---|
+| INPUT_JSON_PATH | Path to the .json file that holds the output of the tamago-dump_tree command. | tree.json.gz | | |
+| OUTPUT_IMAGE_PATH | Path to the image file that receives the visualization result. | tree_graph | | The extension (.svg) is appended automatically. |
+
+## Option for graph/plot_tree.py
+
+| Option | Description | Value | Example of value | Default value | Note |
+|---|---|---|---|---|---|
+| `--around-pv` | Flag to show only the nodes around the principal variation. | true or false | true | false | |
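
The example pipeline above stores the dump as a gzip-compressed file, which can also be inspected directly. A minimal sketch, assuming the file holds exactly one JSON document; the dump's schema is not described here, so the stand-in data below contains only a placeholder `dump_version` key, and `load_tree_dump` is a hypothetical helper rather than part of `graph/plot_tree.py`:

```python
import gzip
import json
import os
import tempfile


def load_tree_dump(path):
    """Read a gzip-compressed JSON file and return the parsed object."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)


# Stand-in data: the value 1 is a placeholder, not a real dump.
sample = {"dump_version": 1}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tree.json.gz")
    with gzip.open(path, "wt", encoding="utf-8") as f:
        json.dump(sample, f)
    tree = load_tree_dump(path)

print(sorted(tree.keys()))
```

If the captured line carries anything besides the JSON document (for example a GTP response prefix), it would have to be stripped before parsing.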
diff --git a/doc/ja/README.md b/doc/ja/README.md
@@ -2,13 +2,15 @@
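 TamaGo is a Go engine implemented in Python.  
 It lets you try supervised learning from SGF game records and Gumbel AlphaZero-style reinforcement learning.  
 It can generate moves by Monte Carlo tree search using a trained neural network model.  
-Operation has been verified with Python 3.6.
+Operation has been verified with Python 3.8.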
 
 * [Required packages](#requirements)
 * [Setup procedure](#installation)
 * [How to use as a Go engine](#how-to-execute-gtp-engine)
 * [Running supervised learning](#how-to-execute-supervised-learning)
 * [Running reinforcement learning](#how-to-execute-reinforcement-learning)
+* [GTP](#gtp-extension-command)
+* [Tree visualization](#tree-visualization)
 * [License](#license)
 
 # Requirements
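@@ -17,6 +19,8 @@ Operation has been verified with Python 3.6.
 |click|Implementation of command line arguments|
 |numpy|Miscellaneous calculations|
 |pytorch|Implementation of neural network construction and training|
+|graphviz|Visualization of the search tree|
+|matplotlib|Visualization of the search tree|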
 
 # Installation
 In an environment where Python 3.6 is available, install the required packages with the following command.
@@ -40,7 +44,8 @@ python main.py
 | `--use-gpu` | Flag to use a GPU | true or false | true | false | |
 | `--policy-move` | Flag to move according to the Policy distribution | true or false | true | false | Use this to check the strength of the Policy alone. |
 | `--sequential-halving` | Flag to search with Sequential Halving applied to trees | true or false | true | false | This search is used for self-play, so it is basically for debugging. |
-| `--visits` | Number of visits per move | Integer of 1 or more | 1000 | 1000 | This option is ignored when the --const-time or --time option is specified. |
+| `--visits` | Number of visits per move | Integer of 1 or more | 1000 | 1000 | This option is ignored when the --strict-visits, --const-time, or --time option is specified. |
+| `--strict-visits` | Same as --visits, but the search is not cut short even if the best move is settled midway | Integer of 1 or more | 1000 | None | This option is ignored when the --const-time or --time option is specified. |
 | `--const-time` | Search time per move (seconds) | Real number greater than 0 | 10.0 |  | This option is ignored when the --time option is specified. |
 | `--time` | Total time allowance (seconds) | Real number greater than 0 | 600.0 | |
 | `--batch-size` | Mini-batch size of the neural network during search | Integer of 1 or more | 13 | NN_BATCH_SIZE | NN_BATCH_SIZE is defined in mcts/constant.py. |
@@ -112,6 +117,16 @@ TamaGo has supported cgos-analyze, cgos-genmove_analyze since version 0.7.0
 
 ![Principal variation display on CGOS](../../img/cgos-analyze-pv.png)
 
+# GTP extension command
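+TamaGo supports the tamago-readsgf command as its own extension to GTP. It is similar to the standard GTP loadsgf command, but takes the SGF string itself as its argument instead of the path to an SGF file, and is used as in the following example. Unlike loadsgf, `move_number` cannot be specified. Also, the SGF string must not contain newlines.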
+
+```
+tamago-readsgf (;SZ[9]KM[7];B[fe];W[de])
+```
+
+# Tree visualization
+TamaGo has supported search tree visualization since version 0.10.0. For details, see [here](tree_visualization.md).
+
 # License
 The license is Apache License ver 2.0.
 

diff --git a/doc/ja/tree_visualization.md b/doc/ja/tree_visualization.md
@@ -0,0 +1,29 @@
+# About search tree visualization
+TamaGo supports visualization of the search tree.
+
+## Example of running the visualization
+```
+(echo 'tamago-readsgf (;SZ[9]KM[7];B[fe];W[de];B[ec])';
+ echo 'lz-genmove_analyze 7777777';
+ echo 'undo';
+ echo 'tamago-dump_tree') \
+| python3 main.py --model model/model.bin --strict-visits 100 \
+| grep dump_version | gzip > tree.json.gz
+python3 graph/plot_tree.py tree.json.gz tree_graph
+display tree_graph.svg
+```
+
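+![Result of search tree visualization](../../img/tree_graph.png)
+
+## Command line arguments for graph/plot_tree.py
+
+| Argument | Description | Value | Example of value | Note |
+|---|---|---|---|---|
+| INPUT_JSON_PATH | Path to the JSON file that holds the output of the tamago-dump_tree command | tree.json.gz | | |
+| OUTPUT_IMAGE_PATH | Path to the image file that holds the visualization result | tree_graph | | The extension (.svg) is appended automatically |
+
+## Options for graph/plot_tree.py
+
+| Option | Description | Value | Example of value | Default value | Note |
+|---|---|---|---|---|---|
+| `--around-pv` | Flag to show only the area around the principal variation | true or false | true | false | |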