I noticed that the policy is set to noise in "expand_node", but "update_policy", which is called during inference (in "process_mini_batch"), overwrites the policy directly with the network output. As a result, there is no randomness at all outside of self-play games.
The noise is not actually set in expand_node(). What is set there is a tentative policy, which is replaced by the NN policy in process_mini_batch(). So you are right: the MCTS process is not random.
CGLemon is right. There is little randomness when TamaGo runs as a normal MCTS player. If you want to add randomness to TamaGo, I can modify it so that it can run like AlphaZero (Dirichlet noise and move generation from the distribution of visit counts).
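For reference, the two AlphaZero-style sources of randomness mentioned above can be sketched roughly as follows. This is a minimal, standalone illustration and not TamaGo's actual code: the function names `add_dirichlet_noise` and `sample_move_from_visits` are hypothetical, and the defaults `alpha=0.03` and `epsilon=0.25` are the commonly cited AlphaZero settings for Go, assumed here only for illustration.

```python
import random


def add_dirichlet_noise(priors, alpha=0.03, epsilon=0.25, rng=random):
    """Mix root priors with Dirichlet noise: (1 - epsilon) * p + epsilon * eta.

    Hypothetical helper for illustration; not part of TamaGo.
    """
    # A Dirichlet(alpha, ..., alpha) sample is a vector of independent
    # Gamma(alpha, 1) draws divided by their sum.
    gammas = [rng.gammavariate(alpha, 1.0) for _ in priors]
    total = sum(gammas)
    if total <= 0.0:
        # Extremely unlikely numerical underflow: leave the priors unchanged.
        return list(priors)
    noise = [g / total for g in gammas]
    return [(1.0 - epsilon) * p + epsilon * n for p, n in zip(priors, noise)]


def sample_move_from_visits(visits, temperature=1.0, rng=random):
    """Pick a child index with probability proportional to visits ** (1 / temperature).

    Hypothetical helper for illustration; not part of TamaGo.
    """
    weights = [v ** (1.0 / temperature) for v in visits]
    if sum(weights) <= 0.0:
        raise ValueError("at least one child must have a positive visit count")
    return rng.choices(range(len(visits)), weights=weights, k=1)[0]


if __name__ == "__main__":
    root_priors = [0.5, 0.3, 0.2]       # network policy at the root
    print(add_dirichlet_noise(root_priors))
    root_visits = [120, 60, 20]         # visit counts after the search
    print(sample_move_from_visits(root_visits))
```

In this sketch the noise is mixed in only after the network policy is available, so it would not be overwritten the way the tentative policy from expand_node() is. A lower `temperature` makes the move choice closer to always picking the most-visited child.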