Disable tree reuse in training. #536
Conversation
It was spinning on resign: if the 450th ply was a win/loss/draw it wasn't recognized, and it was trying to play a move from that position.
Doesn't seem to have a significant performance win, and has detrimental effects on the effectiveness of noise.
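For context, the noise in question is the AlphaZero-style Dirichlet noise mixed into the root priors at the start of each search. With tree reuse, the subtree under the played move carries over visits accumulated under the previous root's noise, so the fresh noise applied at the new root has to fight an already-populated tree. A minimal sketch of the standard root-noise mixing step (function and parameter names are illustrative, not taken from this repository):

```cpp
#include <random>
#include <vector>

// Mix Dirichlet noise into the root policy priors, AlphaZero-style.
// epsilon and alpha follow commonly used values; this is a sketch,
// not this repository's implementation.
std::vector<float> apply_root_noise(const std::vector<float>& priors,
                                    float epsilon = 0.25f,
                                    float alpha = 0.3f) {
    static std::mt19937 rng{std::random_device{}()};
    std::gamma_distribution<float> gamma(alpha, 1.0f);

    // Sample a Dirichlet vector by normalizing independent Gamma draws.
    std::vector<float> noise(priors.size());
    float sum = 0.0f;
    for (auto& n : noise) { n = gamma(rng); sum += n; }
    for (auto& n : noise) n /= sum;

    // Blend: P'(a) = (1 - eps) * P(a) + eps * Dir(alpha)
    std::vector<float> mixed(priors.size());
    for (std::size_t i = 0; i < priors.size(); ++i)
        mixed[i] = (1.0f - epsilon) * priors[i] + epsilon * noise[i];
    return mixed;
}
```

When the tree is thrown away each move, this blend is the only prior the new search sees, so the noise fully shapes early exploration at every root.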
Note that this PR is diff-based against PR #526, since it's easier to make this change after that is fixed.
I'm in favor of this, especially since it's practically free performance-wise. I'd like to see some opinions from @glinscott @Error323.
Some numbers to aid consideration.
src/UCI.cpp (outdated)

```cpp
for (int game_ply = 0; game_ply < 450; ++game_ply) {
    auto search = std::make_unique<UCTSearch>(bh.shallow_clone());
```
How about a comment saying we are doing this on purpose to avoid tree reuse? Otherwise it might look like a bug.
Done.
> Doesn't seem to have a significant performance win, and has detrimental effects on the effectiveness of noise.
I've started gathering some data. "Moves" is actually ply. baseline and baseline2 are independent searches with full tree reuse; the numbers are the policy visit delta relative to a main full-tree-reuse search that is in charge of choosing the moves under standard training conditions. I put in two baselines to get a feel for what kind of error bar there is in the data.

So disabling tree reuse gives ~10% more noise effect overall, and dual_tree is similar. I had another run about this deep which I accidentally closed; in that one, disabling tree reuse was a bit higher, closer to 15%, and dual_tree was a bit closer to the middle. I'll leave it running overnight to see if it moves a bit more.

10-15% may not sound large, but I'll call out the specific lines I've selected above. They are taken from late game, where obviously there are some forced moves. Noise has zero effect on a forced move, but in the baseline it also has zero effect on the next move, whereas dual_tree and no tree reuse both get good noise effects there. So while it may not be a lot more noise overall, it is a huge increase in noise at specific moves, which I think are the moves most at risk of policy overfit.
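The "policy visit delta" described here can be read as the total absolute difference between root visit counts of two searches over the same positions. A rough sketch of that measurement, assuming a simple move-to-visits map (the names and types are hypothetical, not this codebase's):

```cpp
#include <cstdlib>
#include <map>
#include <string>

// Total absolute difference in root visit counts between two searches
// of the same position -- a proxy for how much noise (or a tree-reuse
// policy change) shifted the search. Map keying by UCI move strings is
// illustrative only.
long visit_delta(const std::map<std::string, int>& a,
                 const std::map<std::string, int>& b) {
    long delta = 0;
    for (const auto& [move, visits] : a) {
        auto it = b.find(move);
        int other = (it == b.end()) ? 0 : it->second;
        delta += std::labs(visits - other);
    }
    // Count moves searched only by b.
    for (const auto& [move, visits] : b)
        if (a.find(move) == a.end()) delta += visits;
    return delta;
}
```

Summing this per-position delta over a run gives totals like the ones reported below; a larger total means the scenario's visit distribution diverged more from the reference search.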
Overnight numbers: not really much change. ~10% more for dual_tree, and ~12% for no tree reuse.

Delta moves: 36909
- baseline: 5351446
- baseline2: 5328474
- dual_tree: 5854904
- nohistory: 6069182
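As a quick arithmetic check of the totals quoted above (the quoted percentages may be computed differently, e.g. per-move rather than from the raw totals):

```cpp
// Overnight visit-delta totals quoted in the comment above.
const double kBaseline  = 5351446.0;
const double kBaseline2 = 5328474.0;
const double kDualTree  = 5854904.0;
const double kNoHistory = 6069182.0;

// Relative increase of a scenario's total delta over a baseline's.
double rel_increase(double scenario, double baseline) {
    return scenario / baseline - 1.0;
}

// dual_tree  vs. baseline: ~9.4%  (vs. baseline2: ~9.9%)
// nohistory  vs. baseline: ~13.4% (vs. baseline2: ~13.9%)
```

The two baselines differing by under half a percent of each other gives a feel for the error bar on these figures.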
I realized there was an issue with my methodology. Rather than comparing against the move chooser, I added a new search tree with no noise to compare against; I think it gives a more realistic value magnitude than calculating the diff against something that has noise in it. I also included the move_chooser next to baseline, to see how much effect being the move chooser has on the result. Early numbers: results are pretty similar, but the relative magnitudes of the new scenarios are a bit higher than before.
One last result set (I'm going to switch to testing something else now): ~16% for no tree reuse and ~12% for dual_tree.
This is an alternative to #528, since I didn't find any significant performance difference and this is even more clearly a win for noise effectiveness.