Dual NNUE with L1-128 smallnet #4915
Conversation
Force-pushed from e1044c1 to 54574d3.
@mstembera I added you as a co-author to this PR and fixed the sanitizer errors.
So first of all, nice that we have a working example of this idea.
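For readers coming to this later: the idea, roughly, is to gate a cheap material-based `simple_eval` against a threshold and run the small L1-128 net only in lopsided positions. A minimal sketch, not the merged code — the entry points and the cutoff value here are illustrative (the real constant was tuned on fishtest):

```cpp
// Sketch of the dual-NNUE dispatch. `simple_eval`, the NNUE entry points,
// and the 1000cp cutoff are stand-ins, not the PR's exact tuned code.
Value evaluate(const Position& pos) {
    // Cheap material-based estimate from the side to move's perspective.
    Value simpleEval = simple_eval(pos, pos.side_to_move());

    // Clearly material-imbalanced positions go to the small net,
    // balanced ones to the big net.
    bool smallNet = std::abs(simpleEval) > 1000;  // illustrative threshold

    return smallNet ? NNUE::evaluate<NNUE::Small>(pos)
                    : NNUE::evaluate<NNUE::Big>(pos);
}
```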
Cleaned up Makefile duplication with: #4910 (comment). cc @mstembera regarding the code style comments.
Congrats, nice achievement!!
Me neither. Is there anything wrong with the generic …?
@linrock Congrats again and thanks for all your work!
IMO it's a simplification and more cohesive to move simple_eval into the Position class, given it only uses Position state.
Updated so there's only a single …
I agree the code could be cleaned up and made easier to read, though it seems more time and discussion is needed to flesh that out, particularly if more testing is needed due to the risk of a slowdown from a cleaner solution. I think of this initial PR as a proof of concept that dual NNUE has potential, with more cleanups and improvements to come. Another LTC passed today, tested vs. this initial PR, showing there's room for more Elo here as well.
I'd like to avoid merging a half-baked solution. I think fishtest is also missing some changes for this, as seen by the fact that noob's workers are having problems. I'm still waiting on some feedback from @ppigazzini.
that is always easier said than done :D
We have an Eval namespace, and I would find it weird to split different kinds of evals between Position and that namespace. So I'd prefer to leave simple_eval in the Eval namespace.
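For reference, `simple_eval` is tiny either way; roughly this shape (a sketch matching the PR's description — exact constants and signature may differ):

```cpp
// Material-only estimate from color c's perspective: pawn-count
// difference plus non-pawn material difference. Cheap enough to run
// before any NNUE inference to pick which net to use.
int simple_eval(const Position& pos, Color c) {
    return PawnValue * (pos.count<PAWN>(c) - pos.count<PAWN>(~c))
         + (pos.non_pawn_material(c) - pos.non_pawn_material(~c));
}
```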
@linrock I suppose you have a repo/branch somewhere for this? Can you link it as well? In the meantime I could merge this into a separate branch, which other people could then contribute to and improve (with fishtest-backed tests), although they could already make PRs to your branch. I'd also like to hear some feedback from @Sopel97.
The binary seems not to declare the small net name:
This is the output of @mstembera's PR:
We removed that; when I talked with vondele about this, he preferred not to expose it as a second UCI option.
TODO fill this out

Passed STC vs. 36db936:
https://tests.stockfishchess.org/tests/view/6576b3484d789acf40aabbfe
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 21920 W: 5680 L: 5381 D: 10859
Ptnml(0-2): 82, 2488, 5520, 2789, 81

Passed LTC vs. DualNNUE official-stockfish#4915:
https://tests.stockfishchess.org/tests/view/65775c034d789acf40aac7e3
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 147606 W: 36619 L: 36063 D: 74924
Ptnml(0-2): 98, 16591, 39891, 17103, 120

bench 1366244
OK, thanks. I was using @mstembera's PR to change fishtest; reverting.
src/evaluate.cpp (outdated)
for (bool small : {false, true})
{
    std::string eval_file =
        std::string(small ? EvalFileDefaultNameSmall : Options[EvFiles[small]]);
Why treat small nets differently here?
This was a hack to disable the additional `EvalFileSmall` UCI option and hardcode the default smallnet name, while leaving the default `EvalFile` option available for configuring the big net.
Maybe it's worth considering removing the default net from the options as well. I don't think any users can produce nets anyway, and anyone who does usually just changes the name in evaluate.h. Treating the nets consistently would be nice and would avoid this. @vondele @Disservin, what do you think?
The `EvalFile` option is currently necessary during training for local elo measurements, as it's used to load nets converted from training checkpoints without having to recompile a new binary for each one:
https://github.com/official-stockfish/nnue-pytorch/blob/master/run_games.py#L126-L129
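That is, the training scripts point a single binary at successive checkpoint conversions over UCI, along the lines of (path and filename illustrative):

```
setoption name EvalFile value /path/to/nn-epoch399.nnue
```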
@linrock Got it. Is it then not a problem for you to not have the small file exposed for the same reason?
It's not a problem as long as there's at least one option to change a net. For small-net test branches, I modified `EvalFile` to instead change the small net.
src/nnue/nnue_feature_transformer.h (outdated)
class FeatureTransformer {
   private:
    static constexpr bool Small = TransformedFeatureDimensions == TransformedFeatureDimensionsSmall;
This is really backwards logic.
I see your point. I thought the FeatureTransformers and Networks instantiated at the top of evaluate_nnue.cpp (lines 43-48) should match so it's clear that they are both created with matching dimensions. No problem if you change it.
st->accumulator.computed[BLACK] = false;
auto& dp = st->dirtyPiece;
dp.dirty_num = 1;
st->accumulatorBig.computed[WHITE] = st->accumulatorBig.computed[BLACK] =
This should loop over the list of all network types to prevent future bugs.
Not sure what you mean.
// somewhere in nnue headers
constexpr auto NETWORK_TYPES[] = { Big, Small };
...
for (auto tp : NETWORK_TYPES)
    for (auto color : { WHITE, BLACK })
        st->computed(color, tp) = false;
I see. Feels a bit over-engineered, but I don't really object. I guess we would also have to add a `StateInfo::computed(Color, NetSize)` method. I was actually able to remove all the other StateInfo methods for accessing the accumulators as a result of your other suggestion.
I guess it's fine; other places are also not completely generic.
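A sketch of what the suggested accessor could look like — type and member names are assumed here, not taken from the PR:

```cpp
// Hypothetical accessor so resets can loop generically over net types.
enum NetSize { Big, Small };
constexpr NetSize NETWORK_TYPES[] = {Big, Small};

struct StateInfo {
    // one accumulator per net type (member names assumed)
    Accumulator<TransformedFeatureDimensionsBig>   accumulatorBig;
    Accumulator<TransformedFeatureDimensionsSmall> accumulatorSmall;

    bool& computed(Color c, NetSize n) {
        return n == Big ? accumulatorBig.computed[c]
                        : accumulatorSmall.computed[c];
    }
};

// Every reset site then becomes a single generic loop:
for (NetSize tp : NETWORK_TYPES)
    for (Color c : {WHITE, BLACK})
        st->computed(c, tp) = false;
```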
st->accumulator.computed[BLACK] = false;
st->dirtyPiece.dirty_num = 0;
st->dirtyPiece.piece[0] = NO_PIECE;  // Avoid checks in UpdateAccumulator()
st->accumulatorBig.computed[WHITE] = st->accumulatorBig.computed[BLACK] =
This should loop over the list of all network types to prevent future bugs.
Ditto
Why is the bench of the PR different from the bench of the version that passed the tests?
Sometimes tests pass on an older master and get rebased during the PR's lifetime.
bench: 1485861
@mstembera where are you making the changes? I think it would be easiest if you create a PR to linrock's branch and he merges that in, instead of updating two PRs simultaneously.
@Sopel97 Thanks for the review!
@Disservin You are probably right, but I don't know git well enough to make a PR to someone else's branch. Maybe @linrock can just grab the changes from me like I grabbed those from here, without git?
When creating a PR to Stockfish I guess you already do it through this interface? There you can just click on the base repository dropdown and select linrock's fork, then change the base branch to match the branch of this PR.
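For completeness, the command-line route is short as well — remote and branch names below are assumed, not taken from this PR:

```bash
# add linrock's fork as a remote and fetch the PR branch
git remote add linrock https://github.com/linrock/Stockfish.git
git fetch linrock
# start a branch on top of the PR branch (name assumed), commit fixes
git checkout -b smallnet-fixes linrock/dual-nnue
git commit -am "review fixes"
# push to your own fork, then open a PR on GitHub with base
# repository linrock/Stockfish and the PR branch as base
git push origin smallnet-fixes
```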
Created by training an L1-128 net from scratch with a wider range of evals in the training data and wld-fen-skipping disabled during training. The differences in this training data compared to the first dual nnue PR are:

- removal of all positions with 3 pieces
- when piece count >= 16, keep positions with simple eval above 750
- when piece count < 16, remove positions with simple eval above 3000

The asymmetric data filtering was meant to flatten the training data piece count distribution, which was previously heavily skewed towards positions with low piece counts. Additionally, the simple eval range where the smallnet is used was widened to cover more positions previously evaluated by the big net and simple eval.

```yaml
experiment-name: 128--S1-hse-S7-v4-S3-v1-no-wld-skip
training-dataset:
  - /data/hse/S3/leela96-filt-v2.min.high-simple-eval-1k.binpack
  - /data/hse/S3/dfrc99-16tb7p-eval-filt-v2.min.high-simple-eval-1k.binpack
  - /data/hse/S3/test80-apr2022-16tb7p.min.high-simple-eval-1k.binpack
  - /data/hse/S7/test60-2020-2tb7p.v6-3072.high-simple-eval-v4.binpack
  - /data/hse/S7/test60-novdec2021-12tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
  - /data/hse/S7/test77-nov2021-2tb7p.v6-3072.min.high-simple-eval-v4.binpack
  - /data/hse/S7/test77-dec2021-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
  - /data/hse/S7/test77-jan2022-2tb7p.high-simple-eval-v4.binpack
  - /data/hse/S7/test78-jantomay2022-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
  - /data/hse/S7/test78-juntosep2022-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
  - /data/hse/S7/test79-apr2022-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
  - /data/hse/S7/test79-may2022-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-may2022-16tb7p.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-jun2022-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-jul2022-16tb7p.v6-dd.min.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-aug2022-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-sep2022-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-oct2022-16tb7p.v6-dd.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-nov2022-16tb7p-v6-dd.min.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-jan2023-3of3-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-feb2023-16tb7p-filter-v6-dd.min-mar2023.unmin.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-mar2023-2tb7p.v6-sk16.min.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-apr2023-2tb7p-filter-v6-sk16.min.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-may2023-2tb7p.v6.min.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-jun2023-2tb7p.v6-3072.min.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-jul2023-2tb7p.v6-3072.min.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-aug2023-2tb7p.v6.min.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-sep2023-2tb7p.high-simple-eval-v4.binpack
  - /data/hse/S7/test80-oct2023-2tb7p.high-simple-eval-v4.binpack
wld-fen-skipping: False
start-from-engine-test-net: False
nnue-pytorch-branch: linrock/nnue-pytorch/L1-128
engine-test-branch: linrock/Stockfish/L1-128-nolazy
engine-base-branch: linrock/Stockfish/L1-128
num-epochs: 500
start-lambda: 1.0
end-lambda: 1.0
```

Experiment yaml configs converted to easy_train.sh commands with:
https://github.com/linrock/nnue-tools/blob/4339954/yaml_easy_train.py

Binpacks interleaved at training time with:
official-stockfish/nnue-pytorch#259

FT weights permuted with 10k positions from fishpack32.binpack with:
official-stockfish/nnue-pytorch#254

Data filtered for high simple eval positions (v4) with:
https://github.com/linrock/Stockfish/blob/b9c8440/src/tools/transform.cpp#L640-L675

Training data can be found at:
https://robotmoon.com/nnue-training-data/

Local elo at 25k nodes per move of L1-128 smallnet (nnue-only eval) vs. L1-128 trained on standard S1 data:
nn-epoch319.nnue : -241.7 +/- 3.2

Passed STC vs. 36db936:
https://tests.stockfishchess.org/tests/view/6576b3484d789acf40aabbfe
LLR: 2.94 (-2.94,2.94) <0.00,2.00>
Total: 21920 W: 5680 L: 5381 D: 10859
Ptnml(0-2): 82, 2488, 5520, 2789, 81

Passed LTC vs. DualNNUE official-stockfish#4915:
https://tests.stockfishchess.org/tests/view/65775c034d789acf40aac7e3
LLR: 2.95 (-2.94,2.94) <0.50,2.50>
Total: 147606 W: 36619 L: 36063 D: 74924
Ptnml(0-2): 98, 16591, 39891, 17103, 120

closes official-stockfish#4919

Bench: 1438336
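The three filtering rules listed in the commit message above amount to roughly the following — a sketch of the linked transform.cpp logic, not the exact code:

```cpp
// v4 high-simple-eval filter as described: drop all 3-piece positions,
// keep only high simple evals at high piece counts, and drop extreme
// simple evals at low piece counts.
bool keep_position(int pieceCount, int absSimpleEval) {
    if (pieceCount == 3)
        return false;
    if (pieceCount >= 16)
        return absSimpleEval > 750;
    return absSimpleEval <= 3000;
}
```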
This cleans up the EvalFile handling after the merge of official-stockfish#4915, which had become a bit confusing as to what it was actually doing.

closes official-stockfish#4971

No functional change
BTW, master is quite a bit faster after the merge of dual NNUE.
Stockfish will use more than one net with official-stockfish/Stockfish#4915.

Get all the pairs of "option" and "net name" from the output of the `stockfish uci` command.

Raise worker version to 227 (also server side).
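For reference, the worker scrapes option/net-name pairs from engine output lines of this shape (the net names below are placeholders, not real defaults):

```
option name EvalFile type string default nn-xxxxxxxxxxxx.nnue
option name EvalFileSmall type string default nn-yyyyyyyyyyyy.nnue
```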
Required by the new SF architecture with Big + Small nets: official-stockfish/Stockfish#4915, official-stockfish/Stockfish#5068
Required by the new SF architecture with multiple nets, see: official-stockfish/Stockfish#4915 official-stockfish/Stockfish#5068
Required by the new SF architecture with multiple nets, see:
official-stockfish/Stockfish#4915
official-stockfish/Stockfish#5068

Also:
- fix the increment of the net download counter
- improve the net code to deal with a development server built from scratch
Required by the new SF architecture with multiple nets, see:
official-stockfish/Stockfish#4915
official-stockfish/Stockfish#5068

Also:
- increment the net download counter only for an API request
- improve the net code to deal with a development server built from scratch
Continuing from PR official-stockfish#4968, this update improves how Stockfish handles network usage, making it easier to manage and modify networks in the future. With the introduction of a dedicated Network class, creating networks has become straightforward. See uci.cpp:

```cpp
NN::NetworkBig({EvalFileDefaultNameBig, "None", ""}, NN::embeddedNNUEBig)
```

The new `Network` encapsulates all network-related logic, significantly reducing the complexity previously required to support multiple network types, such as the distinction between small and big networks (official-stockfish#4915).

_This PR builds on top of official-stockfish#5098 to avoid merge conflicts._

Non-Regression STC: https://tests.stockfishchess.org/tests/view/65edd26c0ec64f0526c43584
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 33760 W: 8887 L: 8661 D: 16212
Ptnml(0-2): 143, 3795, 8808, 3961, 173

Non-Regression SMP STC: https://tests.stockfishchess.org/tests/view/65ed71970ec64f0526c42fdd
LLR: 2.96 (-2.94,2.94) <-1.75,0.25>
Total: 59088 W: 15121 L: 14931 D: 29036
Ptnml(0-2): 110, 6640, 15829, 6880, 85

Compiled with `make -j profile-build`:

```bash
./bench_parallel.sh ./stockfish ./stockfish-nnue 13 50

sf_base = 1568540 +/- 7637 (95%)
sf_test = 1573129 +/- 7301 (95%)
diff    =    4589 +/- 8720 (95%)
speedup = 0.29260% +/- 0.556% (95%)
```

Compiled with `make -j build`:

```bash
./bench_parallel.sh ./stockfish ./stockfish-nnue 13 50

sf_base = 1472653 +/- 7293 (95%)
sf_test = 1491928 +/- 7661 (95%)
diff    =   19275 +/- 7154 (95%)
speedup = 1.30886% +/- 0.486% (95%)
```

closes official-stockfish#5100

No functional change
Credit goes to @mstembera for:
The L1-128 smallnet is:
Experiment yaml configs converted to easy_train.sh commands with:
https://github.com/linrock/nnue-tools/blob/4339954/yaml_easy_train.py
Binpacks interleaved at training time with:
official-stockfish/nnue-pytorch#259
Data filtered for high simple eval positions with:
https://github.com/linrock/nnue-data/blob/32d6a68/filter_high_simple_eval_plain.py
https://github.com/linrock/Stockfish/blob/61dbfe/src/tools/transform.cpp#L626-L655
Training data can be found at:
https://robotmoon.com/nnue-training-data/
Local elo at 25k nodes per move of
L1-128 smallnet (nnue-only eval) vs. L1-128 trained on standard S1 data:
nn-epoch399.nnue : -318.1 +/- 2.1
Passed STC:
https://tests.stockfishchess.org/tests/view/6574cb9d95ea6ba1fcd49e3b
LLR: 2.93 (-2.94,2.94) <0.00,2.00>
Total: 62432 W: 15875 L: 15521 D: 31036
Ptnml(0-2): 177, 7331, 15872, 7633, 203
Passed LTC:
https://tests.stockfishchess.org/tests/view/6575da2d4d789acf40aaac6e
LLR: 2.94 (-2.94,2.94) <0.50,2.50>
Total: 64830 W: 16118 L: 15738 D: 32974
Ptnml(0-2): 43, 7129, 17697, 7497, 49
Bench: 1397884
Co-Authored-By: mstembera