
Clean up NNUE template parameters #5584

Closed
MinetaS wants to merge 1 commit

Conversation

@MinetaS (Contributor) commented Sep 10, 2024

Having multiple template parameters might suggest that disallowed combinations are possible, such as:

FeatureTransformer<TransformedFeatureDimensionsSmall, &StateInfo::accumulatorBig>

By grouping the parameters into a single struct, the code becomes clearer, more comprehensible, and easier to maintain.

No functional change
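
For illustration, here is a minimal, self-contained sketch of the grouping idea. The stand-in types and the dimension values are hypothetical; only the shape of the change mirrors the actual code:

#include <cstdint>

// Hypothetical stand-ins so the sketch compiles on its own; the real
// definitions live in the NNUE code.
using IndexType = std::uint32_t;

template<IndexType Size>
struct Accumulator {
    std::int16_t values[Size];
};

constexpr IndexType TransformedFeatureDimensionsBig   = 3072;  // placeholder value
constexpr IndexType TransformedFeatureDimensionsSmall = 128;   // placeholder value

struct StateInfo {
    Accumulator<TransformedFeatureDimensionsBig>   accumulatorBig;
    Accumulator<TransformedFeatureDimensionsSmall> accumulatorSmall;
};

// Before: two template parameters that the caller must keep in sync; a
// reader might think mixing them freely is a meaningful option.
//
// After: one parameter struct fixes the valid pairing at its definition:
struct BigFeatureTransformerParams {
    static constexpr IndexType TransformedFeatureDimensions = TransformedFeatureDimensionsBig;
    static constexpr auto      accPtr = &StateInfo::accumulatorBig;
};

template<typename Params>
class FeatureTransformer {
    static constexpr IndexType Dimensions = Params::TransformedFeatureDimensions;
    // Accumulator access goes through Params::accPtr, e.g. (state.*Params::accPtr).
};

// Only the predefined, valid pairings can be named:
using BigTransformer = FeatureTransformer<BigFeatureTransformerParams>;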

@MinetaS mentioned this pull request Sep 10, 2024
@MinetaS force-pushed the 2680c9c7-2-PR branch 2 times, most recently from 3215928 to 132d421 on September 10, 2024 18:07

@vondele (Member) commented Sep 10, 2024

I think that's the right direction; I would still suggest running a non-regression test for these kinds of cleanups.

@Sopel97 (Member) commented Sep 11, 2024

I don't like how the FeatureTransformer is getting information about the whole rest of the network architecture. The code has always looked like it's something special, but it's just a normal layer in the network that happens to do some extra caching. Other than that, it looks good.

@MinetaS (Contributor, Author) commented Sep 11, 2024

I don't like how the FeatureTransformer is getting information about the whole rest of the network architecture. The code has always looked like it's something special, but it's just a normal layer in the network that happens to do some extra caching. Other than that, it looks good.

I agree, but that also suggests that the current code leads to some confusion. The accumulator arrays are tied to the FeatureTransformer and are defined per net type. From my perspective, having the FT take a pointer-to-member does not make it a generic layer, since it relies on specific members of the StateInfo struct.

Let me know if this looks right to you:

template<IndexType _L1, int _L2, int _L3, Accumulator<_L1> StateInfo::*_accPtr>
struct NetworkType {
    static constexpr IndexType L1 = _L1;
    static constexpr int       L2 = _L2;
    static constexpr int       L3 = _L3;

    struct FeatureTransformerType {
        static constexpr IndexType TransformedFeatureDimensions = L1;
        static constexpr Accumulator<TransformedFeatureDimensions> StateInfo::*accPtr = _accPtr;
    };
};

and

    using Transformer = FeatureTransformer<typename Type::FeatureTransformerType>;
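
For concreteness, a hypothetical instantiation under this scheme (15 and 32 below are placeholder L2/L3 values, not the actual network sizes):

// Hypothetical usage of the NetworkType proposal above.
using BigNetworkType =
    NetworkType<TransformedFeatureDimensionsBig, 15, 32, &StateInfo::accumulatorBig>;

// The FeatureTransformer then sees only the nested parameter struct:
using Transformer = FeatureTransformer<BigNetworkType::FeatureTransformerType>;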

@Sopel97 (Member) commented Sep 11, 2024

I was thinking simply

template<IndexType _L1, int _L2, int _L3, Accumulator<_L1> StateInfo::*_accPtr>
struct NetworkType {
    static constexpr IndexType L1 = _L1;
    static constexpr int       L2 = _L2;
    static constexpr int       L3 = _L3;

    static constexpr Accumulator<L1> StateInfo::*accPtr = _accPtr;
};

...
using Transformer = FeatureTransformer<Type::L1, Type::accPtr>;

@MinetaS (Contributor, Author) commented Sep 11, 2024

I was thinking simply

template<IndexType _L1, int _L2, int _L3, Accumulator<_L1> StateInfo::*_accPtr>
struct NetworkType {
    static constexpr IndexType L1 = _L1;
    static constexpr int       L2 = _L2;
    static constexpr int       L3 = _L3;

    static constexpr Accumulator<L1> StateInfo::*accPtr = _accPtr;
};

...
using Transformer = FeatureTransformer<Type::L1, Type::accPtr>;

As I mentioned in the commit message, the two template parameters are grouped because they are not independent. It's never allowed to have FeatureTransformer<TransformedFeatureDimensionsBig, &StateInfo::accumulatorSmall> or vice versa (and there can be more disallowed combinations, e.g. linrock's triple-NNUE experiments). I believe going back to this implementation defeats the whole purpose of this PR.
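
To illustrate (reusing the hypothetical definitions sketched earlier): with separate parameters, the interface invites combinations that are only rejected because the pointer's type happens to depend on the dimension; with grouped parameters, the invalid pairing cannot even be named.

// With separate parameters, the API shape suggests free mixing; the
// mismatch below is a type error only because the pointer type depends
// on the first parameter:
//
//   FeatureTransformer<TransformedFeatureDimensionsBig,
//                      &StateInfo::accumulatorSmall>;  // rejected, but invited
//
// With grouped parameters, only the predefined pairings exist to name:
using ValidTransformer = FeatureTransformer<BigFeatureTransformerParams>;  // OK
// There is simply no params struct pairing big dimensions with accumulatorSmall.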

@Sopel97 (Member) commented Sep 11, 2024

You can't exclude all buggy code at compile time. Even then, nothing would prevent creating two networks that point to the same accumulator.

I don't think this one case of keeping the parameters inseparable warrants tightly coupling the FeatureTransformer with the whole network. You should be able to create the FeatureTransformer as a standalone layer without having to specify anything about the other layers. And I think the PR has a lot of merit outside of this small change.
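
For example (again using the hypothetical NetworkType above, with placeholder L2/L3 values):

// Both aliases compile, yet the two networks would share, and clobber,
// the same cached accumulator state:
using NetA = NetworkType<TransformedFeatureDimensionsBig, 15, 32, &StateInfo::accumulatorBig>;
using NetB = NetworkType<TransformedFeatureDimensionsBig, 15, 32, &StateInfo::accumulatorBig>;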

@MinetaS (Contributor, Author) commented Sep 11, 2024

OK, now I understand. Although revamping the FT template parameters was the main purpose of this PR, it seems to bring other kinds of benefits as well.

@MinetaS (Contributor, Author) commented Sep 12, 2024

The revised code failed to pass the regression test.
https://tests.stockfishchess.org/tests/view/66e1e33686d5ee47d953a5a1

The initial version's test is likely to pass:
https://tests.stockfishchess.org/tests/view/66e0989486d5ee47d953a45d

I believe this is a false negative, but let me know how to proceed.

@Disservin (Member) commented

I'd say give the revised one a rerun

@MinetaS (Contributor, Author) commented Sep 13, 2024

Both tests have failed, so I'm closing this for now. Something seems to be affecting the compiler optimizations.

@MinetaS closed this Sep 13, 2024