
understanding the weight learning algorithm #18

Open
gittea-rpi opened this issue Oct 8, 2024 · 0 comments

In the paper, the weights are the solution to equation (8), which minimizes the sum of the squared Frobenius norms of the weighted RFF covariance matrices over all pairs of features, subject to the constraint that the weights form a probability distribution.
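
For reference, my reading of equation (8) is roughly the following (the notation is mine, not necessarily the paper's): writing $\widehat{\Sigma}^{\,w}_{ij}$ for the weighted RFF covariance matrix between features $i$ and $j$,

```math
\min_{w}\; \sum_{1 \le i < j \le d} \big\| \widehat{\Sigma}^{\,w}_{ij} \big\|_F^2
\qquad \text{s.t.}\quad w_k \ge 0,\; \sum_{k=1}^{n} w_k = 1 .
```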

In the code, the `weight_learner` function solves this problem (?) using gradient descent on a modified objective that combines the squared Frobenius norms of the weighted RFF covariance matrices with an ℓp norm of the weight vector. What is the purpose of the ℓp norm on the weight vector, which is already produced by a softmax over logits and is therefore a probability vector?
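
For concreteness, here is a minimal sketch of my understanding of what `weight_learner` is doing; all names, hyperparameters, and the exact form of the covariance term are my own guesses rather than the repository's actual code:

```python
import torch

def weighted_rff_cov(u, v, w):
    """Weighted covariance between two RFF feature blocks.

    u, v: (n, m) RFF features for two raw features; w: (n,) weights summing to 1.
    Returns the (m, m) weighted cross-covariance matrix.
    """
    mu_u = (w[:, None] * u).sum(dim=0, keepdim=True)   # weighted mean of u
    mu_v = (w[:, None] * v).sum(dim=0, keepdim=True)   # weighted mean of v
    return (w[:, None] * (u - mu_u)).t() @ (v - mu_v)

def learn_weights(rff_feats, n_steps=1000, lr=1e-2, lam=0.1, p=2):
    """Gradient descent on the Frobenius-norm decorrelation loss plus lam * ||w||_p.

    rff_feats: list of (n, m) tensors, one RFF block per raw feature.
    """
    n = rff_feats[0].shape[0]
    logits = torch.zeros(n, requires_grad=True)
    opt = torch.optim.SGD([logits], lr=lr)
    for _ in range(n_steps):
        w = torch.softmax(logits, dim=0)                # w is already a probability vector
        loss = sum(
            weighted_rff_cov(rff_feats[i], rff_feats[j], w).pow(2).sum()
            for i in range(len(rff_feats))
            for j in range(i + 1, len(rff_feats))
        )
        loss = loss + lam * torch.norm(w, p=p)          # the extra lp penalty in question
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.softmax(logits, dim=0).detach()
```

The `lam * torch.norm(w, p=p)` term in this sketch is the part I don't follow.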

Does this somehow ensure that the logits don't go off to infinity? If that is the aim, why not directly regularize by the size of the logits?
