

[FEA] TSNE - Kullback-Leibler Divergence for early stopping #863

Open
danielhanchen opened this issue Jul 19, 2019 · 8 comments
Labels
? - Needs Triage · feature request

Comments

@danielhanchen
Contributor

Currently only the gradient norm is used for early stopping. However, one can compute the actual Kullback-Leibler divergence during the gradient updates, giving a further diagnostic of whether TSNE has reached a stable configuration.
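A minimal CPU sketch of the idea (names are illustrative, not cuML's API; the real version would live inside the CUDA gradient kernels): compute the exact KL each iteration and stop once it plateaus, instead of watching only the gradient norm.

```python
import numpy as np

def kl_divergence(P, Q, eps=1e-12):
    """Exact KL(P || Q) = sum(P * log(P / Q)) over all affinity entries."""
    P = np.maximum(P, eps)
    Q = np.maximum(Q, eps)
    return float(np.sum(P * np.log(P / Q)))

def should_stop(kl_history, tol=1e-5, patience=3):
    """Stop when KL has changed by less than `tol` for `patience`
    consecutive iterations (a hypothetical early-stopping rule)."""
    if len(kl_history) < patience + 1:
        return False
    recent = kl_history[-(patience + 1):]
    deltas = [abs(recent[i + 1] - recent[i]) for i in range(patience)]
    return all(d < tol for d in deltas)
```

The training loop would append the running KL value to `kl_history` after each gradient update and break once `should_stop` returns True.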

danielhanchen added the ? - Needs Triage and feature request labels Jul 19, 2019
@cjnolet
Member

cjnolet commented Jul 19, 2019

Will the new KLDivergence prim help you here?

@danielhanchen
Contributor Author

Oh, so in the naive kernel I already update the KLD loss incrementally; I just haven't put it in the Barnes-Hut version yet. But I'll check the new prim out.

Most likely I'll continue using the naive TSNE version, since no repeated calculations are made and only P*log(P/Q) is needed. I.e. I have P and Q, but they're never stored in a vector; they're discarded during the algorithm. So I need an incremental version.

@teju85
Member

teju85 commented Jul 23, 2019

What do you mean by incremental KLD version?

@danielhanchen
Contributor Author

@teju85 So the KL prim is KL_Prim(P, Q), and it computes KL = sum(P * log(P / Q)) over full vectors.

In TSNE, P and Q are never formed explicitly as vectors; their entries are computed on the fly. Hence the KL divergence I'm using will incrementally add to KL.
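To sketch what "incremental" means here (a Python stand-in for the CUDA kernel; the function name is illustrative): each (p_ij, q_ij) pair is consumed as it is produced and immediately discarded, so the O(n²) P and Q vectors are never materialized.

```python
import math

def incremental_kl(pq_pairs, eps=1e-12):
    """Accumulate KL = sum(p * log(p / q)) one term at a time.

    `pq_pairs` yields (p_ij, q_ij) pairs as the algorithm computes
    them, so no full P or Q vector ever needs to be stored."""
    kl = 0.0
    for p, q in pq_pairs:
        if p > eps:  # terms with p == 0 contribute nothing
            kl += p * math.log(p / max(q, eps))
    return kl
```

A vectorized prim like KL_Prim(P, Q) computes the same sum but requires both inputs as dense arrays; the incremental form trades that memory for a running scalar accumulator, which is what the naive TSNE kernel can update in place.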

@teju85
Member

teju85 commented Jul 23, 2019

Got it, makes sense now. Thanks @danielhanchen for explaining.

Minor point: you can reduce a couple of lines of code in yours if you decide to use the KLDOp defined here.

@danielhanchen
Contributor Author

Yep, sorry for the delay! I'll definitely use that :)

zbjornson added a commit to zbjornson/cuml that referenced this issue Jul 11, 2020
I assume the intent is to implement this after rapidsai#863 is completed, so I didn't remove it completely and just marked it as unused.
@drobison00
Contributor

@danielhanchen Do you know the current state of this work?

@danielhanchen
Contributor Author

@drobison00 I haven't been able to get around to this, sadly.
I did, however, write up some equations in https://github.com/danielhanchen/tsne/blob/master/TSNE%20Extended%20Notebook.ipynb which might be helpful.

The main aim is to find the KL sum for each point, which can be done inside the attractive-forces kernel.
So some manipulation of the formulas will be needed to get the sum of KL_ij.
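As a rough NumPy analogue of that idea (this is a hypothetical CPU sketch, not cuML's CUDA kernel): the attractive-forces pass already touches every (i, j) pair with nonzero p_ij, so it can accumulate a per-point KL partial sum KL_i in the same loop, and the total KL is just the sum over points.

```python
import numpy as np

def attractive_pass_with_kl(P, Y, eps=1e-12):
    """Attractive-forces pass that also returns per-point KL sums.

    P: dense affinity matrix, zero diagonal, normalized over all pairs.
    Y: (n, 2) low-dimensional embedding.
    Illustrative CPU sketch; the real kernel would fuse this on the GPU."""
    n = Y.shape[0]
    # Student-t kernel: q_ij proportional to 1 / (1 + ||y_i - y_j||^2)
    d2 = np.square(Y[:, None, :] - Y[None, :, :]).sum(-1)
    num = 1.0 / (1.0 + d2)
    np.fill_diagonal(num, 0.0)
    Q = num / num.sum()

    forces = np.zeros_like(Y)
    kl_per_point = np.zeros(n)
    for i in range(n):
        # attractive force on point i (standard t-SNE gradient term)
        w = (P[i] * num[i])[:, None]
        forces[i] = (w * (Y[i] - Y)).sum(0)
        # per-point KL partial sum, accumulated in the same pass
        mask = P[i] > eps
        kl_per_point[i] = np.sum(
            P[i][mask] * np.log(P[i][mask] / np.maximum(Q[i][mask], eps)))
    return forces, kl_per_point
```

Summing `kl_per_point` after the pass gives the exact KL for the iteration at essentially no extra memory cost, which is what makes it usable as an early-stopping diagnostic.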
