
explosion at 14000/500000 #1

Open
frolf opened this issue Apr 3, 2017 · 7 comments

Comments

frolf commented Apr 3, 2017

Not sure exactly what happened, but while training with your code with the default parameters and TensorBoard enabled, training blew up at 14000/500000.

[attached screenshots: beganoops, beganoops2 — TensorBoard loss curves at the explosion]

The network then began outputting completely black images for all three outputs (_D, _D, and _D_fake).

Not much more insight from my end, I'm afraid. Trained on a Titan Xp with pytorch 0.1.10 (py27_1cu80 [cuda80], soumith channel), torchvision 0.1.6, and Python 2.7.13 on Ubuntu 16.04.

Loss_D went from 0.0436 to 1.45, and L_x went from 0.0436 to 1.522.

@carpedm20 (Owner)
I'm working on this and other issues. The autoencoder (which is D) trains well, but G doesn't. G seems to be on the right path only in the very early steps, but when k_t goes near 0, it goes wrong.
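For context, the k_t mentioned here is BEGAN's balancing term from the paper; a dependency-free sketch of its update rule (the loss values and the `update_k` name are illustrative, not this repo's code):

```python
# Sketch of BEGAN's balance update from the paper:
# k_{t+1} = clip(k_t + lambda_k * (gamma * L(x) - L(G(z))), 0, 1).
# gamma=0.5 and lambda_k=0.001 are the paper's defaults; the loss
# values below are made up for demonstration.
def update_k(k_t, loss_real, loss_fake, gamma=0.5, lambda_k=0.001):
    k_t = k_t + lambda_k * (gamma * loss_real - loss_fake)
    return min(max(k_t, 0.0), 1.0)  # clip to [0, 1]

k = 0.0
# If G's reconstruction loss stays above gamma * L(x), every update
# pushes k downward, so it sits clamped at 0 -- the regime described
# in the comment above.
for _ in range(100):
    k = update_k(k, loss_real=0.05, loss_fake=0.10)
print(k)  # stays at 0.0
```

With k_t pinned at 0, the discriminator objective ignores the fake-image term entirely, which is one way the generator can stop receiving a useful signal.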

@carpedm20 (Owner)

After I switched z_dim from 1024 to 124, I believe the explosion with the default parameters is solved.

@malreddysid

Hi! Were you able to recreate the results given in the paper?

carpedm20 commented Apr 6, 2017

Not yet. But I will.

@malreddysid

https://github.com/rcalland/chainer-BEGAN
Here is another implementation, written in Chainer. It might help you.

carpedm20 commented Apr 6, 2017

@malreddysid Yeah, I saw it too, but it doesn't have a learning rate schedule, which in my opinion is important for getting out of mode collapse. I found it hard to understand pytorch's gradient flow, so I'm working on a tensorflow implementation that is better than this one and actually shows some faces.
[attached images: generated face samples]
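A hypothetical sketch of the kind of learning rate schedule being referred to: decay the rate when the convergence measure stops improving. The function name, the `patience` heuristic, and the history values are all illustrative assumptions, not the repo's actual code:

```python
# Hypothetical plateau-based decay: halve the learning rate when the
# last `patience` convergence measures show no new best value.
# All names and numbers here are illustrative.
def maybe_decay_lr(lr, measure_history, patience=3, factor=0.5, min_lr=1e-6):
    if len(measure_history) > patience:
        recent = measure_history[-patience:]
        best_before = min(measure_history[:-patience])
        if min(recent) >= best_before:  # no improvement recently
            return max(lr * factor, min_lr)
    return lr

lr = 8e-5  # the BEGAN paper's initial learning rate
history = [0.30, 0.25, 0.26, 0.26, 0.27]  # measure stalled after step 2
lr = maybe_decay_lr(lr, history)
print(lr)  # halved to 4e-05
```

The idea is that a shrinking learning rate lets the k_t balance recover instead of the generator locking into a single mode.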

Tyhye commented May 9, 2017

Hi! Maybe you could change the L1Loss() to torch.mean(torch.abs(D - X)).
I have tried it this way. The losses become similar to your tensorflow version, but the generated images are still bad.
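One plausible reading of this suggestion is a difference in how the absolute error is reduced (e.g. summed versus averaged over every element, as a `reduce_mean`-style call would do), which changes the loss scale. A dependency-free sketch of the two reductions, with made-up values:

```python
# Plain-Python sketch contrasting two reductions of the absolute error:
# a sum versus a per-element mean (what torch.mean(torch.abs(D - X))
# computes). All values below are made up for illustration.
def l1_sum(pred, target):
    return sum(abs(p - t) for p, t in zip(pred, target))

def l1_mean(pred, target):
    return l1_sum(pred, target) / len(pred)

d_out = [0.2, 0.4, 0.6, 0.8]  # hypothetical autoencoder output
x     = [0.0, 0.5, 0.5, 1.0]  # hypothetical target image (flattened)
print(l1_sum(d_out, x))   # ~0.6
print(l1_mean(d_out, x))  # ~0.15 -- same error, smaller scale
```

A scale change alone would explain loss curves moving closer to another implementation's without the image quality improving.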
