
Since the variable detached from the graph, why does the spectral regularization help? #13

Open
weix-liu opened this issue Nov 21, 2020 · 6 comments

@weix-liu

# the generated image leaves the autograd graph here
img_numpy = gen_imgs[t,:,:,:].cpu().detach().numpy()
...

# the 1D spectrum is re-wrapped in a fresh tensor, which starts a new graph
psd1D_rec = torch.from_numpy(psd1D_rec).float()
psd1D_rec = Variable(psd1D_rec, requires_grad=True).to(device)

# both arguments to the loss are therefore disconnected from the generator
loss_freq = criterion_freq(psd1D_rec, psd1D_img.detach())

@xuhewei98

Hello, does the torchprob module need to be implemented by ourselves? On my side it reports that this module does not exist.

@victorca25

victorca25 commented Jan 2, 2021

The whole "loss_freq" is created from two numpy arrays; it doesn't seem to be carrying any gradient calculations.

I don't think it's contributing anything to the training other than scaling the loss.
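For illustration, a minimal self-contained sketch (not the repository's code; the tensor names only mirror the snippet above, and MSE stands in for criterion_freq) showing that a loss built from values round-tripped through numpy gets a gradient itself but passes nothing back to the generator:

import torch

# Stand-ins for a generator parameter and the images produced from it.
gen_param = torch.randn(4, requires_grad=True)
gen_imgs = (gen_param * 2.0).reshape(2, 2)

# Same pattern as in the snippet above: .detach().numpy() leaves the autograd
# graph, and re-wrapping the numpy result starts a completely new graph.
img_numpy = gen_imgs.cpu().detach().numpy()
psd1D_rec = torch.from_numpy(img_numpy).float().requires_grad_(True)

# Placeholder criterion; the point is only where the gradients go.
loss_freq = torch.nn.functional.mse_loss(psd1D_rec, torch.zeros_like(psd1D_rec))
loss_freq.backward()

print(psd1D_rec.grad is not None)  # True: the re-wrapped tensor gets a gradient
print(gen_param.grad)              # None: nothing reaches the generator parameter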

@njuaplusplus

The whole "loss_freq" is created from two numpy arrays; it doesn't seem to be carrying any gradient calculations.

I don't think it's contributing anything to the training other than scaling the loss.

This implementation cannot support the paper's claims. It is strange, then, that the original paper achieves the claimed effects in its evaluation: reducing the spectral difference and stabilizing the training.

@njuaplusplus

The original paper said that

and AI is differentiable, a simple choice for L_Spectral is the binary cross entropy between the generated output AI_out and the mean AI_real obtained from real samples

But the implementation is completely different.

According to eq. 10 and eq. 11 in the paper, the loss should back-propagate through AI and thus through the FFT as well.
However, in #5 the author said

Notice that we do not back propagate through the fft (that's why there is the detach).

Relevant issues: #5 #6 #8
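As a rough illustration of what eq. 10 and eq. 11 imply, a sketch of a fully differentiable spectral loss written directly in torch (so gradients flow through the FFT; the helper names are hypothetical, and this is neither the paper's nor this repository's implementation) might look like:

import torch
import torch.nn.functional as F

def azimuthal_integration(img):
    # Differentiable 1D power-spectrum profile (AI) of a single-channel image.
    # Hypothetical helper: everything stays in torch, so gradients flow through the FFT.
    h, w = img.shape[-2:]
    fft = torch.fft.fftshift(torch.fft.fft2(img), dim=(-2, -1))
    psd2d = fft.abs() ** 2
    # radial bin index of every frequency coordinate
    yy, xx = torch.meshgrid(
        torch.arange(h, device=img.device) - h // 2,
        torch.arange(w, device=img.device) - w // 2,
        indexing="ij",
    )
    r = torch.sqrt(yy.float() ** 2 + xx.float() ** 2).round().long().flatten()
    n_bins = int(r.max()) + 1
    # average the power in each radial bin (index_add_ is differentiable w.r.t. psd2d)
    power = torch.zeros(n_bins, device=img.device).index_add_(0, r, psd2d.flatten())
    counts = torch.zeros(n_bins, device=img.device).index_add_(0, r, torch.ones_like(r, dtype=torch.float))
    profile = power / counts.clamp(min=1)
    return profile / profile[0]  # normalise by the DC bin so values are roughly in [0, 1]

def spectral_loss(gen_img, ai_real_mean):
    # BCE between the generated profile and a precomputed mean real profile,
    # both assumed to lie in [0, 1]; only the real-side profile is detached.
    ai_out = azimuthal_integration(gen_img).clamp(0.0, 1.0)
    return F.binary_cross_entropy(ai_out, ai_real_mean.detach())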

@yuetung

yuetung commented Jul 12, 2021

Take a look at this latest CVPR 2021 work by Chandrasegaran et al.: https://arxiv.org/pdf/2103.17195.pdf

Official implementation: https://github.com/sutd-visual-computing-group/Fourier-Discrepancies-CNN-Detection

Chandrasegaran et al. show that this discrepancy can be avoided by using nearest or bilinear interpolation for feature-map scaling in the last upsampling step (instead of a transpose convolution) in LSGAN, DCGAN and WGAN-GP on CelebA. In fact, this particular work also uses nearest interpolation for the last feature-map scaling in the generator (see line 53 in module_spectrum.py).

Chandrasegaran et al. also show that these discrepancies are not intrinsic in many similar setups. See their supplementary material (section B), where they are likewise puzzled by how generator loss scaling can achieve spectral consistency.

So I think that in these GAN setups, nearest/bilinear interpolation in the last upsampling step is the key to avoiding these spectral-decay discrepancies, since it injects less high-frequency content into the feature maps than transpose convolutions do.
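To make that design choice concrete, a small sketch of the two variants of a generator's last upsampling block (channel sizes are placeholders, not taken from any particular repo):

import torch.nn as nn

# Transpose-convolution upsampling: tends to inject checkerboard / high-frequency
# artefacts into the final feature maps.
last_block_tconv = nn.Sequential(
    nn.ConvTranspose2d(64, 3, kernel_size=4, stride=2, padding=1),
    nn.Tanh(),
)

# Nearest-neighbour (or bilinear) upsampling followed by a plain convolution:
# the interpolation adds little high-frequency content, which Chandrasegaran et al.
# identify as the key factor for spectral consistency.
last_block_nearest = nn.Sequential(
    nn.Upsample(scale_factor=2, mode="nearest"),
    nn.Conv2d(64, 3, kernel_size=3, stride=1, padding=1),
    nn.Tanh(),
)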

@Xiaodong-Bran

@yuetung Thanks for sharing this!
