Detach in Lab3-2 & 3-3 #20

Open · pandasfang opened this issue Dec 7, 2017 · 4 comments

pandasfang commented Dec 7, 2017

Dear TA:

In Lab 3-2, why don't we need to detach the Discriminator when we backpropagate through the Generator?

############################
# (2) Update G network: maximize log(D(G(z)))
###########################
netG.zero_grad()
labelv = Variable(label.fill_(real_label))  # fake labels are real for generator cost
output = netD(fake)
errG = criterion(output, labelv)
errG.backward()
D_G_z2 = output.data.mean()
optimizerG.step()

hui-po-wang (Contributor) commented Dec 7, 2017

Hi @pandasfang,

In optimizerG = optim.Adam(netG.parameters(), lr=opt.lr, betas=(opt.beta1, 0.999)), we tell the optimizer that it only needs to update the parameters of the generator. That is, although netD will receive gradients, its parameters won't be updated by optimizerG.step(), so we don't have to detach it.
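
To make this concrete, here is a minimal sketch (fc1 and fc2 below are just hypothetical stand-ins for netG and netD, not the lab code): even though both modules receive gradients from backward(), only the module whose parameters were handed to the optimizer is changed by step().

import torch
import torch.nn as nn
import torch.optim as optim
from torch.autograd import Variable

fc1 = nn.Linear(1, 2)                          # plays the role of netG
fc2 = nn.Linear(2, 1)                          # plays the role of netD
opt2 = optim.Adam(fc2.parameters(), lr=1e-1)   # like optimizerD: only knows about fc2

x = Variable(torch.FloatTensor([5]))
cost = (fc2(fc1(x)) - x) ** 2

opt2.zero_grad()
cost.backward()                     # both fc1 and fc2 receive gradients
before = fc1.weight.data.clone()
opt2.step()                         # but only fc2's parameters are updated
print(torch.equal(before, fc1.weight.data))    # True: fc1 is untouched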

Now, you may have another question: why do we call detach in the line output = netD(fake.detach())? The answer is that calling detach there is not strictly necessary for correctness.

Consider the following example, which is a very simple auto-encoder.

import torch
import torch.nn as nn
import torch.optim as optim
from torch.autograd import Variable

fc1 = nn.Linear(1, 2)
fc2 = nn.Linear(2, 1)
opt1 = optim.Adam(fc1.parameters(), lr=1e-1)
opt2 = optim.Adam(fc2.parameters(), lr=1e-1)

x = Variable(torch.FloatTensor([5]))

# First pass: no detach, so gradients flow back through both fc2 and fc1.
z = fc1(x)
x_p = fc2(z)
cost = (x_p - x) ** 2
# print(z)
# print(x_p)
# print(cost)
opt1.zero_grad()
opt2.zero_grad()

cost.backward()
for n, p in fc1.named_parameters():
    print(n, p.grad.data)

for n, p in fc2.named_parameters():
    print(n, p.grad.data)


# Second pass: z is detached before it is fed to fc2, so fc1 receives no gradient.
opt1.zero_grad()
opt2.zero_grad()

z = fc1(x)
x_p = fc2(z.detach())
cost = (x_p - x) ** 2

cost.backward()
for n, p in fc1.named_parameters():
    print(n, p.grad.data)

for n, p in fc2.named_parameters():
    print(n, p.grad.data)

The output would be:

weight 
 12.0559
 -8.3572
[torch.FloatTensor of size 2x1]

bias 
 2.4112
-1.6714
[torch.FloatTensor of size 2]

weight 
-33.5588 -19.4411
[torch.FloatTensor of size 1x2]

bias 
-9.9940
[torch.FloatTensor of size 1]

================================================

weight 
 0
 0
[torch.FloatTensor of size 2x1]

bias 
 0
 0
[torch.FloatTensor of size 2]

weight 
-33.5588 -19.4411
[torch.FloatTensor of size 1x2]

bias 
-9.9940
[torch.FloatTensor of size 1]

You can see that detaching the result from fc1 has no influence on the gradients of fc2. Once we know those gradients won't be affected, we can simply use optimizerD (which only updates the parameters of the discriminator) to update netD without worrying about the generator, even when we don't detach. However, not detaching the parts you don't need may incur some additional computational cost, because backward() will still compute gradients for the generator that are never used.
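
As a rough illustration of that last point (reusing the fc1/fc2 toy model above, just a sketch): detaching means backward() never has to traverse fc1's part of the graph at all, which is exactly the work you would otherwise compute and then throw away.

z = fc1(x)
print(z.grad_fn is not None)     # True: z carries the history of fc1's forward pass
z_d = z.detach()
print(z_d.grad_fn is None)       # True: the detached copy has no history, so a
                                 # backward() through fc2 never touches fc1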

Thanks

hui-po-wang (Contributor) commented Dec 7, 2017

I think it's a good question, and you guys can verify whether what I said is right (maybe I am wrong, since I am still learning too :) ).

If possible, please keep this thread open; I think it would be helpful for people who want to learn more about detach.

You are also very welcome to discuss it with me.

Thanks

yyrkoon27 commented Dec 8, 2017

Soumith's reply in this thread might also clarify things a little bit...
https://github.com/pytorch/examples/issues/116

hui-po-wang (Contributor) commented

Hi @yyrkoon27,

In this case, it's right. In a VAE-GAN, though, detach may be needed for correctness if you use, for example, opt1 = optim.RMSprop(G.parameters(), lr=1e-1) where G consists of an encoder and a decoder.
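
To sketch why (hypothetical code, not from any lab; enc and dec stand in for the two halves of G): when a single optimizer owns the parameters of both submodules, any gradient that reaches a submodule will move it on step(), so detach is the only way to keep a particular loss from training the part you want frozen.

enc = nn.Linear(1, 2)                  # stand-in for the encoder of G
dec = nn.Linear(2, 1)                  # stand-in for the decoder of G
opt1 = optim.RMSprop(list(enc.parameters()) + list(dec.parameters()), lr=1e-1)

x = Variable(torch.FloatTensor([5]))
h = enc(x)

# Suppose this loss is only meant to train dec. Because opt1 also owns enc's
# parameters, we must detach h; otherwise opt1.step() would update enc as well.
loss = (dec(h.detach()) - x) ** 2
opt1.zero_grad()
loss.backward()
opt1.step()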
