
The problem of pre_train model #2

Open
tim120526 opened this issue Sep 13, 2018 · 4 comments
@tim120526
Hi, XiaoMeng!
The problem occurs when I load the pretrained model you provide at https://drive.google.com/open?id=1EwRuqfGASarGidutnYB8rXLSuzYpEoSM (the file named imagenet_epoch_2_glo_step_128118.pth.tar).
Error:
Missing key(s) in state_dict: "Conv2d_1a_3x3.conv.weight", "Conv2d_1a_3x3.bn.weight", "Conv2d_1a_3x3.bn.bias", "Conv2d_1a_3x3.bn.running_mean", "Conv2d_1a_3x3.bn.running_var", "Conv2d_2a_3x3.conv.weight", "Conv2d_2a_3x3.bn.weight", "Conv2d_2a_3x3.bn.bias", "Conv2d_2a_3x3.bn.running_mean", "Conv2d_2a_3x3.bn.running_var", "Conv2d_2b_3x3.conv.weight", "Conv2d_2b_3x3.bn.weight", "Conv2d_2b_3x3.bn.bias", "Conv2d_2b_3x3.bn.running_mean", "Conv2d_2b_3x3.bn.running_var", "Conv2d_3b_1x1.conv.weight", "Conv2d_3b_1x1.bn.weight", "Conv2d_3b_1x1.bn.bias", "Conv2d_3b_1x1.bn.running_mean", "Conv2d_3b_1x1.bn.running_var", "Conv2d_4a_3x3.conv.weight", "Conv2d_4a_3x3.bn.weight", "Conv2d_4a_3x3.bn.bias", "Conv2d_4a_3x3.bn.running_mean", "Conv2d_4a_3x3.bn.running_var", "Mixed_5b.branch1x1.conv.weight", "Mixed_5b.branch1x1.bn.weight", "Mixed_5b.branch1x1.bn.bias", "Mixed_5b.branch1x1.bn.running_mean", "Mixed_5b.branch1x1.bn.running_var", "Mixed_5b.branch5x5_1.conv.weight", "Mixed_5b.branch5x5_1.bn.weight", "Mixed_5b.branch5x5_1.bn.bias", "Mixed_5b.branch5x5_1.bn.running_mean", "Mixed_5b.branch5x5_1.bn.running_var", "Mixed_5b.branch5x5_2.conv.weight", "Mixed_5b.branch5x5_2.bn.weight", "Mixed_5b.branch5x5_2.bn.bias", "Mixed_5b.branch5x5_2.bn.running_mean", "Mixed_5b.branch5x5_2.bn.running_var", "Mixed_5b.branch3x3dbl_1.conv.weight", "Mixed_5b.branch3x3dbl_1.bn.weight", "Mixed_5b.branch3x3dbl_1.bn.bias", "Mixed_5b.branch3x3dbl_1.bn.running_mean", "Mixed_5b.branch3x3dbl_1.bn.running_var", "Mixed_5b.branch3x3dbl_2.conv.weight", "Mixed_5b.branch3x3dbl_2.bn.weight", "Mixed_5b.branch3x3dbl_2.bn.bias", "Mixed_5b.branch3x3dbl_2.bn.running_mean", "Mixed_5b.branch3x3dbl_2.bn.running_var", "Mixed_5b.branch3x3dbl_3.conv.weight", "Mixed_5b.branch3x3dbl_3.bn.weight", "Mixed_5b.branch3x3dbl_3.bn.bias", "Mixed_5b.branch3x3dbl_3.bn.running_mean", "Mixed_5b.branch3x3dbl_3.bn.running_var", "Mixed_5b.branch_pool.conv.weight", "Mixed_5b.branch_pool.bn.weight", "Mixed_5b.branch_pool.bn.bias", 
"Mixed_5b.branch_pool.bn.running_mean", "Mixed_5b.branch_pool.bn.running_var", "Mixed_5c.branch1x1.conv.weight", "Mixed_5c.branch1x1.bn.weight", "Mixed_5c.branch1x1.bn.bias", "Mixed_5c.branch1x1.bn.running_mean", "Mixed_5c.branch1x1.bn.running_var", "Mixed_5c.branch5x5_1.conv.weight", "Mixed_5c.branch5x5_1.bn.weight", "Mixed_5c.branch5x5_1.bn.bias", "Mixed_5c.branch5x5_1.bn.running_mean", "Mixed_5c.branch5x5_1.bn.running_var", "Mixed_5c.branch5x5_2.conv.weight", "Mixed_5c.branch5x5_2.bn.weight", "Mixed_5c.branch5x5_2.bn.bias", "Mixed_5c.branch5x5_2.bn.running_mean", "Mixed_5c.branch5x5_2.bn.running_var", "Mixed_5c.branch3x3dbl_1.conv.weight", "Mixed_5c.branch3x3dbl_1.bn.weight", "Mixed_5c.branch3x3dbl_1.bn.bias", "Mixed_5c.branch3x3dbl_1.bn.running_mean", "Mixed_5c.branch3x3dbl_1.bn.running_var", "Mixed_5c.branch3x3dbl_2.conv.weight", "Mixed_5c.branch3x3dbl_2.bn.weight", "Mixed_5c.branch3x3dbl_2.bn.bias", "Mixed_5c.branch3x3dbl_2.bn.running_mean", "Mixed_5c.branch3x3dbl_2.bn.running_var", "Mixed_5c.branch3x3dbl_3.conv.weight", "Mixed_5c.branch3x3dbl_3.bn.weight", "Mixed_5c.branch3x3dbl_3.bn.bias", "Mixed_5c.branch3x3dbl_3.bn.running_mean", "Mixed_5c.branch3x3dbl_3.bn.running_var", "Mixed_5c.branch_pool.conv.weight", "Mixed_5c.branch_pool.bn.weight", "Mixed_5c.branch_pool.bn.bias", "Mixed_5c.branch_pool.bn.running_mean", "Mixed_5c.branch_pool.bn.running_var", "Mixed_5d.branch1x1.conv.weight", "Mixed_5d.branch1x1.bn.weight",
........

This suggests that the checkpoint does not match the network definition. Is there some mistake in the model I downloaded? Please enlighten me. Thank you very much!
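As an aside, a "Missing key(s) in state_dict" error like the one above often comes from how the checkpoint was saved rather than from corrupt weights: `.pth.tar` files frequently nest the weights under a `state_dict` entry, and models trained with `nn.DataParallel` prefix every key with `module.`. The helper below is a hedged sketch of unwrapping both layers of indirection (the function name and the toy placeholder values are illustrative, not from this repository):

```python
def unwrap_checkpoint(checkpoint):
    """Return the raw parameter dict from a checkpoint-like mapping.

    Handles two common wrappers: a nested "state_dict" entry and the
    "module." prefix added by nn.DataParallel.
    """
    state = checkpoint.get("state_dict", checkpoint)
    return {k[len("module."):] if k.startswith("module.") else k: v
            for k, v in state.items()}

# Toy illustration with placeholder values standing in for tensors:
ckpt = {"state_dict": {"module.Conv2d_1a_3x3.conv.weight": 0}}
print(unwrap_checkpoint(ckpt))
# → {'Conv2d_1a_3x3.conv.weight': 0}
```

With PyTorch, the cleaned dict would then be passed to `model.load_state_dict(...)`; if keys still mismatch because of genuine architecture differences, `load_state_dict(..., strict=False)` reports rather than raises on them.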

@xiaomengyc
Owner

Did you try the latest code? I tested it just now, and the pre-trained weights load successfully.
If it is still not working for you, could you share the shell script you use to load the weights?

@yeezhu

yeezhu commented Sep 16, 2018

Hi @xiaomengyc
I noticed that your Inception v3 model differs slightly from the official torchvision definition, although both use the same pre-trained weights (inception_v3_google-1a9a5a14.pth).
The differences are as follows:

  1. padding of Conv2d_1a_3x3 (yours vs. torchvision)
  2. padding and stride of Mixed_6a (yours vs. torchvision)
  3. padding of max_pool2d (yours vs. torchvision)

Did you make that change? Could you explain it for me?

Thanks!

@xiaomengyc
Owner

Hi @yeezhu,

  1. I added padding because the default torchvision definition yields feature-map resolutions different from those of the same layers in Caffe and TensorFlow. Adding the padding keeps the resolutions consistent with the baseline methods.
  2. The stride of Mixed_6a is also changed to keep a relatively higher resolution in the final heatmaps.
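The resolution effect described above follows from the standard convolution output-size formula, out = floor((in + 2·padding − kernel) / stride) + 1. A minimal sketch, assuming the usual 299×299 Inception v3 input for Conv2d_1a_3x3 (kernel 3, stride 2):

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a conv/pool layer: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

# Conv2d_1a_3x3 on a 299x299 input, kernel 3, stride 2:
print(conv_out(299, 3, stride=2, padding=0))  # torchvision default → 149
print(conv_out(299, 3, stride=2, padding=1))  # with padding added → 150
```

An even-sized map like 150 halves cleanly under later stride-2 layers, which is why a small padding change at the stem propagates to a different final heatmap resolution.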

@yeezhu

yeezhu commented Sep 17, 2018

Thank you! @xiaomengyc
I modified the code in inception_spg.py to construct SPG-plain and trained it on CUB, but it does not converge.
Here are my settings:

  1. PyTorch 0.4.0, Python 3.6, CUDA 8.0

  2. lr=0.001 for the pretrained weights (before Mixed_6e), lr = 0.01 for others. momentum=0.9, weight_decay=0.0005. (based on the description in section 3.3 of the paper)

  3. The code in inception_spg.py that I changed:

     # side3 = self.side3(x)
     # side3 = self.side_all(side3)
     # 28 x 28 x 192
     x = self.Mixed_6a(x)
     # 28 x 28 x 768
     x = self.Mixed_6b(x)
     # 28 x 28 x 768
     x = self.Mixed_6c(x)
     # 28 x 28 x 768
     x = self.Mixed_6d(x)
     # 28 x 28 x 768
     feat = self.Mixed_6e(x)
    
     # side4 = self.side4(x)
     # side4 = self.side_all(side4)
    
     #Branch 1
     out1, last_feat = self.inference(feat, label=label)
     # self.map1 = out1
    
     # atten_map = self.get_atten_map(self.interp(out1), label, True)
    
     #Branch B
     # out_seg = self.branchB(last_feat)
    
     logits_1 = torch.mean(torch.mean(out1, dim=2), dim=2)
    
     # return [logits_1, side3, side4, out_seg, atten_map]
     return logits_1
    

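The two-rate scheme from item 2 can be sketched as splitting the model's named parameters into two optimizer groups. This is a hypothetical illustration, not the repository's code: the prefix list marking which layers count as "pretrained" is an assumption, and the toy parameter names stand in for what `model.named_parameters()` would yield.

```python
# Assumed split point: layers up to and including Mixed_6e keep the
# pretrained weights (lr=0.001); everything after is trained from
# scratch (lr=0.01), per section 3.3 of the paper.
PRETRAINED_PREFIXES = ("Conv2d", "Mixed_5", "Mixed_6")

def split_param_groups(named_params):
    """Partition (name, param) pairs into two lr groups by name prefix."""
    pretrained, fresh = [], []
    for name, p in named_params:
        (pretrained if name.startswith(PRETRAINED_PREFIXES) else fresh).append(p)
    return [{"params": pretrained, "lr": 0.001},
            {"params": fresh, "lr": 0.01}]

# Toy names in place of real model parameters:
params = [("Mixed_6e.conv.weight", "w1"), ("inference.fc.weight", "w2")]
groups = split_param_groups(params)
print([g["lr"] for g in groups])
# → [0.001, 0.01]
```

In PyTorch the resulting list would be passed directly to the optimizer, e.g. `torch.optim.SGD(groups, momentum=0.9, weight_decay=0.0005)`.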
So, can you share the detailed settings for training SPG-plain on CUB?
Many thanks!
