
KeyError: 'unexpected key "conv1.weight" in state_dict' #3

Closed
John1231983 opened this issue Mar 15, 2018 · 14 comments

Comments

@John1231983

John1231983 commented Mar 15, 2018

I am using the ImageNet pretrained model, and when I run training I get the error below. How can I fix it? Thanks. My PyTorch version is 0.4.0, Python 3.6.
The pretrained model is the official one from torchvision. I guess we have to use the Keras pretrained model to fix it. However, how could we use the official pretrained model from https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py?

Loading weights  /home/john/pytorch-mask-rcnn/resnet101-5d3b4d8f.pth
Traceback (most recent call last):
  File "train.py", line 28, in <module>
    model.load_weights(model_path)
  File "/home/john/pytorch-mask-rcnn/model.py", line 1563, in load_weights
    self.load_state_dict(torch.load(filepath))
  File "/home/john/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 538, in load_state_dict
    .format(name))
KeyError: 'unexpected key "conv1.weight" in state_dict'
@lasseha
Collaborator

lasseha commented Mar 15, 2018

It should definitely be possible to load pretrained ImageNet weights from torchvision. The resnet model used here is just a rewritten version of the torchvision model that allows direct access to the different layers, so renaming the keys and removing the fully connected layer weights should be sufficient to load the weights. You can have a look at convert_from_keras.py to see how to modify a pretrained model file.
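
For reference, a rough sketch of that conversion. Dropping the fc weights mirrors what is described above; the renaming step is left as a placeholder, because the exact key names expected by the rewritten resnet would have to be taken from model.py (or from convert_from_keras.py):

    import torch
    import torchvision.models

    # Start from the official torchvision checkpoint.
    state_dict = torchvision.models.resnet101(pretrained=True).state_dict()

    # Drop the ImageNet classifier weights; the Mask R-CNN backbone has no fc layer.
    state_dict = {k: v for k, v in state_dict.items() if not k.startswith('fc.')}

    # Rename the remaining keys to match the layer names of the rewritten resnet
    # in model.py. The identity mapping below is only a placeholder -- see
    # convert_from_keras.py for the kind of renaming that is actually needed.
    renamed = {key: value for key, value in state_dict.items()}

    torch.save(renamed, 'resnet101_imagenet_from_torchvision.pth')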

@John1231983
Author

But which pretrained weights is your code written for: the ones converted from Keras, or the official PyTorch ones? If it is the official PyTorch ones, then I think no change should be needed, because the model would match the official PyTorch implementation. I downloaded the pretrained weights from the link you gave me (torchvision), but I get the error above when I use them.

@lasseha
Collaborator

lasseha commented Mar 15, 2018

Then I misunderstood. For direct use with this code, download the pretrained models converted from Keras.

@John1231983
Author

John1231983 commented Mar 16, 2018

Thanks @lasseha for your help. I have downloaded the pretrained weights and run the code, but I got a new error. Hence, I have two questions:

  1. If I use your weights converted from Keras, I get the error below. How can we fix it?
  2. If I want to use the pretrained weights from the PyTorch website, how should I modify your code? Because I get the error shown above. Keras does not provide a pretrained resnet101, so I would prefer not to go through the convert_from_keras.py route.
Loading weights  /home/john/pytorch-mask-rcnn/resnet50_imagenet.pth
Traceback (most recent call last):
  File "/home/john/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 530, in load_state_dict
    own_state[name].copy_(param)
RuntimeError: The expanded size of the tensor (2) must match the existing size (81) at non-singleton dimension 0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 28, in <module>
    model.load_weights(model_path)
  File "/home/john/pytorch-mask-rcnn/model.py", line 1563, in load_weights
    self.load_state_dict(torch.load(filepath))
  File "/home/john/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 535, in load_state_dict
    .format(name, own_state[name].size(), param.size()))
RuntimeError: While copying the parameter named mask.conv5.bias, whose dimensions in the model are torch.Size([2]) and whose dimensions in the checkpoint are torch.Size([81]).

@lasseha
Collaborator

lasseha commented Mar 16, 2018

  1. The downloadable model was an old version that I have now updated. Please try again with the updated version. You will probably need to load the model as
self.load_state_dict(torch.load(filepath), strict=False) (line 1563 in model.py); see the sketch after this list.
  2. As Keras does not provide a pretrained resnet101 model, the resnet50 model was used instead. See here for the discussion about the topic. You're right that ideally we would use the pretrained resnet101 model from torchvision, which should be usable with this code after making the adjustments mentioned in my first comment.
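
For what it's worth, here is a minimal, self-contained illustration of what strict=False changes (the toy two-layer model below is made up for illustration, not taken from model.py): keys that are missing from the checkpoint are skipped instead of raising, which is what lets a backbone-only weight file load into the full model. It does not ignore shape mismatches, which is why the updated weight file is still needed.

    import torch
    import torch.nn as nn

    # Toy model standing in for Mask R-CNN: layer 0 plays the role of the
    # pretrained backbone, layer 1 the heads that are trained from scratch.
    model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.Conv2d(8, 16, 3))

    # A checkpoint that only contains weights for layer 0.
    checkpoint = {'0.weight': torch.zeros(8, 3, 3, 3), '0.bias': torch.zeros(8)}

    # strict=False: layer 0 is loaded, layer 1 keeps its random initialization.
    # With strict=True this would instead raise an error about missing keys.
    model.load_state_dict(checkpoint, strict=False)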

@John1231983
Author

Thanks so much @lasseha. That solved the problem of loading the ImageNet weights. However, I now get another error:

    layers='heads')
  File "/home/john/pytorch-mask-rcnn/model.py", line 1818, in train_model
    loss = self.train_epoch(train_generator, optimizer, self.config.STEPS_PER_EPOCH)
  File "/home/john/pytorch-mask-rcnn/model.py", line 1873, in train_epoch
    self.predict([images, image_metas, gt_class_ids, gt_boxes, gt_masks], mode='training')
  File "/home/john/pytorch-mask-rcnn/model.py", line 1739, in predict
    detection_target_layer(rpn_rois, gt_class_ids, gt_boxes, gt_masks, self.config)
  File "/home/john/pytorch-mask-rcnn/model.py", line 569, in detection_target_layer
    crowd_ix = torch.nonzero(gt_class_ids < 0)[:, 0]
IndexError: too many indices for tensor of dimension 1

Note that I have run the Matterport implementation successfully, but it is too slow, so I prefer your code.

@lasseha
Collaborator

lasseha commented Mar 19, 2018

Which dataset do you use for training? COCO or some custom dataset?

@John1231983
Author

@lasseha: I used a custom dataset. It does not read JSON; it reads the masks and images directly from a folder, similar to the shapes dataset in the Matterport code.

@lasseha
Collaborator

lasseha commented Mar 19, 2018

I do not see this error with COCO, so maybe try to find the difference between those datasets.

@John1231983
Author

Thanks, I will try. Could you tell me how we can use the official PyTorch pretrained model instead of the one converted from Matterport?

@lasseha
Collaborator

lasseha commented Mar 22, 2018

Again, I refer to my first comment:

It should definitely be possible to load pretrained ImageNet weights from torchvision. The resnet model used here is just a rewritten version of the torchvision model that allows direct access to the different layers, so renaming the keys and removing the fully connected layer weights should be sufficient to load the weights. You can have a look at convert_from_keras.py to see how to modify a pretrained model file.

@lasseha lasseha closed this as completed Mar 29, 2018
@rafisef

rafisef commented Jul 5, 2018

Hey @lasseha, I am getting the same error as @John1231983, except I am trying to train on the COCO dataset, specifically 2014. Any idea what is causing this error?

  File "/home/cvds_lab/rseferya/pytorch-mask-rcnn/model.py", line 569, in detection_target_layer
    crowd_ix = torch.nonzero(gt_class_ids < 0)[:, 0]
IndexError: too many indices for tensor of dimension 1

@hongzhenwang

@rafisef @lasseha I am hitting the same problem: PyTorch 0.4, Python 3.6, COCO 2014.

@hongzhenwang

hongzhenwang commented Aug 2, 2018

@rafisef Solved! In PyTorch 0.4 the dimension of an empty tensor is 1, not 0. This is because scalars were introduced, so [] becomes dim 1 (see pytorch/pytorch#7240).
You can change line 568 from:
if torch.nonzero(gt_class_ids < 0).size()
to:
if torch.nonzero(gt_class_ids < 0).nelement()

@lasseha could update the code to support PyTorch 0.4.
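
A small runnable sketch of why this works (assuming PyTorch 0.4): when there are no crowd annotations, nonzero() returns an empty tensor of dimension 1, so its .size() is a non-empty (truthy) torch.Size while .nelement() is 0, which is why the changed guard correctly skips the crowd branch.

    import torch

    # No crowd boxes: no negative class ids in the ground truth.
    gt_class_ids = torch.tensor([1, 2, 3])
    crowd = torch.nonzero(gt_class_ids < 0)

    print(crowd.size())       # non-empty torch.Size -> truthy, so the old guard enters the branch
    print(crowd.nelement())   # 0 -> falsy, so the suggested guard skips it

    if crowd.nelement():
        crowd_ix = crowd[:, 0]   # only reached when crowd boxes actually exist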
