
Encountered an error while running the demo script for few-shot VOC inference in object detection. #76

Open
bigz321 opened this issue Nov 26, 2024 · 10 comments


@bigz321

bigz321 commented Nov 26, 2024

Hello, I am trying to use the demo to run inference for few-shot VOC. My code is as follows:

def main(
        config_file="configs/few-shot-voc/10shot/vitl_3s.yaml",
        rpn_config_file="configs/VOC_RPN/faster_rcnn_R_50_C4.few_shot_s3.yaml",
        model_path="weights/trained/few-shot-voc/3/vitl_0014999.pth", 
        image_dir='demo/input',
        output_dir='demo/output', 
        category_space="demo/ycb_prototypes.pth",
        device='cpu',
        overlapping_mode=True,
        topk=1,
        output_pth=False,
        threshold=0.45
    ):

I don't know which part I did wrong, but I got the following error:

RuntimeError: Error(s) in loading state_dict for OpenSetDetectorWithExamples:
Missing key(s) in state_dict: "test_class_weight", "mask_intra_dist_emb.weight", "mask_intra_dist_emb.bias", "mask_bg_dist_emb.weight", "mask_bg_dist_emb.bias", "mask_feat_compress.0.weight", "mask_feat_compress.0.bias", "mask_feat_compress.1.weight", "mask_feat_compress.1.bias", "mask_feat_compress.2.weight", "mask_feat_compress.2.bias", "fc_init_mask.weight", "fc_init_mask.bias", "mp_layers.0.0.weight", "mp_layers.0.0.bias", "mp_layers.0.1.weight", "mp_layers.0.1.bias", "mp_layers.1.0.weight", "mp_layers.1.0.bias", "mp_layers.1.1.weight", "mp_layers.1.1.bias", "mp_layers.2.0.weight", "mp_layers.2.0.bias", "mp_layers.2.1.weight", "mp_layers.2.1.bias", "mp_layers.3.0.weight", "mp_layers.3.0.bias", "mp_layers.3.1.weight", "mp_layers.3.1.bias", "mp_layers.4.0.weight", "mp_layers.4.0.bias", "mp_layers.4.1.weight", "mp_layers.4.1.bias", "mp_out_layers.0.weight", "mp_out_layers.0.bias", "mp_out_layers.1.weight", "mp_out_layers.1.bias", "mp_out_layers.2.weight", "mp_out_layers.2.bias", "mp_out_layers.3.weight", "mp_out_layers.3.bias", "mp_out_layers.4.weight", "mp_out_layers.4.bias", "mask_deconv.0.weight", "mask_deconv.0.bias", "mask_deconv.1.weight", "mask_deconv.1.bias", "mask_predictor.weight", "mask_predictor.bias". So do you have any idea to help me out? Thank you very much!

@mlzxy
Owner

mlzxy commented Nov 26, 2024

The model complains that weights are missing in the instance mask segmentation branch.

Maybe you loaded the wrong checkpoint? If you want to use the mask branch, you may want to load the ViT-L model trained on LVIS instead of the one trained on VOC.
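
As a minimal diagnostic sketch (assuming `model` and `model_path` are already set up as in demo/demo.py), you can compare the keys the checkpoint provides with the keys the model expects, to confirm that the missing ones belong to the mask branch:

import torch

# Compare checkpoint keys against the model's expected keys.
ckpt = torch.load(model_path, map_location="cpu")
ckpt_keys = set(ckpt["model"].keys()) if "model" in ckpt else set(ckpt.keys())
model_keys = set(model.state_dict().keys())

missing = sorted(model_keys - ckpt_keys)      # expected by the model, absent from the checkpoint
unexpected = sorted(ckpt_keys - model_keys)   # present in the checkpoint, unused by the model
print("missing (mask-related):", [k for k in missing if k.startswith(("mask_", "mp_"))][:10])
print("unexpected:", unexpected[:10])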

@bigz321
Author

bigz321 commented Nov 26, 2024

> The model complains that weights are missing in the instance mask segmentation branch.
>
> Maybe you loaded the wrong checkpoint? If you want to use the mask branch, you may want to load the ViT-L model trained on LVIS instead of the one trained on VOC.

I tried using the ViT-L model trained on LVIS. Did I do it correctly? Is the path the one located in the open-vocabulary folder?

def main(
        config_file="configs/few-shot-voc/10shot/vitl_3s.yaml",
        rpn_config_file="configs/VOC_RPN/faster_rcnn_R_50_C4.few_shot_s3.yaml",
        model_path="weights/trained/open-vocabulary/lvis/vitl_0069999.pth", 
        image_dir='demo/input',
        output_dir='demo/output', 
        category_space="demo/ycb_prototypes.pth",
        device='cpu',
        overlapping_mode=True,
        topk=1,
        output_pth=False,
        threshold=0.45
    ):  

But I still got the error like "offline_backbone.bottom_up.res5.2.conv2.norm.running_var", "offline_backbone.bottom_up.res5.2.conv2.norm.num_batches_tracked", "offline_backbone.bottom_up.res5.2.conv3.weight", "offline_backbone.bottom_up.res5.2.conv3.norm.weight", "offline_backbone.bottom_up.res5.2.conv3.norm.bias", "offline_backbone.bottom_up.res5.2.conv3.norm.running_mean", "offline_backbone.bottom_up.res5.2.conv3.norm.running_var", "offline_backbone.bottom_up.res5.2.conv3.norm.num_batches_tracked", "offline_proposal_generator.rpn_head.conv.conv0.weight", "offline_proposal_generator.rpn_head.conv.conv0.bias", "offline_proposal_generator.rpn_head.conv.conv1.weight", "offline_proposal_generator.rpn_head.conv.conv1.bias", "per_cls_cnn.main_layers.3.0.weight", "per_cls_cnn.main_layers.3.0.bias", "per_cls_cnn.main_layers.3.1.weight", "per_cls_cnn.main_layers.3.1.bias", "per_cls_cnn.main_layers.3.1.running_mean", "per_cls_cnn.main_layers.3.1.running_var", "per_cls_cnn.main_layers.3.1.num_batches_tracked", "per_cls_cnn.main_layers.4.0.weight", "per_cls_cnn.main_layers.4.0.bias", "per_cls_cnn.main_layers.4.1.weight", "per_cls_cnn.main_layers.4.1.bias", "per_cls_cnn.main_layers.4.1.running_mean", "per_cls_cnn.main_layers.4.1.running_var", "per_cls_cnn.main_layers.4.1.num_batches_tracked", "per_cls_cnn.mask_layers.3.weight", "per_cls_cnn.mask_layers.3.bias", "per_cls_cnn.mask_layers.4.weight", "per_cls_cnn.mask_layers.4.bias", "bg_cnn.main_layers.3.0.weight", "bg_cnn.main_layers.3.0.bias", "bg_cnn.main_layers.3.1.weight", "bg_cnn.main_layers.3.1.bias", "bg_cnn.main_layers.3.1.running_mean", "bg_cnn.main_layers.3.1.running_var", "bg_cnn.main_layers.3.1.num_batches_tracked", "bg_cnn.main_layers.4.0.weight", "bg_cnn.main_layers.4.0.bias", "bg_cnn.main_layers.4.1.weight", "bg_cnn.main_layers.4.1.bias", "bg_cnn.main_layers.4.1.running_mean", "bg_cnn.main_layers.4.1.running_var", "bg_cnn.main_layers.4.1.num_batches_tracked", "bg_cnn.mask_layers.3.weight", "bg_cnn.mask_layers.3.bias", "bg_cnn.mask_layers.4.weight", "bg_cnn.mask_layers.4.bias".
size mismatch for train_class_weight: copying a param with shape torch.Size([866, 1024]) from checkpoint, the shape in current model is torch.Size([15, 1024]).
size mismatch for test_class_weight: copying a param with shape torch.Size([1203, 1024]) from checkpoint, the shape in current model is torch.Size([20, 1024]).
size mismatch for offline_proposal_generator.rpn_head.objectness_logits.weight: copying a param with shape torch.Size([3, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([15, 1024, 1, 1]).
size mismatch for offline_proposal_generator.rpn_head.objectness_logits.bias: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([15]).
size mismatch for offline_proposal_generator.rpn_head.anchor_deltas.weight: copying a param with shape torch.Size([12, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([60, 1024, 1, 1]).
size mismatch for offline_proposal_generator.rpn_head.anchor_deltas.bias: copying a param with shape torch.Size([12]) from checkpoint, the shape in current model is torch.Size([60]).

@mlzxy
Owner

mlzxy commented Nov 26, 2024

If you want to use the LVIS model, you need to reset the config and rpn_config files back to their original values.

I think the part that you need to change is the category_space. You can inspect the format of demo/ycb_prototypes.pth and then create a new one for Pascal VOC from the VOC prototypes, for example:

  • weights/initial/few-shot-voc/prototypes/pascal_voc_train_split_2.vitl14.bbox.p10.sk.pkl
  • weights/initial/few-shot-voc/prototypes/voc_2007_trainval_novel2_1shot.vitl14.aug.bbox.p10.sk.pkl

You may need to change the VOC split and number of shots.
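
As a rough sketch of that step (the field names inside the VOC prototype .pkl are an assumption; print the loaded object first and adjust accordingly), you could build a category space in the same 'label_names' + 'prototypes' format that the demo reads from demo/ycb_prototypes.pth:

import pickle
import torch

# Rough sketch: build a VOC category space in the format the demo expects.
src = "weights/initial/few-shot-voc/prototypes/pascal_voc_train_split_2.vitl14.bbox.p10.sk.pkl"
try:
    proto = torch.load(src, map_location="cpu")   # works if the file was saved with torch.save
except Exception:
    with open(src, "rb") as f:                    # fall back to a plain pickle
        proto = pickle.load(f)
print(type(proto), list(proto.keys()) if isinstance(proto, dict) else None)  # inspect the layout

category_space = {
    "label_names": proto["label_names"],          # assumed field name
    "prototypes": proto["prototypes"],            # assumed field name: [num_classes, dim] tensor
}
torch.save(category_space, "demo/voc_prototypes.pth")  # then pass this path as category_space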

@bigz321
Author

bigz321 commented Nov 26, 2024

> If you want to use the LVIS model, you need to reset the config and rpn_config files back to their original values.
>
> I think the part that you need to change is the category_space. You can inspect the format of demo/ycb_prototypes.pth and then create a new one for Pascal VOC from the VOC prototypes, for example:
>
>   • weights/initial/few-shot-voc/prototypes/pascal_voc_train_split_2.vitl14.bbox.p10.sk.pkl
>   • weights/initial/few-shot-voc/prototypes/voc_2007_trainval_novel2_1shot.vitl14.aug.bbox.p10.sk.pkl
>
> You may need to change the VOC split and number of shots.

Thank you for your reply! I understand that I need to update the category_space, but I'm not at that step yet. Right now, I'm stuck on loading the model. So do you mean my config_file, rpn_config_file, and model_path should look like the following? I'm not sure how to properly combine the config_file, rpn_config_file, and model_path for the inference setup.

def main(
        config_file="configs/few-shot-voc/10shot/vitl_3s.yaml",
        rpn_config_file="configs/VOC_RPN/faster_rcnn_R_50_C4.few_shot_s3.yaml",
        model_path="weights/initial/few-shot-voc/prototypes/pascal_voc_train_split_3.vitl14.bbox.p10.sk.pkl",  
        image_dir='demo/input',
        output_dir='demo/output', 
        category_space="demo/ycb_prototypes.pth",
        device='cpu',
        overlapping_mode=True,
        topk=1,
        output_pth=False,
        threshold=0.45
    ): 

Basically I am stuck on the following code:

    model.load_state_dict(torch.load(model_path, map_location=device)['model'])
    model.eval()
    model = model.to(device)

    if category_space is not None:
        category_space = torch.load(category_space)
        model.label_names = category_space['label_names']
        model.test_class_weight = category_space['prototypes'].to(device)

Specifically, I am stuck on this line: model.load_state_dict(torch.load(model_path, map_location=device)['model'])

I got an error like the following:

Traceback (most recent call last):
File "/root/src/devit/./demo/demo.py", line 288, in <module>
fire.Fire(main)
File "/usr/local/lib/python3.9/site-packages/fire/core.py", line 135, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/usr/local/lib/python3.9/site-packages/fire/core.py", line 468, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/usr/local/lib/python3.9/site-packages/fire/core.py", line 684, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/root/src/devit/./demo/demo.py", line 212, in main
model.load_state_dict(torch.load(model_path, map_location=device)['model'])
KeyError: 'model'

@mlzxy
Owner

mlzxy commented Nov 26, 2024

No, I mean

# these three arguments shall be set to the original values
config_file="configs/open-vocabulary/lvis/vitl.yaml", 
rpn_config_file="configs/RPN/mask_rcnn_R_50_FPN_1x.yaml",
model_path="weights/trained/open-vocabulary/lvis/vitl_0069999.pth", 

# change this to prototypes of Pascal VOC classes
category_space="demo/ycb_prototypes.pth",
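
As a side note, the KeyError: 'model' above is what you get when model_path does not point to a training checkpoint: the prototypes .pkl has no 'model' entry. A quick sketch (assuming only torch and the model_path variable) to check what a file actually contains:

import torch

# Diagnostic sketch: print the top-level structure of whatever model_path points to.
obj = torch.load(model_path, map_location="cpu")
if isinstance(obj, dict):
    print("top-level keys:", list(obj.keys()))   # a training checkpoint should contain 'model'
else:
    print("loaded object of type:", type(obj))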

@bigz321
Author

bigz321 commented Nov 27, 2024

> No, I mean
>
> # these three arguments shall be set to the original values
> config_file="configs/open-vocabulary/lvis/vitl.yaml", 
> rpn_config_file="configs/RPN/mask_rcnn_R_50_FPN_1x.yaml",
> model_path="weights/trained/open-vocabulary/lvis/vitl_0069999.pth", 
>
> # change this to prototypes of Pascal VOC classes
> category_space="demo/ycb_prototypes.pth",

Ok, thank you for your suggestions. But if I want to use a COCO configuration file, how should I specify the config file path? For example, my current setup looks like the following:

def main(
        config_file="configs/open-vocabulary/coco/vitl.yaml",
        rpn_config_file="configs/RPN/mask_rcnn_R_50_FPN_1x.yaml",
        model_path="weights/trained/open-vocabulary/coco/vitl_0064999.pth",

        image_dir='demo/input',
        output_dir='demo/output', 
        category_space="demo/ycb_prototypes.pth",
        device='cpu',
        overlapping_mode=True,
        topk=1,
        output_pth=False,
        threshold=0.45
    ): 

But these settings seem incorrect. How should I set things up properly? Which checkpoint should I use, and which RPN configuration file is needed to work with the COCO config file, if I want to use this one:

config_file="configs/open-vocabulary/coco/vitl.yaml"

@mlzxy
Owner

mlzxy commented Nov 27, 2024

You can try replacing the mask_rcnn_R_50_FPN_1x.yaml with the mask_rcnn_R_50_C4_1x_ovd_FSD.yaml.

@bigz321
Author

bigz321 commented Nov 27, 2024

> You can try replacing the mask_rcnn_R_50_FPN_1x.yaml with the mask_rcnn_R_50_C4_1x_ovd_FSD.yaml.

Hello, I tried replacing the RPN config file mask_rcnn_R_50_FPN_1x.yaml with mask_rcnn_R_50_C4_1x_ovd_FSD.yaml. Does the code look correct? It now looks like the following:

def main(
        config_file="configs/open-vocabulary/coco/vitl.yaml",
        rpn_config_file="configs/RPN/mask_rcnn_R_50_C4_1x_ovd_FSD.yaml",
        model_path="weights/trained/open-vocabulary/coco/vitl_0064999.pth",

        image_dir='demo/input',
        output_dir='demo/output', 
        category_space="demo/ycb_prototypes.pth",
        device='cpu',
        overlapping_mode=True,
        topk=1,
        output_pth=False,
        threshold=0.45
    ):

But I got the following error: RuntimeError: Error(s) in loading state_dict for OpenSetDetectorWithExamples:
Missing key(s) in state_dict: "mask_intra_dist_emb.weight", "mask_intra_dist_emb.bias", "mask_bg_dist_emb.weight", "mask_bg_dist_emb.bias", "mask_feat_compress.0.weight", "mask_feat_compress.0.bias", "mask_feat_compress.1.weight", "mask_feat_compress.1.bias", "mask_feat_compress.2.weight", "mask_feat_compress.2.bias", "fc_init_mask.weight", "fc_init_mask.bias", "mp_layers.0.0.weight", "mp_layers.0.0.bias", "mp_layers.0.1.weight", "mp_layers.0.1.bias", "mp_layers.1.0.weight", "mp_layers.1.0.bias", "mp_layers.1.1.weight", "mp_layers.1.1.bias", "mp_layers.2.0.weight", "mp_layers.2.0.bias", "mp_layers.2.1.weight", "mp_layers.2.1.bias", "mp_layers.3.0.weight", "mp_layers.3.0.bias", "mp_layers.3.1.weight", "mp_layers.3.1.bias", "mp_layers.4.0.weight", "mp_layers.4.0.bias", "mp_layers.4.1.weight", "mp_layers.4.1.bias", "mp_out_layers.0.weight", "mp_out_layers.0.bias", "mp_out_layers.1.weight", "mp_out_layers.1.bias", "mp_out_layers.2.weight", "mp_out_layers.2.bias", "mp_out_layers.3.weight", "mp_out_layers.3.bias", "mp_out_layers.4.weight", "mp_out_layers.4.bias", "mask_deconv.0.weight", "mask_deconv.0.bias", "mask_deconv.1.weight", "mask_deconv.1.bias", "mask_predictor.weight", "mask_predictor.bias". Do you have any idea about this?

@mlzxy
Owner

mlzxy commented Nov 27, 2024

The COCO model does not have the instance mask branch. You need to turn it off at this line:

config.MODEL.MASK_ON = True
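
A minimal sketch of that change, assuming `config` is the same cfg object shown above (built in demo/demo.py before the model is constructed):

# Sketch only: the COCO checkpoint carries no mask-branch weights, so disable the
# mask head before the model is built and the checkpoint is loaded.
config.MODEL.MASK_ON = False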

@bigz321
Author

bigz321 commented Nov 27, 2024

> config.MODEL.MASK_ON

Thank you very much! It works!
