
generate_net() issue #1

Open
Carlisle-Liu opened this issue Jan 17, 2021 · 5 comments
@Carlisle-Liu

Hi,

I cannot generate the model for either the deeplabv3+ or the SEAM experiment. For SEAM's deeplabv1-ResNet38d, it always fails with the error message below: mxnet raises "get_last_ffi_error()" and then looks for the parameters at "/home1/wangyude/project/SEAM/models/ilsvrc-cls_rna-a1_cls1000_ep-0001.params", even though I have set my own path to the model parameters in the config. Can you please help me with this?

/students/u6617221/Models/semantic-segmentation-codebase/ilsvrc-cls_rna-a1_cls1000_ep-0001.params False
Traceback (most recent call last):
File "train.py", line 162, in
train_net()
File "train.py", line 53, in train_net
net = generate_net(cfg, batchnorm=nn.BatchNorm2d)
File "/students/u6617221/Models/semantic-segmentation-codebase/lib/net/generateNet.py", line 15, in generate_net
net = NETS.get(cfg.MODEL_NAME)(cfg, **kwargs)
File "/students/u6617221/Models/semantic-segmentation-codebase/lib/net/deeplabv1.py", line 20, in init
self.backbone = build_backbone(self.cfg.MODEL_BACKBONE, pretrained=cfg.MODEL_BACKBONE_PRETRAIN, norm_layer=self.batchnorm, **kwargs)
File "/students/u6617221/Models/semantic-segmentation-codebase/lib/net/backbone/builder.py", line 8, in build_backbone
net = BACKBONES.get(backbone_name)(pretrained=pretrained, **kwargs)
File "/students/u6617221/Models/semantic-segmentation-codebase/lib/net/backbone/resnet38d.py", line 270, in resnet38
weight_dict = convert_mxnet_to_torch(model_url)
File "/students/u6617221/Models/semantic-segmentation-codebase/lib/net/backbone/resnet38d.py", line 219, in convert_mxnet_to_torch
save_dict = mxnet.nd.load(filename)
File "/students/u6617221/.local/lib/python3.6/site-packages/mxnet/ndarray/utils.py", line 175, in load
ctypes.byref(names)))
File "/students/u6617221/.local/lib/python3.6/site-packages/mxnet/base.py", line 246, in check_call
raise get_last_ffi_error()
mxnet.base.MXNetError: Traceback (most recent call last):
[bt] (2) /students/u6617221/.local/lib/python3.6/site-packages/mxnet/libmxnet.so(MXNDArrayLoad+0x222) [0x7f46b0114282]
[bt] (1) /students/u6617221/.local/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x78dd01a) [0x7f46b13fa01a]
[bt] (0) /students/u6617221/.local/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x78e5521) [0x7f46b1402521]
File "src/io/local_filesys.cc", line 209
LocalFileSystem: Check failed: allow_null: :Open "/home1/wangyude/project/SEAM/models/ilsvrc-cls_rna-a1_cls1000_ep-0001.params": No such file or directory

@YudeWang (Owner)

@Carlisle-Liu
Replace the pretrained model path here:

model_url='/home1/wangyude/project/SEAM/models/ilsvrc-cls_rna-a1_cls1000_ep-0001.params'
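For example, if you saved the file at the path printed just before your traceback, the edit in lib/net/backbone/resnet38d.py would look roughly like this (a sketch; substitute wherever your own copy of the .params file lives):

# lib/net/backbone/resnet38d.py, inside resnet38() -- point model_url at your local copy
model_url = '/students/u6617221/Models/semantic-segmentation-codebase/ilsvrc-cls_rna-a1_cls1000_ep-0001.params'
weight_dict = convert_mxnet_to_torch(model_url)  # unchanged: converts the MXNet .params file into a torch state dict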

@Carlisle-Liu (Author)

@YudeWang
Thank you for the reply.

After changing the weight path, it raises "_pickle.UnpicklingError: invalid load key, '\x12'." as shown below. I downloaded the weights "ilsvrc-cls_rna-a1_cls1000_ep-0001.params" from your SEAM repository (https://github.com/YudeWang/SEAM). Does this mean the model and the weights do not match?

Traceback (most recent call last):
File "train.py", line 162, in
train_net()
File "train.py", line 55, in train_net
net.load_state_dict(torch.load(cfg.TRAIN_CKPT),strict=True)
File "/students/u6617221/.local/lib/python3.6/site-packages/torch/serialization.py", line 595, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/students/u6617221/.local/lib/python3.6/site-packages/torch/serialization.py", line 764, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '\x12'.

@YudeWang (Owner)

@Carlisle-Liu
Load the pretrained model through the backbone (the model_url above) instead, and leave TRAIN_CKPT=None. TRAIN_CKPT is only used for finetuning or for recovering from an unexpected interruption.
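In the experiment's config.py this corresponds roughly to the following entries (a sketch; MODEL_BACKBONE_PRETRAIN is the flag passed to the backbone in deeplabv1.py, as the first traceback shows):

	'MODEL_BACKBONE_PRETRAIN': True,   # build the backbone from the converted ImageNet .params weights
	'TRAIN_CKPT': None,                # only set a .pth checkpoint here to finetune or resume training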

@Carlisle-Liu (Author)

For the test, the "config.py" file in "seamv1-pseudovoc" is missing the following attributes. Should I copy them from the "config.py" file in "deeplabv3+voc"?

	'MODEL_BACKBONE_DILATED': True,
	'MODEL_BACKBONE_MULTIGRID': False,
	'MODEL_BACKBONE_DEEPBASE': True,

Besides, it also lacks the 'DATA_FEATURE_DIR' attribute, which is required by "BaseDataset.py" as shown below. This attribute is also absent from the "config.py" file in "deeplabv3+voc". How should I define it?

net initialize
start loading model /students/u6617221/Models/semantic-segmentation-codebase/model/seamv1-pseudovoc/deeplabv1_resnet38_VOCDataset_itr20000_all.pth
Use 4 GPU
0%| | 0/1449 [00:00<?, ?it/s]
Traceback (most recent call last):
File "test.py", line 109, in
test_net()
File "test.py", line 103, in test_net
result_list = single_gpu_test(net, dataloader, prepare_func=prepare_func, inference_func=inference_func, collect_func=collect_func, save_step_func=save_step_func)
File "/students/u6617221/Models/semantic-segmentation-codebase/lib/utils/test_utils.py", line 13, in single_gpu_test
for i_batch, sample in enumerate(dataloader):
File "/students/u6617221/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 435, in next
data = self._next_data()
File "/students/u6617221/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1085, in _next_data
return self._process_data(data)
File "/students/u6617221/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
data.reraise()
File "/students/u6617221/.local/lib/python3.6/site-packages/torch/_utils.py", line 428, in reraise
raise self.exc_type(msg)
AttributeError: Caught AttributeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/students/u6617221/.local/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
data = fetcher.fetch(index)
File "/students/u6617221/.local/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/students/u6617221/.local/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/students/u6617221/Models/semantic-segmentation-codebase/lib/datasets/BaseDataset.py", line 46, in getitem
sample = self.sample_generate(idx)
File "/students/u6617221/Models/semantic-segmentation-codebase/lib/datasets/BaseDataset.py", line 73, in sample_generate
if self.transform == 'none' and self.cfg.DATA_FEATURE_DIR:
AttributeError: 'Configuration' object has no attribute 'DATA_FEATURE_DIR'

@YudeWang (Owner)

@Carlisle-Liu

'MODEL_BACKBONE_DILATED': True,
'MODEL_BACKBONE_MULTIGRID': False,
'MODEL_BACKBONE_DEEPBASE': True,

These configs are not used by the ResNet38 backbone, so you can leave them out. And set 'DATA_FEATURE_DIR': None, in config.py.
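So for the seamv1-pseudovoc test config, the only addition needed is roughly (a sketch; the comment reflects the check in BaseDataset.py shown in the traceback above):

	'DATA_FEATURE_DIR': None,   # BaseDataset only reads this when transform == 'none', so None leaves it disabled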
