Hello! I was trying to run Mask R-CNN, and the following error occurred during training. Here is the traceback:
Traceback (most recent call last):
File "tools/train_mlperf.py", line 367, in <module>
main()
File "tools/train_mlperf.py", line 356, in main
model, success = train(cfg, args.local_rank, args.distributed, random_number_generator)
File "tools/train_mlperf.py", line 248, in train
per_iter_end_callback_fn=per_iter_callback_fn,
File "/workspace/object_detection/maskrcnn_benchmark/engine/trainer.py", line 109, in do_train
loss_dict = model(images, targets)
File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 577, in __call__
result = self.forward(*input, **kwargs)
File "/workspace/object_detection/maskrcnn_benchmark/modeling/detector/generalized_rcnn.py", line 50, in forward
features = self.backbone(images.tensors)
File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 577, in __call__
result = self.forward(*input, **kwargs)
File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/container.py", line 100, in forward
input = module(input)
File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 577, in __call__
result = self.forward(*input, **kwargs)
File "/workspace/object_detection/maskrcnn_benchmark/modeling/backbone/resnet.py", line 152, in forward
x = getattr(self, stage_name)(x)
File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 577, in __call__
result = self.forward(*input, **kwargs)
File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/container.py", line 100, in forward
input = module(input)
File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 577, in __call__
result = self.forward(*input, **kwargs)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
File "/workspace/object_detection/maskrcnn_benchmark/modeling/backbone/resnet.py", line 317, in forward
out = self.conv1(x)
out = self.bn1(out)
~~~~~~~~ <--- HERE
out = F.relu(out)
File "/workspace/object_detection/maskrcnn_benchmark/modeling/backbone/resnet.py", line 316, in forward
identity = x
out = self.conv1(x)
~~~~~~~~~~ <--- HERE
out = self.bn1(out)
out = F.relu(out)
RuntimeError: ModuleAttributeError: 'RecursiveScriptModule' object has no attribute '_conv_forward'
At:
/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py(621): __getattr__
/opt/conda/lib/python3.6/site-packages/torch/jit/__init__.py(1617): __getattr__
/opt/conda/lib/python3.6/site-packages/torch/jit/__init__.py(1836): __getattr__
/opt/conda/lib/python3.6/site-packages/torch/nn/modules/conv.py(377): forward
/workspace/object_detection/maskrcnn_benchmark/layers/misc.py(34): forward
/opt/conda/lib/python3.6/site-packages/torch/jit/_recursive.py(624): lazy_binding_method
/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py(577): __call__
/opt/conda/lib/python3.6/site-packages/torch/nn/modules/container.py(100): forward
/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py(577): __call__
/workspace/object_detection/maskrcnn_benchmark/modeling/backbone/resnet.py(152): forward
/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py(577): __call__
/opt/conda/lib/python3.6/site-packages/torch/nn/modules/container.py(100): forward
/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py(577): __call__
/workspace/object_detection/maskrcnn_benchmark/modeling/detector/generalized_rcnn.py(50): forward
/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py(577): __call__
/workspace/object_detection/maskrcnn_benchmark/engine/trainer.py(109): do_train
tools/train_mlperf.py(248): train
tools/train_mlperf.py(356): main
tools/train_mlperf.py(367): <module>
I am running on an NVIDIA RTX GPU with Ubuntu 16.04 and CUDA 10.2.
Any idea on how to fix this?
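The traceback above does not show which PyTorch version is installed, and that detail often matters for errors raised inside `torch/nn/modules`. A minimal sketch for collecting it is below; the helper name `collect_env` and the list of modules checked are assumptions made for illustration, not part of the repository.

```python
import importlib
import platform


def collect_env(module_names=("torch", "torchvision")):
    """Return a dict of version strings that are useful in a bug report.

    `collect_env` is a hypothetical helper written for this report.
    """
    info = {
        "python": platform.python_version(),
        "os": platform.platform(),
    }
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Record the gap instead of failing, so the report is still usable.
            info[name] = "not installed"
            continue
        info[name] = getattr(module, "__version__", "unknown")
        if name == "torch":
            # torch.version.cuda is the CUDA version PyTorch was built against,
            # which can differ from the system-wide CUDA toolkit.
            info["torch_cuda_build"] = str(getattr(module.version, "cuda", None))
    return info


if __name__ == "__main__":
    for key, value in collect_env().items():
        print(f"{key}: {value}")
```

Running the script inside the same container used for training prints one `name: version` line per entry. PyTorch also ships `python -m torch.utils.collect_env`, which produces a more complete report.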