
An error occurred while creating data: TypeError: expected dtype object, got 'numpy.dtype[float64]' #499

Closed
shliang0603 opened this issue Apr 28, 2021 · 15 comments

@shliang0603

@Tai-Wang Hi, I have organized the data as required, but an error was reported when creating the data, and I am not sure what the cause is.

(open-mmlab) shl@zhihui-mint:~/shl_res/MMlab/mmdetection3d$ python tools/create_data.py kitti --root-path ./data/kitti --out-dir ./data/kitti --extra-tag kitti
Generate info. this may take several minutes.
[                                                  ] 0/3712, elapsed: 0s, ETA:Traceback (most recent call last):
  File "tools/create_data.py", line 243, in <module>
    out_dir=args.out_dir)
  File "tools/create_data.py", line 23, in kitti_data_prep
    kitti.create_kitti_info_file(root_path, info_prefix)
  File "/home/shl/shl_res/MMlab/mmdetection3d/tools/data_converter/kitti_converter.py", line 117, in create_kitti_info_file
    _calculate_num_points_in_gt(data_path, kitti_infos_train, relative_path)
  File "/home/shl/shl_res/MMlab/mmdetection3d/tools/data_converter/kitti_converter.py", line 65, in _calculate_num_points_in_gt
    points_v, rect, Trv2c, P2, image_info['image_shape'])
  File "/home/shl/shl_res/MMlab/mmdetection3d/mmdet3d/core/bbox/box_np_ops.py", line 639, in remove_outside_points
    frustum_surfaces = corner_to_surfaces_3d_jit(frustum[np.newaxis, ...])
TypeError: expected dtype object, got 'numpy.dtype[float64]'

Below are the versions of the library packages I installed.

(open-mmlab) shl@zhihui-mint:~/shl_res/MMlab/mmdetection3d$ conda list mmcv
# packages in environment at /home/shl/anaconda3/envs/open-mmlab:
#
# Name                    Version                   Build  Channel
mmcv-full                 1.3.1                    pypi_0    pypi
(open-mmlab) shl@zhihui-mint:~/shl_res/MMlab/mmdetection3d$ conda list mmdet
# packages in environment at /home/shl/anaconda3/envs/open-mmlab:
#
# Name                    Version                   Build  Channel
mmdet                     2.11.0                    dev_0    <develop>
mmdet3d                   0.12.0                    dev_0    <develop>
(open-mmlab) shl@zhihui-mint:~/shl_res/MMlab/mmdetection3d$ 
@Tai-Wang
Member

Please refer to #465 and FAQ.

@Tai-Wang Tai-Wang added the usage label Apr 28, 2021
@shliang0603
Author

@Tai-Wang Thank you very much for your prompt reply. I found that this error may be caused by a mismatch between the numba and numpy versions, so I tried installing other numpy versions (1.16.5, 1.17.0, 1.18.0, 1.19.0), but each produced a new error: ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject. I re-ran python setup.py develop and recompiled mmcv and the other libraries, but I still get the following error:

(open-mmlab) shl@zhihui-mint:~/shl_res/MMlab/mmdetection3d$ python tools/create_data.py kitti --root-path ./data/kitti --out-dir ./data/kitti --extra-tag kitti
Traceback (most recent call last):
  File "tools/create_data.py", line 5, in <module>
    from tools.data_converter import kitti_converter as kitti
  File "/home/shl/shl_res/MMlab/mmdetection3d/tools/data_converter/kitti_converter.py", line 7, in <module>
    from mmdet3d.core.bbox import box_np_ops
  File "/home/shl/shl_res/MMlab/mmdetection3d/mmdet3d/core/__init__.py", line 1, in <module>
    from .anchor import *  # noqa: F401, F403
  File "/home/shl/shl_res/MMlab/mmdetection3d/mmdet3d/core/anchor/__init__.py", line 1, in <module>
    from mmdet.core.anchor import build_anchor_generator
  File "/home/shl/shl_res/MMlab/mmdetection/mmdet/core/__init__.py", line 5, in <module>
    from .mask import *  # noqa: F401, F403
  File "/home/shl/shl_res/MMlab/mmdetection/mmdet/core/mask/__init__.py", line 2, in <module>
    from .structures import BaseInstanceMasks, BitmapMasks, PolygonMasks
  File "/home/shl/shl_res/MMlab/mmdetection/mmdet/core/mask/structures.py", line 6, in <module>
    import pycocotools.mask as maskUtils
  File "/home/shl/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/pycocotools-2.0.2-py3.7-linux-x86_64.egg/pycocotools/mask.py", line 3, in <module>
    import pycocotools._mask as _mask
  File "pycocotools/_mask.pyx", line 1, in init pycocotools._mask
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject
(open-mmlab) shl@zhihui-mint:~/shl_res/MMlab/mmdetection3d$ 

@Tai-Wang
Member

Tai-Wang commented Apr 28, 2021

You need to remove the previous build directory, and possibly the previously built ops, then rebuild all the packages. If it is convenient, you can simply remove the original package folder and git clone a fresh one; that should be clean enough.
BTW, #425 may also be helpful for you.
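The clean-rebuild steps above might look like the following; the paths are illustrative (taken from the session in this thread), so adjust them to your own checkout:

```shell
# Remove stale build artifacts before rebuilding (paths are illustrative).
cd ~/shl_res/MMlab/mmdetection3d
rm -rf build/ mmdet3d.egg-info/     # previous build directory and install metadata
find mmdet3d -name "*.so" -delete   # previously built compiled ops
python setup.py develop             # rebuild in develop mode

# Or, as suggested above, start from a completely fresh clone instead:
# git clone https://github.com/open-mmlab/mmdetection3d.git
```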

@shliang0603
Author

Thank you very much, I have solved this problem. I deleted the conda virtual environment I had built, created a new one, and installed numpy==1.18.0 in it first. I suggest pinning the numpy version in mmdetection, because by default the latest version is installed.

@Tai-Wang Tai-Wang reopened this Apr 28, 2021
@Tai-Wang
Member

Tai-Wang commented Apr 28, 2021

Thanks for your kind suggestion. This compatibility issue has been resolved in the latest master of all the packages, so if you use the latest mmdet3d, numpy, and numba, everything should be OK.

@Tai-Wang Tai-Wang self-assigned this Apr 28, 2021
@Wuziyi616
Contributor

We may need to be cautious about that, because MMDet3D may not rely on MMDet 2.12.0 in our next release. I still suggest installing numpy<1.20.0 first to avoid such issues.
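Applying the version bound suggested above before installing the rest of the stack could look like this (the bound is from the comment above; the exact pin is up to you):

```shell
# Install a compatible numpy first, so later packages build against it.
pip install "numpy<1.20.0"

# Verify which version ended up in the environment.
python -c "import numpy; print(numpy.__version__)"
```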

@shliang0603
Author

@Tai-Wang @Wuziyi616 Hi, I reinstalled the whole environment with the latest MMDetection and MMDetection3D code, but during training I got an error: TypeError: init_weights() takes 1 positional argument but 2 were given

(mmlab) shl@zhihui-mint:~/shl_res/mmlab/mmdetection3d$ conda list mmcv
# packages in environment at /home/shl/anaconda3/envs/mmlab:
#
# Name                    Version                   Build  Channel
mmcv-full                 1.3.2                    pypi_0    pypi
(mmlab) shl@zhihui-mint:~/shl_res/mmlab/mmdetection3d$ conda list mmdet
# packages in environment at /home/shl/anaconda3/envs/mmlab:
#
# Name                    Version                   Build  Channel
mmdet                     2.11.0                    dev_0    <develop>
mmdet3d                   0.12.0                    dev_0    <develop>
(mmlab) shl@zhihui-mint:~/shl_res/mmlab/mmdetection3d$ conda list torch
# packages in environment at /home/shl/anaconda3/envs/mmlab:
#
# Name                    Version                   Build  Channel
torch                     1.7.0                    pypi_0    pypi
torchvision               0.8.0                    pypi_0    pypi
(mmlab) shl@zhihui-mint:~/shl_res/mmlab/mmdetection3d$ 

Training command:

python tools/train.py configs/pointpillars/hv_pointpillars_secfpn_6x8_160e_kitti-3d-3class.py

The following is the error message from training. I don't know whether it is caused by the library package versions:

workflow = [('train', 1)]
gpu_ids = range(0, 1)

2021-04-28 20:32:03,352 - mmdet - INFO - Set random seed to 0, deterministic: False
Traceback (most recent call last):
  File "/home/shl/anaconda3/envs/mmlab/lib/python3.7/site-packages/mmcv/utils/registry.py", line 51, in build_from_cfg
    return obj_cls(**args)
  File "/home/shl/shl_res/mmlab/mmdetection3d/mmdet3d/models/detectors/voxelnet.py", line 32, in __init__
    pretrained=pretrained,
  File "/home/shl/shl_res/mmlab/mmdetection3d/mmdet3d/models/detectors/single_stage.py", line 41, in __init__
    self.init_weights(pretrained=pretrained)
  File "/home/shl/shl_res/mmlab/mmdetection3d/mmdet3d/models/detectors/single_stage.py", line 45, in init_weights
    super(SingleStage3DDetector, self).init_weights(pretrained)
TypeError: init_weights() takes 1 positional argument but 2 were given

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "tools/train.py", line 212, in <module>
    main()
  File "tools/train.py", line 176, in main
    test_cfg=cfg.get('test_cfg'))
  File "/home/shl/shl_res/mmlab/mmdetection3d/mmdet3d/models/builder.py", line 56, in build_detector
    cfg, default_args=dict(train_cfg=train_cfg, test_cfg=test_cfg))
  File "/home/shl/anaconda3/envs/mmlab/lib/python3.7/site-packages/mmcv/utils/registry.py", line 210, in build
    return self.build_func(*args, **kwargs, registry=self)
  File "/home/shl/anaconda3/envs/mmlab/lib/python3.7/site-packages/mmcv/cnn/builder.py", line 26, in build_model_from_cfg
    return build_from_cfg(cfg, registry, default_args)
  File "/home/shl/anaconda3/envs/mmlab/lib/python3.7/site-packages/mmcv/utils/registry.py", line 54, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
TypeError: VoxelNet: init_weights() takes 1 positional argument but 2 were given

@Wuziyi616
Contributor

@shliang0603 Yes, this is an emergent bug caused by the newly updated MMDet. They refactored the model's init_weights function, so all our 3D detectors that inherit from them raise an error when calling init_weights. The workaround for now is to install a previous version of MMDet (e.g. v2.11.0, their release from last month) and then install our MMDet3D. For example, you can run

pip install mmdet==2.11.0

and then install mmdet3d. Please report if any problems remain, because we are working on an emergency fix now. Sorry for the inconvenience!
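Put together, the workaround described above might look like the following; the checkout path is illustrative (taken from the session in this thread):

```shell
# Pin MMDet to a pre-refactor release, then (re)install MMDet3D from source.
pip install mmdet==2.11.0
cd ~/shl_res/mmlab/mmdetection3d   # your local MMDet3D checkout
pip install -v -e .                # or: python setup.py develop
```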

@Wuziyi616
Contributor

Also remember to install the newest MMDet3D repo, because our fix PR was merged just 10 minutes ago!

@shliang0603
Author

shliang0603 commented Apr 29, 2021

@Wuziyi616 Hi, thank you very much for your help; I have solved the above problem. However, training ran out of GPU memory, so I tried to modify the batch size, but I could not find how to change it in the training options or configuration files. The benchmark document you provide lists the hardware as 8 NVIDIA Tesla V100 (32G) GPUs, but my GPU is an NVIDIA 1080 with only 8 GB of memory. So I want to confirm a few things with you:

  • First, how do I modify the batch size?
  • How much GPU memory does the PointPillars model require at minimum?
  • Is there a smaller model that can run on my NVIDIA 1080?

Training command:

python tools/train.py configs/pointpillars/hv_pointpillars_secfpn_6x8_160e_kitti-3d-3class.py

2021-04-29 10:07:13,961 - mmdet - INFO - load 337 Misc database infos
2021-04-29 10:07:13,961 - mmdet - INFO - load 56 Person_sitting database infos
2021-04-29 10:07:30,144 - mmdet - INFO - Start running, host: shl@zhihui-mint, work_dir: /home/shl/shl_res/mmlab/mmdetection3d/work_dirs/hv_pointpillars_secfpn_6x8_160e_kitti-3d-3class
2021-04-29 10:07:30,144 - mmdet - INFO - workflow: [('train', 1)], max: 80 epochs
Traceback (most recent call last):
  File "tools/train.py", line 212, in <module>
    main()
  File "tools/train.py", line 208, in main
    meta=meta)
  File "/home/shl/anaconda3/envs/mmlab/lib/python3.7/site-packages/mmdet/apis/train.py", line 170, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/home/shl/anaconda3/envs/mmlab/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 125, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/shl/anaconda3/envs/mmlab/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 51, in train
    self.call_hook('after_train_iter')
  File "/home/shl/anaconda3/envs/mmlab/lib/python3.7/site-packages/mmcv/runner/base_runner.py", line 307, in call_hook
    getattr(hook, fn_name)(self)
  File "/home/shl/anaconda3/envs/mmlab/lib/python3.7/site-packages/mmcv/runner/hooks/optimizer.py", line 35, in after_train_iter
    runner.outputs['loss'].backward()
  File "/home/shl/anaconda3/envs/mmlab/lib/python3.7/site-packages/torch/tensor.py", line 221, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/shl/anaconda3/envs/mmlab/lib/python3.7/site-packages/torch/autograd/__init__.py", line 132, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: CUDA out of memory. Tried to allocate 472.00 MiB (GPU 0; 7.93 GiB total capacity; 5.03 GiB already allocated; 405.12 MiB free; 5.71 GiB reserved in total by PyTorch)
(mmlab) shl@zhihui-mint:~/shl_res/mmlab/mmdetection3d$ ls

@Wuziyi616
Contributor

Yes, you can modify the batch size. As you can see here, the dataset part of your config file is inherited from configs/_base_/datasets/kitti-3d-3class.py, and the batch size is set on this line. Because of config inheritance, you can simply change the batch size by adding one line, samples_per_gpu=xxx, here; it will overwrite the value set previously. I highly recommend you have a look at this document about configs; it will help you better understand our parameters and config names.
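The override behavior described above can be sketched with a plain-Python stand-in for the config merge (the real mechanism lives in mmcv's Config; the values here are illustrative, not the actual defaults):

```python
def merge(base, override):
    """Recursively merge `override` into a copy of `base`; the child config wins."""
    out = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(out.get(key), dict):
            out[key] = merge(out[key], value)   # merge nested sections key by key
        else:
            out[key] = value                    # child value overwrites the base
    return out

# Stand-in for the inherited dataset config (values illustrative).
base_cfg = {'data': {'samples_per_gpu': 6, 'workers_per_gpu': 4}}

# The child config only needs the key it wants to change.
child_cfg = {'data': {'samples_per_gpu': 2}}

cfg = merge(base_cfg, child_cfg)
print(cfg['data'])  # {'samples_per_gpu': 2, 'workers_per_gpu': 4}
```

Note that only the overridden key changes; sibling keys such as workers_per_gpu are kept from the base file, which is why a one-line override is enough.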

I am not familiar with outdoor LiDAR models; @Tai-Wang, can you kindly take a look at the other two questions? As far as I know, you can look at the README for each model to see its GPU usage during training. For example, this page says that training SECOND uses less than 6 GB per GPU. Maybe you can give it a try!

@Wuziyi616
Contributor

Also, some information about multi-GPU training is here. Most of our models are trained on 4/8 GPUs.

@shliang0603
Author

@Wuziyi616 Thank you very much for your prompt reply. I will try it following your suggestions.

@Tai-Wang
Member

For your case, I think you can simply reduce samples_per_gpu to 2~4 and give it a try. Please also remember to reduce the learning rate proportionally to keep the two consistent.
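"Reduce the learning rate proportionally" is the linear scaling rule; a small sketch of the arithmetic, with all numbers illustrative rather than taken from the actual config:

```python
def scaled_lr(base_lr, base_batch, new_batch):
    """Scale the learning rate linearly with the total (effective) batch size."""
    return base_lr * new_batch / base_batch

# Hypothetical reference setup: 8 GPUs x 6 samples_per_gpu (the "6x8" in the
# config name) = 48 total samples per step, at an assumed base lr of 0.001.
base_lr = 0.001
base_batch = 8 * 6

# Single NVIDIA 1080 with samples_per_gpu=2 -> total batch of 2.
new_lr = scaled_lr(base_lr, base_batch, 1 * 2)
print(f"{new_lr:.2e}")  # → 4.17e-05
```

Whatever base values your config actually uses, the ratio new_batch / base_batch is what matters: halve the total batch size, halve the learning rate.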

@Wuziyi616
Contributor

Feel free to re-open this issue if you have any further questions :)

tpoisonooo pushed a commit to tpoisonooo/mmdetection3d that referenced this issue Sep 5, 2022