[Bug] mmengine - WARNING - Failed to search registry with scope "mmdet" in the "Codebases" registry tree #2334
Comments
Hi, there's something wrong with RTMDet-Ins + ONNX Runtime. Will debug and fix later.
Great, thanks!
Hi, the registry warnings can be safely ignored: the modules live in mmdet and other codebases, while the default scope is mmdeploy.
I thought the warnings might be a problem, because changing the code in rtmdet_ins_head.py does not seem to do anything. Do you have any idea how that is possible, if it is not because of the registry warnings? Also, I don't think I am hitting the problem described in issue #2328: I do get correct masks, but only with a batch size of 1. With a batch size larger than one, session.run simply crashes, so I don't get any output masks at all.
@cozeybozey Hi, could you try PR #2343? Batch inference should be fixed for ONNX Runtime.
Thanks for the PR! Unfortunately, I still get the same error with an RTMDet export from your forked repository. I am not sure, though: should I do a completely fresh install in a new conda environment for the PR to work?
Hi, you have to rebuild mmdeploy and rerun.
I see. I am relatively new to building packages like this. Since I am using Linux, I am following this guide: https://github.com/open-mmlab/mmdeploy/blob/main/docs/en/01-how-to-build/linux-x86_64.md But the guide seems intended for Ubuntu users, while I am on Debian. Does that mean I am out of luck? For now I am specifically stuck on this command:
The first line seems to be for Ubuntu users only, but when I run the last three lines I get this:
Hi, you can just use your old environment: after you git pull the new code, run this script to build.
Ah that makes building a lot easier, thanks!
So maybe my inability to install g++-7, as I mentioned in the earlier comment, is still the problem.
You can run
When I run g++ --version I get this:
So I changed g++-7 to g++ in CMakeCache.txt, but then I get the following error:
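For what it's worth, hand-editing CMakeCache.txt is fragile; the usual way to select a compiler is to pass it at configure time. A sketch, assuming a g++-11 toolchain and an existing `build/` directory inside the mmdeploy checkout (paths and options are illustrative, not the thread's exact commands):

```shell
# Select the compiler at configure time instead of editing CMakeCache.txt.
# Remove the stale cache first so the old g++-7 setting is not reused.
cd mmdeploy/build
rm -f CMakeCache.txt
cmake .. \
    -DCMAKE_C_COMPILER=gcc-11 \
    -DCMAKE_CXX_COMPILER=g++-11
make -j"$(nproc)"
```

CMake caches the compiler on the first configure, which is why changing it later requires clearing the cache (or a fresh build directory).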
I fixed the above problem, so now it uses the g++-11 compiler (g++-12 would not work either, because it is incompatible with my CUDA version). However, after doing that I still had issues with it not recognizing OpenCV, ONNX Runtime, pplcv, and the third-party folders that appear as links on this GitHub page: https://github.com/RunningLeon/mmdeploy/tree/fix_rtmdet_inst/third_party I have now fixed the OpenCV and ONNX Runtime issues (by installing them both), but after building pplcv I still get errors with it, and I haven't tried installing the third-party packages yet. I was wondering: is it normal that I have to do all this? You said that if I already had a working environment, all I would have to do is run:
But now it seems I am rebuilding and installing the entire environment from scratch, so I think I might be doing something wrong.
Hi, how did you install mmdeploy before? If you have successfully used mmdeploy on your local machine, you should already have a runnable environment; all you need to do is git pull the new code and rebuild mmdeploy.
I followed this guide: https://github.com/open-mmlab/mmdeploy/blob/main/docs/en/get_started.md
Hi, if you only need to do torch2onnx without inference, you can skip the build step and just run
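As a sketch of the general tools/deploy.py invocation (all paths below are placeholders for illustration; the actual config names depend on your setup):

```shell
# Placeholder variables, not exact file names:
# DEPLOY_CFG : an ONNX Runtime deploy config for instance segmentation
# MODEL_CFG  : the mmdet RTMDet-Ins model config
# CHECKPOINT : the trained .pth checkpoint
# IMAGE      : a sample image used for tracing/verification
python tools/deploy.py "${DEPLOY_CFG}" "${MODEL_CFG}" "${CHECKPOINT}" "${IMAGE}" \
    --work-dir work_dir
```

The conversion itself only needs the Python packages (torch, mmdet, mmdeploy), which is why the C++ SDK build can be skipped when no deployed-backend inference is required.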
Thanks for all the help! I finally got it working, and I can confirm that this PR indeed fixes the batch inference issue with RTMDet.
Don't worry about it. Those are constant tensors.
Maybe I misunderstand, but if they are not connected to anything, then they don't do anything, right? Also, I am having a bit of a memory problem: it crashes when I use a batch size of 12 or larger. I found this strange, since I can easily run a batch size of 128 with YOLOv8, and I am using the smallest possible model in both cases. Finally, I find the output a bit confusing: there is a dimension for the number of detections, which seems to be dynamic, but it also seems capped at 99, and when I feed in a batch of black images it becomes equal to 5 * batch_size. Could you give some insight into how the number of detections is produced?
Hi, for batch inference you can refer to the following code: mmdeploy/mmdeploy/mmcv/ops/nms.py, lines 219 to 231 (commit a7e9be2).
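As context for the question above, here is a toy numpy sketch of how a batched post-NMS step can report a per-image detection count that is dynamic yet capped. This is not mmdeploy's actual nms.py code; the names `score_thr` and `max_per_img` are hypothetical stand-ins for whatever thresholds the deploy config sets:

```python
import numpy as np

def num_dets_per_image(scores, score_thr=0.05, max_per_img=100):
    """Count, per image, how many boxes survive the score threshold,
    capped at max_per_img. In a real deployment the output tensors are
    then padded so every image in the batch shares one max length,
    which makes num_dets look dynamic but bounded."""
    counts = []
    for img_scores in scores:               # scores: (batch, num_boxes)
        kept = int((img_scores > score_thr).sum())
        counts.append(min(kept, max_per_img))
    return counts

scores = np.array([[0.9, 0.6, 0.01],        # image 0: two confident boxes
                   [0.2, 0.01, 0.01]])      # image 1: one
print(num_dets_per_image(scores))           # → [2, 1]
```

Under this reading, the observed cap around 99 would come from a keep-top-k style limit in the config, and a constant small count on black images from a few low-score boxes still clearing the threshold, though that is an inference from the thread rather than a confirmed detail.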
Okay, thanks, now I understand how the number of detections is produced. However, I still don't really understand why the model takes up so much memory. Have you experienced this issue as well?
Hi, we did not test the runtime memory usage at large batch sizes.
This issue is marked as stale because it has been marked as invalid or awaiting response for 7 days without any further response. It will be closed in 5 days if the stale label is not removed or if there is no further response. |
This issue is closed because it has been stale for 5 days. Please open a new issue if you have similar issues or you have any new updates now. |
Checklist
Describe the bug
I am trying to fix the bug in RTMDet that causes batch inference to fail for instance segmentation on a deployed ONNX model. I already opened an issue about this bug: #2075.
To fix this issue, I am trying to change the code in the rtmdet_ins_head.py file, but none of the code I put there gets executed when I run deploy.py. I expect this is because the registries are not working, so my question is how I can fix these registry warnings.
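To illustrate why the maintainers say the warning is harmless, here is a toy sketch of a scoped registry lookup (this is not mmengine's real implementation; the class and the registered names are invented for illustration). The point is that a lookup asking for scope "mmdet" can warn that no registry tree exists under that scope and still resolve the module from the default scope:

```python
class Registry:
    """Toy scoped registry: lookups for an unknown scope warn, then
    fall back to the registry's own (default) scope."""

    def __init__(self, name, scope):
        self.name, self.scope = name, scope
        self._modules = {}
        self._children = {}   # scope name -> child Registry

    def register(self, key, obj):
        self._modules[key] = obj

    def get(self, key, scope=None):
        if scope is not None and scope != self.scope:
            child = self._children.get(scope)
            if child is not None:
                return child.get(key)
            # No registry tree under that scope: warn, then fall back.
            print(f'WARNING: Failed to search registry with scope "{scope}" '
                  f'in the "{self.name}" registry tree')
        return self._modules.get(key)

codebases = Registry("Codebases", scope="mmdeploy")
codebases.register("RTMDetInsHead", "head-impl")
# Warns, but still resolves via the default "mmdeploy" scope:
print(codebases.get("RTMDetInsHead", scope="mmdet"))  # → head-impl
```

If modules resolve via the fallback anyway, the warning alone would not explain edited code failing to run, which matches the maintainer's advice to ignore it.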
Reproduction
Environment
Error traceback