When generating the engine (FP16) on a Tesla T4, an error occurred, and the engine (FP32) generated on the Tesla T4 couldn't detect anything #353
Comments
@marcoslucianops Hope you can help us address this problem. Thank you a lot!!!
@marcoslucianops When I use ONNX to generate the engine on a PC with DeepStream 6.0, TensorRT 8.4.2, CUDA 11.4, and cuDNN 8.2 (this PC's GPU is a T4; I suspect the GPU may have something to do with the engine generation failing), something still goes wrong:

ERROR: [TRT]: ModelImporter.cpp:776: --- End node ---

I don't know how to solve this problem. I did the same procedure on my own computer with an RTX 3060 GPU and did not hit it (I could successfully generate the engine). Hope you can help us solve this problem. Thank you a lot!!!
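In case it helps to isolate the ModelImporter error from DeepStream: trtexec ships with TensorRT and can try to parse the same ONNX on its own (model.onnx below is a placeholder for the exported model).

```sh
# Parse the ONNX with TensorRT alone, outside DeepStream, to see whether
# ModelImporter fails the same way on this machine. Paths are placeholders.
/usr/src/tensorrt/bin/trtexec --onnx=model.onnx --fp16 --saveEngine=model.engine
```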
@PX-Xu +1
About the
I have the same issue as well. I want to ask why I can't generate an engine on the Tesla T4 GPU using the method of converting weights to an engine. The error message I receive is the same as the one mentioned in this issue. This is the configuration file I am using; it contains [property] and [class-attrs-all] sections.
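It follows the usual DeepStream-Yolo layout; a minimal sketch with placeholder paths and thresholds (not my exact values, keys taken from the repo's sample configs):

```sh
# Minimal sketch of a config_infer_primary_yoloV7.txt for a Darknet-trained
# model; network-mode=2 selects FP16. Values here are placeholders.
cat > config_infer_primary_yoloV7.txt << 'EOF'
[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
model-color-format=0
custom-network-config=yolov7.cfg
model-file=yolov7.weights
model-engine-file=model_b1_gpu0_fp16.engine
labelfile-path=labels.txt
batch-size=1
network-mode=2
num-detected-classes=80
gie-unique-id=1
network-type=0
cluster-mode=2
maintain-aspect-ratio=1
parse-bbox-func-name=NvDsInferParseYolo
custom-lib-path=nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
engine-create-func-name=NvDsInferYoloCudaEngineGet

[class-attrs-all]
nms-iou-threshold=0.45
pre-cluster-threshold=0.25
EOF
```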
@marcoslucianops Please help me. I also encountered the issue where I can convert weights to an engine on NVIDIA RTX-series GPUs, but it doesn't work on T4 GPUs. My environment is similar to this one.
@lzylzylzy123456, the
etworkDefinition&, nvinfer1::IBuilderConfig&) ()
@marcoslucianops Thanks for your answer. The Concat error is solved, but I still couldn't generate the engine. The error is the following: it is like the fault I met the first time, when I used the Darknet model. Actually, we have always used the Darknet model to generate the engine, but this time we hit this fault. Could you answer the first question I asked? Thank you a lot.
@lzylzylzy123456, to use Darknet model (
@PX-Xu, can you debug the segmentation fault using gdb?
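Something along these lines should capture the crash point (deepstream-app and the config path are placeholders; substitute your own binary if you launch the pipeline differently):

```sh
# Run the app under gdb; when the SIGSEGV hits, gdb stops and the second
# command prints a full backtrace with locals for every frame.
gdb -ex run -ex "bt full" --args \
  deepstream-app -c deepstream_app_config.txt
```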
@marcoslucianops I debugged it in gdb. This is the result: it seems like there is something wrong with libnvdsinfer_custom_impl_Yolo.so, but I had already built it with make on my PC.
Can you send your model to my email so I can check this error?
I have already sent the model and configuration files to your email.
I can generate the engine using this configuration on RTX-series graphics cards, as well as on the Jetson development board. However, I am unable to generate the engine on the T4 graphics card of the cloud server.
I have discovered something: we have developed a plugin for DeepStream, and we encountered a memory leak issue. We used Valgrind for detection and initially suspected that it might be due to OpenCV or our custom code. However, after commenting out all the custom parts, we found that the plugin was still leaking memory. Therefore, we decided to investigate the unmodified plugin and confirmed that it indeed has a memory leak: the memory usage only increases and never decreases. I found a forum post (https://forums.developer.nvidia.com/t/memory-leak-in-deepstream/204163/9) where you also encountered this issue a year ago. Now my environment is similar to yours, and I'm using DeepStream version 6.0. The memory leak issue occurs on every machine.
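For reference, we ran the detection roughly like this (the binary, config, and suppression-file path are placeholders; the GLib suppression file location varies by distro):

```sh
# Track definite and indirect leaks; the GLib env vars make its allocator
# Valgrind-friendly, and the suppression file filters known false positives.
G_SLICE=always-malloc G_DEBUG=gc-friendly \
valgrind --leak-check=full --show-leak-kinds=definite,indirect \
  --suppressions=/usr/share/glib-2.0/valgrind/glib.supp \
  --log-file=valgrind.log \
  deepstream-app -c deepstream_app_config.txt
```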
If you have any solutions, please let us know. Thank you very much! |
Do you have any solutions? I tried reinstalling the environment and recompiling the dynamic library files, but I still can't generate an FP16 engine. I can generate an FP32 engine, but it doesn't detect anything. Also, do you have a solution for the memory leaks in DeepStream? If so, please let me know. Thank you very much!
I just tested on a T4 (AWS:
Which Ubuntu version are you using?
Thank you very much!!! I use Ubuntu 18.04. |
@marcoslucianops Thank you for running the engine-generation test on a T4 yourself. If you successfully generated the engines (FP16 and FP32), could you send me a copy by email? Thank you very much!!! We've been stuck on this problem for a week and would really appreciate your help in solving it!!!
The engine doesn't work on another computer (unless you use exactly the same environment, and even then it's recommended to generate it on each computer). I will try on DeepStream 6.0 and update you.
Thank you! Waiting for your good news! |
Probably the problem is in your environment. I can run it on DeepStream 6.0. I checked your first comment: you are using DeepStream 6.0, TensorRT 8.4.2, CUDA 11.4, and cuDNN 8.2.4, which do not match the requirements for DeepStream 6.0.1 / 6.0. Please follow the instructions in DeepStream 6.0.1 / 6.0 (https://github.com/marcoslucianops/DeepStream-Yolo#dgpu-installation) to install the correct versions and try again.
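A quick way to confirm what is actually installed, to compare against those requirements (package names may differ slightly depending on how TensorRT/cuDNN were installed):

```sh
# Print the installed CUDA, TensorRT/cuDNN, DeepStream, and driver versions.
nvcc --version
dpkg -l | grep -Ei "tensorrt|cudnn"
deepstream-app --version-all
nvidia-smi --query-gpu=name,driver_version --format=csv
```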
|
I found a forum post (https://forums.developer.nvidia.com/t/memory-leak-in-deepstream/204163/9) where you also encountered this issue a year ago. Now my environment is similar to yours, and I'm using DeepStream version 6.0; the memory leak issue occurs on every machine. Have you solved the memory leak in DeepStream? I have discovered that it is caused by a specific open-source plugin in DeepStream. Thank you very much!
Have you solved the memory leak issue in DeepStream? If you have, please let us know the solution. Thank you very much. I'm using DeepStream version 6.0.
NVIDIA says to use
@marcoslucianops |
Hi! I ran into a problem when generating the engine for YOLOv7 on a Tesla T4 GPU. The environment I configured is DeepStream 6.0, TensorRT 8.4.2, CUDA 11.4, and a version of cuDNN compatible with CUDA 11.4. We actually use Darknet to train our YOLOv7 model, and then use the cfg file and weights file to generate the engine, the same way it was done for YOLOv3. We have successfully used this method to generate the engine on an RTX 3060, 4080, etc. But this time, when we tried to generate the engine on the Tesla T4, the FP16 engine could not be generated, and the FP32 engine could be generated but produced no detection results. On each new machine we compile the dynamic library libnvdsinfer_custom_impl_Yolo.so for this conversion and place it in the corresponding folder of our project, and we use the config_infer_primary_yoloV7.txt file to specify the paths, which drives the engine generation.
The error we met when generating the engine (FP16) is the following (the top of the backtrace is truncated):

```
etworkDefinition&, nvinfer1::IBuilderConfig&) ()
   from /home/dgp/ITS_code/its-deepstream/remote_gnr/model/libnvdsinfer_custom_impl_Yolo.so
#23 0x00007fff822e0c7b in Yolo::createEngine(nvinfer1::IBuilder*, nvinfer1::IBuilderConfig*) ()
   from /home/dgp/ITS_code/its-deepstream/remote_gnr/model/libnvdsinfer_custom_impl_Yolo.so
#24 0x00007fff822f4a56 in NvDsInferYoloCudaEngineGet ()
   from /home/dgp/ITS_code/its-deepstream/remote_gnr/model/libnvdsinfer_custom_impl_Yolo.so
#25 0x00007fffce435562 in nvdsinfer::TrtModelBuilder::getCudaEngineFromCustomLib(bool (*)(nvinfer1::IBuilder*, _NvDsInferContextInitParams*, nvinfer1::DataType, nvinfer1::ICudaEngine*&), bool (*)(nvinfer1::IBuilder*, nvinfer1::IBuilderConfig*, _NvDsInferContextInitParams const*, nvinfer1::DataType, nvinfer1::ICudaEngine*&), _NvDsInferContextInitParams const&, NvDsInferNetworkMode&) ()
   from /opt/nvidia/deepstream/deepstream-6.0/lib/libnvds_infer.so
#26 0x00007fffce4359b4 in nvdsinfer::TrtModelBuilder::buildModel(_NvDsInferContextInitParams const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) () from /opt/nvidia/deepstream/deepstream-6.0/lib/libnvds_infer.so
#27 0x00007fffce3f55e4 in nvdsinfer::NvDsInferContextImpl::buildModel(_NvDsInferContextInitParams&) ()
   from /opt/nvidia/deepstream/deepstream-6.0/lib/libnvds_infer.so
#28 0x00007fffce3f62a1 in nvdsinfer::NvDsInferContextImpl::generateBackendContext(_NvDsInferContextInitParams&) ()
   from /opt/nvidia/deepstream/deepstream-6.0/lib/libnvds_infer.so
#29 0x00007fffce3f053b in nvdsinfer::NvDsInferContextImpl::initialize(_NvDsInferContextInitParams&, void*, void (*)(INvDsInferContext*, unsigned int, NvDsInferLogLevel, char const*, void*)) () from /opt/nvidia/deepstream/deepstream-6.0/lib/libnvds_infer.so
#30 0x00007fffce3f6ce9 in createNvDsInferContext(INvDsInferContext**, _NvDsInferContextInitParams&, void*, void (*)(INvDsInferContext*, unsigned int, NvDsInferLogLevel, char const*, void*)) () from /opt/nvidia/deepstream/deepstream-6.0/lib/libnvds_infer.so
#31 0x00007fffd45677c1 in gst_nvinfer_start(_GstBaseTransform*) ()
   from /usr/lib/x86_64-linux-gnu/gstreamer-1.0/deepstream/libnvdsgst_infer.so
#32 0x00007fffe9bf6270 in ?? () from /home/dgp/ITS_code/its-deepstream/remote_gnr/lib/libgstbase-1.0.so.0
#33 0x00007fffe9bf6505 in ?? () from /home/dgp/ITS_code/its-deepstream/remote_gnr/lib/libgstbase-1.0.so.0
#34 0x00007ffff1e8c6ab in ?? () from /home/dgp/ITS_code/its-deepstream/remote_gnr/lib/libgstreamer-1.0.so.0
#35 0x00007ffff1e8d126 in gst_pad_set_active () from /home/dgp/ITS_code/its-deepstream/remote_gnr/lib/libgstreamer-1.0.so.0
#36 0x00007ffff1e6af0d in ?? () from /home/dgp/ITS_code/its-deepstream/remote_gnr/lib/libgstreamer-1.0.so.0
#37 0x00007ffff1e7d884 in gst_iterator_fold () from /home/dgp/ITS_code/its-deepstream/remote_gnr/lib/libgstreamer-1.0.so.0
#38 0x00007ffff1e6ba16 in ?? () from /home/dgp/ITS_code/its-deepstream/remote_gnr/lib/libgstreamer-1.0.so.0
#39 0x00007ffff1e6d95e in ?? () from /home/dgp/ITS_code/its-deepstream/remote_gnr/lib/libgstreamer-1.0.so.0
#40 0x00007ffff1e6dc8f in ?? () from /home/dgp/ITS_code/its-deepstream/remote_gnr/lib/libgstreamer-1.0.so.0
#41 0x00007ffff1e6fd5e in gst_element_change_state () from /home/dgp/ITS_code/its-deepstream/remote_gnr/lib/libgstreamer-1.0.so.0
#42 0x00007ffff1e70499 in ?? () from /home/dgp/ITS_code/its-deepstream/remote_gnr/lib/libgstreamer-1.0.so.0
#43 0x00007ffff1e4da02 in ?? () from /home/dgp/ITS_code/its-deepstream/remote_gnr/lib/libgstreamer-1.0.so.0
#44 0x00007ffff1e6fd5e in gst_element_change_state () from /home/dgp/ITS_code/its-deepstream/remote_gnr/lib/libgstreamer-1.0.so.0
#45 0x00007ffff1e70045 in gst_element_change_state () from /home/dgp/ITS_code/its-deepstream/remote_gnr/lib/libgstreamer-1.0.so.0
#46 0x00007ffff1e70499 in ?? () from /home/dgp/ITS_code/its-deepstream/remote_gnr/lib/libgstreamer-1.0.so.0
#47 0x0000555555612e48 in StreamControl::init(std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&, int, long, int, int) ()
#48 0x0000555555679f39 in SmartDeviceControl::init_deep_pipeline() ()
#49 0x000055555567d49e in SmartDeviceControl::init() ()
#50 0x000055555556e678 in main ()
```

We have discovered that the code for generating the engine is causing the program to crash. How can we resolve this issue?
Hope you can help us address this problem. Thank you a lot!!!