Convert to ONNX then to TensorRT format #2
Comments
Hi Shakhizat, unfortunately we haven't attempted it yet, but we will try and let you know in the coming days. Thank you for your patience.
Hi, sorry for the delay, we encountered some setbacks during the installation of the onnx-tensorrt backend. Do you have a more specific question on the matter? For example, do you need the .onnx file? Are you interested in the partial model or the end-to-end model? Best regards,
Hello @jchenghu, thank you for your response. Could you please provide me with a step-by-step procedure on how you converted the model to ONNX? I am interested in running your model on the Nvidia Jetson NX using TensorRT optimization.
Hi, thank you for the interest. I'm currently preparing the edited version of the model's code as well as the export script. Exporting the graph has been trickier than expected (e.g. I encountered strange errors such as SIGFPE whenever I included the masking operation). I will provide you with the files as soon as I can.
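For reference while the official script was being prepared, a generic export of this kind can be sketched as below; the single image input of shape (1, 3, 384, 384), the opset version, and the tensor names are assumptions, not the repository's actual conversion code.

```python
import torch

def export_to_onnx(model: torch.nn.Module, out_path: str = "expansionnet.onnx") -> None:
    # Generic end-to-end export sketch: input shape, opset and tensor names
    # are placeholders, not the official ExpansionNet_v2 conversion script.
    model.eval()
    dummy_image = torch.randn(1, 3, 384, 384)
    torch.onnx.export(
        model,
        (dummy_image,),
        out_path,
        opset_version=14,
        input_names=["image"],
        output_names=["caption_ids"],
        dynamic_axes={"image": {0: "batch"}},
    )
```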
I pushed the onnx conversion file and the onnx file here: https://drive.google.com/drive/folders/1bBMH4-Fw1LcQZmSzkMCqpEl0piIP88Y3?usp=share_link. However, although the model can be converted successfully and passes the onnx checker, the onnx graph strangely fails both the onnx_tensorrt backend and onnxruntime tests, raising errors such as:
From the graph visualization (https://netron.app/), it should be caused by this operation in the SwinTransformer block,
but I did not figure out why...
My system info:
I hope the file helps. Besides that, I edited the code in a few parts:
I'm still working on it. For the time being, the onnx conversion succeeds but the onnx_tensorrt backend fails. I wonder if it's caused by a version mismatch in my packages. I'd like to know if the same occurs on your end.
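For anyone following along, a minimal way to reproduce that check/test cycle is sketched below; the file name, the single image input, and its shape are assumptions to be adjusted to the generated graph.

```python
import numpy as np
import onnx
import onnx_tensorrt.backend as backend

# The onnx checker passes on the exported graph...
model = onnx.load("expansionnet.onnx")
onnx.checker.check_model(model)

# ...while the onnx_tensorrt backend test is the step reported as failing.
engine = backend.prepare(model, device="CUDA:0")
dummy_image = np.random.rand(1, 3, 384, 384).astype(np.float32)
outputs = engine.run(dummy_image)
print([o.shape for o in outputs])
```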
Hi @jchenghu, thanks for your detailed response. Just FYI, I experienced the issue below:
output
I believe your particular error is caused by the absence of pre-processing. The input should first be resized to (384, 384), otherwise the model can't generalize to all images (a pre-processing snippet is shown in …).
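A minimal pre-processing sketch along those lines is shown below; the resize to (384, 384) is the essential part, while the normalization constants here are the standard ImageNet values and are an assumption, so the repository's own demo pipeline should be checked for the exact ones.

```python
from PIL import Image
import torchvision.transforms as T

# Resize to the model's expected resolution before feeding the network.
# Mean/std below are the usual ImageNet values (an assumption; verify
# against the repository's demo pre-processing).
preprocess = T.Compose([
    T.Resize((384, 384)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

image = preprocess(Image.open("example.jpg").convert("RGB")).unsqueeze(0)
print(image.shape)  # torch.Size([1, 3, 384, 384])
```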
Anyway, I get the same error. If you don't mind, could you please provide code for inference with ONNX Runtime?
Sure, I updated the conversion file with the following part:
In your case, you just need to replace … I also report the error I'm currently investigating:
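Since the updated snippet itself wasn't captured in this thread, here is a minimal ONNX Runtime inference sketch of the same kind; the file name is a placeholder and a single image input is assumed (the exported graph may expect additional inputs).

```python
import numpy as np
import onnxruntime as ort

# Load the exported graph and run a single forward pass on CPU.
sess = ort.InferenceSession("expansionnet.onnx",
                            providers=["CPUExecutionProvider"])

input_name = sess.get_inputs()[0].name
image = np.random.rand(1, 3, 384, 384).astype(np.float32)  # replace with a pre-processed image
outputs = sess.run(None, {input_name: image})
print([o.shape for o in outputs])
```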
Hi @jchenghu, thanks. I apologize if I am being pushy. I am hopeful that it will be resolved soon.
Hi, don't worry, you're not being pushy at all :-) Actually, I'm sorry, but I've been busy over the past few days because of deadlines. I restarted working on it yesterday. Currently I'm trying to find alternative equivalent implementations in an attempt to bypass that error. I'll keep you updated.
IMHO, integrating ONNX support and the TensorRT framework into ExpansionNet will undoubtedly enhance its functionality. I believe this project deserves attention from machine learning enthusiasts worldwide. It's a fantastic project!
Thank you very much for the support! Actually, we have an NVIDIA Jetson in our lab as well, so we share the same goal :-)
Hi @jchenghu, just FYI, we've already trained image captioning for the Kazakh language using ExpansionNetV2 and deployed it onto an Nvidia Jetson Xavier NX. The regular PyTorch model can sometimes overload the board, and the Jetson throttles itself afterwards due to overheating. Our GitHub project: https://github.com/IS2AI/kaz-image-captioning
That's simply amazing. I'm truly happy and extremely honored this project helped in such a cause.

Update:
Good news: …
Bad news: …

I'll try to fix this as well as soon as possible. I'll keep you updated.
Jia
Hi @jchenghu, thanks for your detailed response. I am still experiencing the issue below:
the code is
Hi, that's odd... can you confirm that …
Edit: This is a trivial matter, but since your code seems a little bit different compared to …
Good news: after a lot of digging and fixing, I solved the problems and warnings mentioned in the previous post and successfully exported the model to TensorRT on my machine (machine details are reported above). I suggest you check the latest commit. I renamed the folder …
(*) The testing phase on onnx_tensorrt may take a while; you can stop the execution as soon as the onnx file is generated and feed the ONNX file directly to …
@jchenghu, good to hear it. I solved the format conversion from .pth to .onnx; the issue was due to the torch and torchvision versions. I assume torchvision needs to be compiled from source on the Jetson rather than installed using pip. Currently, I cannot build the https://github.com/onnx/onnx-tensorrt package from source. Did you install it from source as well?
Thank you for the feedback. Yes, the onnx-tensorrt package was installed from source. However, if you encounter issues with this package, since it was included in …
@jchenghu, just FYI, I failed to convert to the TensorRT format using the command below: … Could you please share your opinion about it? Thanks
@jchenghu, it might be because I missed the onnx simplify operation. But I am not sure if it is needed.
My bad. As in the case of the demo, I implemented the ONNX conversion on CPU rather than GPU because I was afraid the memory cost of the end-to-end forward pass could be too much for some GPUs. But it's probably safe to assume that anyone interested in running on TensorRT has enough memory for the conversion... :-)
Could you please share your TensorRT version? Mine is 8.0, and from what I understand TopK should be supported (https://github.com/onnx/onnx-tensorrt/blob/8.0-GA/docs/operators.md)...
@jchenghu, no problem, I am also trying to help you, but I am not as experienced as you. I was just able to convert to TensorRT using trtexec. The problem was that the onnx model was not simplified (https://github.com/daquexian/onnx-simplifier). Could you please confirm it? You need to add onnx simplify into your code, if I am not mistaken. My TensorRT version is 8.5.2.
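For reference, the simplification step can be done either with the onnxsim CLI or from Python. The sketch below uses the Python API; the file names, as well as the trtexec invocation in the trailing comment, are placeholders.

```python
import onnx
from onnxsim import simplify

# Simplify the exported graph before handing it to TensorRT.
model = onnx.load("expansionnet.onnx")
model_sim, ok = simplify(model)
assert ok, "onnx-simplifier could not validate the simplified graph"
onnx.save(model_sim, "expansionnet_sim.onnx")

# Afterwards, build the engine with something along the lines of:
#   trtexec --onnx=expansionnet_sim.onnx --saveEngine=expansionnet.engine
```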
I'm learning a lot thanks to your feedback. If I'm not mistaken, since the ONNX graph is an intermediate representation, I expect it is not affected by whether the model was converted on the CPU or the GPU.
Good to hear!
That's odd, I did not use the simplifier. It may be an issue related to your specific version. However, the solution seems to be a good general practice; I'll try it on my version, and if everything works fine I'll integrate it into the next commit.
@jchenghu, IMHO, an additional code snippet for TensorRT engine file inference using "import tensorrt as trt" should be included. This is because ONNX models are unable to fully leverage the potential of Nvidia GPUs. Also, just recently, I was not able to convert our Kazakh-language .pth model to the onnx format, while I can convert your English model.
Hi @shahizat
Thank you for the suggestion. Building the TensorRT engine and performing an inference comparison, which is currently left to the user, would indeed complete the conversion. Unfortunately, at the present moment I can't perform tests because of maintenance and deadlines. Would it be fine for you if I worked on it in about 5 days?
This is due to the "hard-coded" max sequence length (to avoid the dataset requirement in the demo and onnx conversion); 74 was the value for English COCO. Replacing the argument with 63 should fix it :-)
Hi @jchenghu, thanks for your support. It fixed the error with our Kazakh model. I prepared a code snippet for inference using the TensorRT engine file, but unfortunately it raised warnings and errors. Please have a look when you have some free time.
Here is the code
@jchenghu, I have resolved the issue mentioned earlier. If you would like, you can use the code I've created. I am pleased to share that it is functional and ready for use. I used the Nvidia PyTorch NGC 23.01 container with NVIDIA TensorRT 8.5.2.2 for that purpose. Here is the working code:
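The working snippet itself did not make it into this thread. Below is a minimal sketch of what TensorRT 8.x Python inference on a serialized engine generally looks like (using pycuda); the engine path, a single image input of shape (1, 3, 384, 384), and static binding shapes are all assumptions.

```python
import numpy as np
import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
import pycuda.driver as cuda
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Deserialize the engine produced by trtexec (path is a placeholder).
with open("expansionnet.engine", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

# Allocate page-locked host buffers and device buffers for every binding.
stream = cuda.Stream()
host_bufs, dev_bufs, bindings = [], [], []
for i in range(engine.num_bindings):
    shape = context.get_binding_shape(i)
    dtype = trt.nptype(engine.get_binding_dtype(i))
    host = cuda.pagelocked_empty(trt.volume(shape), dtype)
    dev = cuda.mem_alloc(host.nbytes)
    host_bufs.append(host)
    dev_bufs.append(dev)
    bindings.append(int(dev))

# Copy the pre-processed image into the input binding and run inference.
image = np.random.rand(1, 3, 384, 384).astype(np.float32)  # placeholder input
np.copyto(host_bufs[0], image.ravel())
cuda.memcpy_htod_async(dev_bufs[0], host_bufs[0], stream)
context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
for i in range(engine.num_bindings):
    if not engine.binding_is_input(i):
        cuda.memcpy_dtoh_async(host_bufs[i], dev_bufs[i], stream)
stream.synchronize()
print([host_bufs[i].shape for i in range(engine.num_bindings) if not engine.binding_is_input(i)])
```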
@jchenghu, sorry again, food for thought: my FP32 TensorRT model works correctly, but the FP16 one does not.
Hi @shahizat, sorry for the wait, I'm almost back on track.
That's great, thank you for the code snippet. I'll gladly add it to the project (and make sure to give you the credits).
I'll try it soon and see if I can reproduce the latest issue.
@jchenghu Hi, you are welcome. Do you have any plans to make it multi-modal? Or maybe add VQA functionality after performing image-to-text?
Hi, I managed to reproduce the issue in FP16. The output probabilities are: …
Edit of 15/03/2023: my hypothesis was wrong; even after removing almost all encoding and decoding operations, the results are still incorrect (the same as above). I am investigating the issue further. I'll keep you updated.
Yes, we do. Unfortunately, we cannot share too many details on the matter. I can only guarantee it will be open, and it could be part of a bigger project, but it is currently hard to predict when it will be released. My goal would be this year, but it is hard to say at the moment.
Hi @jchenghu, IMHO FP16 is a nice-to-have but not a must-have. Anyway, you did a great job. Thanks a lot!
Thank you! Update: sorry for the wait, I'm very close to solving the issue, expect the commit soon :-)
Hi @jchenghu, please add me to your LinkedIn network (https://www.linkedin.com/in/shakhizat-nurgaliyev/). Also, I am waiting for https://viper.cs.columbia.edu/ to be released.
Hi, just when I thought I was close, an even more obscure bug popped up...
Therefore, sorry to say this, but I'm still investigating... (it was good to hear it was not an urgent matter for you), but I will eventually solve it :-)
Actually, I'm not active on LinkedIn, unfortunately. If you want to keep in touch for any reason, such as getting updates on my work (whenever I'm able to release it publicly), you can email me at [email protected] (it's just my proxy mail open to the internet; I'll send you the "real" one in case of a poke). Needless to say, in case of other problems or matters, feel free to open a new issue.
That's a beautiful idea and a nice paper. Is this the reason you were interested in VQA? I cannot share many details, but I'm currently working on supervised VQA systems. As always, I will keep you updated.
Update 23/03/23: …
Update 27/03/23: …
Update 04/04/23: …
Update 9/04/23: …
Update 11/04/23: …
Update 14/04/23: …
Hi @jchenghu, I'm pleased to see that you are in the final stage of resolving the issue. I was also able to convert using the command below and mix the fp16 and int8 precisions: … So the model size can be reduced dramatically as well.
Hi @jchenghu, I was aware that quantization from floating point to integer can cause a large degradation in accuracy, but not going from FP32 to FP16. For many neural networks, FP16 should achieve the same accuracy as FP32. Maybe I am mistaken here.
I agree it's odd; going from FP32 to FP16 shouldn't change that much. More testing needs to be done.

Update 21/04/23:
Until now, I've attempted the following patterns: …
Introduced two new patterns: … unfortunately, with no luck.
Additionally, I'm busy with a deadline these weeks, and because of that I'm slower at finding new solutions. I hope this is still not an urgent matter for you.
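Since the FP32/FP16 disagreement keeps coming up, a quick way to quantify how far the FP16 engine drifts from the FP32 one is to compare their raw outputs on the same pre-processed image; the sketch below assumes the two output arrays have already been collected by whatever inference code is in use (the names are placeholders).

```python
import numpy as np

def report_fp16_drift(out_fp32: np.ndarray, out_fp16: np.ndarray) -> None:
    # Raw outputs of the FP32 and FP16 engines for the same input.
    a = out_fp32.astype(np.float32)
    b = out_fp16.astype(np.float32)
    diff = np.abs(a - b)
    print("max abs diff :", diff.max())
    print("mean abs diff:", diff.mean())
    print("allclose(atol=1e-2):", np.allclose(a, b, atol=1e-2))
```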
Hi @shahizat, it's been a long time. I just wanted to let you know that I'm back to investigating the issue these days; I'm trying to tweak the numerically sensitive parts to find the reason behind the different results between fp32 and fp16. However, it seems to be a not-so-rare problem with TensorRT,
I hope the "real" solution is not too far beyond my reach at the moment. How are you doing? Any issue with FP32? Best regards,
After several attempts at tweaking the numerical stability, re-installations, and additional libraries, the FP16 results were still rather odd and always conflicted with the FP32 case. I still don't know if it's an unlucky weight configuration of my model, some library version-related issue, or whether my particular onnx graph is just not well received by TensorRT. I wonder if training the model from scratch in FP16 instead of FP32 would enable the FP16 conversion in TensorRT... Best regards,
Hi @jchenghu, thanks for your detailed response. So far, I haven't observed any issue with the FP32 model. Anyway, you did a great job.
Hello ExpansionNet_v2 contributors,
I was wondering if anyone has attempted to convert an ExpansionNet_v2-based ".pth" model to ".onnx" and then to the TensorRT format?
Thank you and best regards,
Shakhizat