
Int8 calculation problem #76

Open
CoinCheung opened this issue Sep 21, 2024 · 1 comment

Comments

@CoinCheung

Hi,

I posted the error message in the TensorRT repo and they referred me to this repo, so I am opening an issue here. The problem is that when I quantize the model in PyTorch with modelopt and export it to ONNX, TensorRT fails to compile the ONNX file into a TensorRT engine.

Here is a link to an example code snippet that reproduces the error message:

NVIDIA/TensorRT#4095 (comment)

Could you please help me get this working?
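
For context, the flow in question is roughly the following (a minimal sketch only: the model, shapes, and calibration loop are placeholders, and the actual reproduction code is in the linked issue):

```python
import torch
import torchvision
import modelopt.torch.quantization as mtq

# Placeholder model; the real report uses the model from the linked issue.
model = torchvision.models.resnet18(weights=None).cuda().eval()

def forward_loop(m):
    # Feed a few representative batches so modelopt can collect
    # activation ranges for INT8 calibration (random data here only
    # to keep the sketch self-contained).
    for _ in range(8):
        m(torch.randn(4, 3, 224, 224, device="cuda"))

# Insert Q/DQ quantizers and calibrate with modelopt's default INT8 config.
model = mtq.quantize(model, mtq.INT8_DEFAULT_CFG, forward_loop)

# Export the quantized graph to ONNX; this is the file that TensorRT
# then fails to build into an engine. (Depending on the modelopt
# version, this export may need to run under modelopt's ONNX-export
# context; see the modelopt docs.)
dummy = torch.randn(1, 3, 224, 224, device="cuda")
torch.onnx.export(model, dummy, "model_int8.onnx", opset_version=17)
```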

@cjluo-omniml
Collaborator

Have you tried the ONNX PTQ workflow, which exports to ONNX first and then does the quantization? See https://github.com/NVIDIA/TensorRT-Model-Optimizer/tree/main/onnx_ptq
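
In that workflow you quantize the exported FP32 ONNX model directly. A minimal sketch, assuming the `modelopt.onnx.quantization.quantize` entry point and argument names from that repo (the paths and calibration data here are placeholders; check the linked examples for the exact arguments):

```python
import numpy as np
from modelopt.onnx.quantization import quantize

# Placeholder calibration inputs: a batch of representative samples
# with the model's input shape.
calib = np.random.rand(32, 3, 224, 224).astype(np.float32)

# Quantize the plain FP32 ONNX export (no torch-side quantization)
# and write out a Q/DQ INT8 ONNX model.
quantize(
    onnx_path="model.onnx",
    quantize_mode="int8",
    calibration_data=calib,
    output_path="model_int8.onnx",
)
```

The resulting Q/DQ model can then be built into an engine with, e.g., `trtexec --onnx=model_int8.onnx --int8 --saveEngine=model_int8.engine`.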
