
ONNX to TensorRT conversion for input layer: cast uint8 to fp32 #4131

Open
maxlacourchristensen opened this issue Sep 17, 2024 · 1 comment
Labels
triaged Issue has been triaged by maintainers

Comments

@maxlacourchristensen

My PyTorch and ONNX model has a uint8-to-fp32 cast layer that divides by 255. This cast layer is applied to the input tensor. When I convert the ONNX model to TensorRT INT8, I get the following warning:

"Missing scale and zero-point for tensor input, expect fall back to non-int8 implementation for any layer consuming or producing given tensor"

For INT8, should I remove the cast layer before exporting the ONNX model, or does TensorRT handle it itself? What is the recommended approach for the best INT8 performance?
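For context, a minimal sketch of the normalization the cast layer performs, written with NumPy (the actual model does this inside the graph as a Cast node followed by a Div; the function name here is hypothetical):

```python
import numpy as np

def normalize_uint8(img_u8: np.ndarray) -> np.ndarray:
    """Sketch of the model's input cast: uint8 -> float32, scaled to [0, 1]."""
    return img_u8.astype(np.float32) / 255.0

# Example: a 1x3 "image" row with min, mid, and max pixel values.
img = np.array([[0, 128, 255]], dtype=np.uint8)
out = normalize_uint8(img)
print(out.dtype)   # float32
print(out[0, 2])   # 1.0
```

It is this fp32 output (not the raw uint8 input) that the INT8 builder needs a scale and zero-point for, which is what the warning refers to.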

Platforms are Jetson AGX Orin, Xavier NX, and Orin NX.

@lix19937

Similar case #3959

@moraxu moraxu added Quantization: PTQ triaged Issue has been triaged by maintainers labels Sep 20, 2024