Dynamic batch results are strange. #3744
Comments
Exactly. Did you set the input shapes before feeding a different input shape for inference?
polygraphy run model.onnx --trt --onnxrt --input-shapes=input_ids:3x128 attention_mask:3x128
An error occurs...
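(For reference: recent Polygraphy versions document a bracketed shape syntax rather than the x-separated form, which may explain the parse error. Assuming such a version, the command would be written as

polygraphy run model.onnx --trt --onnxrt --input-shapes input_ids:[3,128] attention_mask:[3,128]
)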
The shape is not set separately before inference.
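For context, with the TensorRT Python API the concrete batch size has to be set on the execution context before every inference whose shape differs from the previous call. A minimal sketch, assuming an engine file model.plan and the input names from this issue:

import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
with open("model.plan", "rb") as f:
    engine = trt.Runtime(logger).deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

# Set the actual batch size before enqueueing; otherwise a stale shape
# from a previous call may silently be reused.
batch = 3
context.set_input_shape("input_ids", (batch, 128))
context.set_input_shape("attention_mask", (batch, 128))
assert context.all_binding_shapes_specified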
There is no problem with the polygraphy command in FP32, but a problem occurs in FP16:
polygraphy run model.onnx --trt --onnxrt --fp16
The diffs look good to me for FP16. Did you observe significant accuracy loss on the real dataset?
The failure here was because the default Polygraphy error tolerance is 1e-5.
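FP16 typically cannot meet a 1e-5 tolerance, so the comparison can be relaxed with Polygraphy's tolerance flags, for example:

polygraphy run model.onnx --trt --onnxrt --fp16 --atol 1e-3 --rtol 1e-3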
@KyungHyunLim, is this issue solved?
@littleMatch03 I'm still trying to figure it out.
There was no problem during training. When I gave ["I love dog"] as input, the ONNX model shows similar results. For example, there is no problem when the batch size is 1, but when the batch size exceeds 1, there is a problem.
This is the result of using polygraphy inspect model:
Normally this is caused by not setting the correct dynamic-shape profile or input shapes.
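For reference, a minimal sketch of building an engine with an explicit optimization profile covering batch sizes 1 to 100 via the TensorRT Python API (the input names and the 16x128 opt shape are assumptions here, not taken from the reporter's actual build):

import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("ONNX parse failed")

config = builder.create_builder_config()
profile = builder.create_optimization_profile()
# One profile covering the full dynamic batch range from the issue (1..100).
for name in ("input_ids", "attention_mask"):
    profile.set_shape(name, min=(1, 128), opt=(16, 128), max=(100, 128))
config.add_optimization_profile(profile)

serialized = builder.build_serialized_network(network, config)
with open("model.plan", "wb") as f:
    f.write(serialized)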
Closing since there has been no activity for more than 3 weeks; please reopen if you still have questions. Thanks, all!
Description
I created an ONNX model with the Python script below (make_onnx.py, attached under Relevant Files).
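The script itself is not reproduced in this capture. Purely as an illustration, an export with a dynamic batch axis typically looks like the following; the model name and shapes are stand-ins, not the reporter's actual script:

import torch
from transformers import AutoModel

# Hypothetical stand-in for make_onnx.py: a transformer encoder exported
# with a dynamic batch dimension on every input and output.
model = AutoModel.from_pretrained("bert-base-uncased").eval()
input_ids = torch.ones(1, 128, dtype=torch.long)
attention_mask = torch.ones(1, 128, dtype=torch.long)

torch.onnx.export(
    model,
    (input_ids, attention_mask),
    "model.onnx",
    input_names=["input_ids", "attention_mask"],
    output_names=["output"],
    dynamic_axes={
        "input_ids": {0: "batch"},
        "attention_mask": {0: "batch"},
        "output": {0: "batch"},
    },
    opset_version=15,  # matches the opset reported in the trtexec log below
)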
Then I created the TensorRT model with the following command.
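The exact command was not captured here; a representative trtexec invocation for this dynamic range would be (shapes chosen to match the 1x128 to 100x128 requirement stated below; the opt shape is an assumption):

trtexec --onnx=model.onnx --saveEngine=model.plan --fp16 \
  --minShapes=input_ids:1x128,attention_mask:1x128 \
  --optShapes=input_ids:16x128,attention_mask:16x128 \
  --maxShapes=input_ids:100x128,attention_mask:100x128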
I deployed both models to the Triton Inference Server.
TensorRT example
In the case of the TensorRT model, the results are different when inferring the same sentence once versus inferring three copies of it in one batch.
==> output
ONNX example
The ONNX model gives the same results in both cases.
==> output
I need to use dynamic batch sizes from 1x128 up to 100x128.
Is there a problem with the TensorRT model conversion process?
Environment
nvcr.io/nvidia/tritonserver:23.04-py3
nvcr.io/nvidia/tensorrt:23.04-py3
TensorRT Version: 8.6.1
NVIDIA GPU: 4090
NVIDIA Driver Version: 530.41.03
CUDA Version: 12.1
CUDNN Version: 8.9.0
[03/22/2024-01:11:21] [I] [TRT] Input filename: model.onnx
[03/22/2024-01:11:21] [I] [TRT] ONNX IR version: 0.0.8
[03/22/2024-01:11:21] [I] [TRT] Opset version: 15
[03/22/2024-01:11:21] [I] [TRT] Producer name: pytorch
[03/22/2024-01:11:21] [I] [TRT] Producer version: 1.13.1
Relevant Files
make_onnx.py