
terrible TensorRT accuracy when running inference on an object tracking algorithm #3609

Closed
ninono12345 opened this issue Jan 18, 2024 · 5 comments
Labels: triaged (Issue has been triaged by maintainers)


ninono12345 commented Jan 18, 2024

Description

I am trying to convert the inference part of the pytracking tomp101 tracker to TensorRT.

I've converted it to ONNX, and inference seems fine: the bounding box locks on correctly. However, the output tensors of the original model and the ONNX model differ by quite a lot when compared with the code below (the tensor values differ, yet the tracker still follows objects just fine :D ). The model's inputs and outputs:

{sample_x [dtype=float32, shape=(1, 1024, 18, 18)],
train_samples [dtype=float32, shape=(1, 1024, 18, 18)],
target_labels [dtype=float32, shape=(1, 1, 18, 18)],
train_ltrb [dtype=float32, shape=(1, 4, 18, 18)]}
[I] trt-runner-N0-01/18/24-04:45:38
---- Inference Output(s) ----
{bbreg_test_feat_enc [dtype=float32, shape=(1, 1, 256, 18, 18)],
bbreg_weights [dtype=float32, shape=(1, 256, 1, 1)],
target_scores [dtype=float32, shape=(1, 1, 18, 18)]}

import torch

# avg11/avg12/avg13 accumulate the mean absolute difference between the
# PyTorch and ONNX Runtime outputs over 30 runs (initialized to 0 beforehand).
r1 = original_model(inputs)      # PyTorch outputs (CUDA tensors)
r2 = session.run(None, inputs)   # ONNX Runtime outputs (numpy arrays); None fetches all outputs

avg11 = avg11 + torch.mean(torch.abs(r1[0] - torch.from_numpy(r2[0]).cuda()))
avg12 = avg12 + torch.mean(torch.abs(r1[1] - torch.from_numpy(r2[1]).cuda()))
avg13 = avg13 + torch.mean(torch.abs(r1[2] - torch.from_numpy(r2[2]).cuda()))

print(avg11 / 30)
print(avg12 / 30)
print(avg13 / 30)

BUT when the model is converted to TensorRT, the accuracy drops badly; inference results are terrible.

Does anybody have any suggestions on how to improve it? Should I modify the ONNX model with ONNX GraphSurgeon? Is there some Polygraphy tool I could use?

Maybe there is a trtexec conversion option that preserves accuracy?

THANK YOU

Environment

TensorRT Version: 8.6

NVIDIA GPU: GTX 1660 Ti

NVIDIA Driver Version: 546.01

CUDA Version: 12.1

CUDNN Version: 8.9.7

Operating System:

Python Version (if applicable): 3.10.13

PyTorch Version (if applicable): 2.1.2+cu121

Baremetal or Container (if so, version): no environment

Relevant Files

Model link: https://drive.google.com/file/d/1rKmrrktevdtL9Namevg3XdpMXWjTc3Gv/view?usp=sharing

@zerollzeng (Collaborator)

Could you please check this quickly with Polygraphy? The usage would be like: polygraphy run model.onnx --trt --onnxrt

You can feed real input too; see https://github.com/NVIDIA/TensorRT/tree/main/tools/Polygraphy/examples/cli/run/05_comparing_with_custom_input_data
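
For reference, a minimal data-loader sketch in the style of that example, using the input names and shapes from the log below; the random data is a stand-in for real frames and is only illustrative:

# data_loader.py -- supplies custom inputs to `polygraphy run`
import numpy as np

def load_data():
    # Yield one feed_dict per iteration; keys must match the ONNX input names.
    yield {
        "sample_x":      np.random.rand(1, 1024, 18, 18).astype(np.float32),
        "train_samples": np.random.rand(1, 1024, 18, 18).astype(np.float32),
        "target_labels": np.random.rand(1, 1, 18, 18).astype(np.float32),
        "train_ltrb":    np.random.rand(1, 4, 18, 18).astype(np.float32),
    }

It would then be passed as: polygraphy run model.onnx --trt --onnxrt --data-loader-script data_loader.py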

zerollzeng self-assigned this Jan 19, 2024
zerollzeng added the triaged label Jan 19, 2024
@ninono12345 (Author)

Thank you @zerollzeng, here is the result:

D:\pyth\pytracking-master>polygraphy run modified_latest3_sanitized2.onnx --trt --onnxrt
[I] RUNNING | Command: C:\Users\Tomas\AppData\Local\Programs\Python\Python310\Scripts\polygraphy run modified_latest3_sanitized2.onnx --trt --onnxrt
[I] trt-runner-N0-01/19/24-12:30:50 | Activating and starting inference
[W] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[W] onnx2trt_utils.cpp:400: One or more weights outside the range of INT32 was clamped
[W] Input tensor: sample_x (dtype=DataType.FLOAT, shape=(-1, 1024, 18, 18)) | No shapes provided; Will use shape: [1, 1024, 18, 18] for min/opt/max in profile.
[W] This will cause the tensor to have a static shape. If this is incorrect, please set the range of shapes for this input tensor.
[W] Input tensor: train_samples (dtype=DataType.FLOAT, shape=(-1, 1024, 18, 18)) | No shapes provided; Will use shape: [1, 1024, 18, 18] for min/opt/max in profile.
[W] Input tensor: target_labels (dtype=DataType.FLOAT, shape=(-1, 1, 18, 18)) | No shapes provided; Will use shape: [1, 1, 18, 18] for min/opt/max in profile.
[W] Input tensor: train_ltrb (dtype=DataType.FLOAT, shape=(-1, 4, 18, 18)) | No shapes provided; Will use shape: [1, 4, 18, 18] for min/opt/max in profile.
[I] Configuring with profiles:[
Profile 0:
{sample_x [min=[1, 1024, 18, 18], opt=[1, 1024, 18, 18], max=[1, 1024, 18, 18]],
train_samples [min=[1, 1024, 18, 18], opt=[1, 1024, 18, 18], max=[1, 1024, 18, 18]],
target_labels [min=[1, 1, 18, 18], opt=[1, 1, 18, 18], max=[1, 1, 18, 18]],
train_ltrb [min=[1, 4, 18, 18], opt=[1, 4, 18, 18], max=[1, 4, 18, 18]]}
]
[I] Building engine with configuration:
Flags | []
Engine Capability | EngineCapability.DEFAULT
Memory Pools | [WORKSPACE: 6143.69 MiB, TACTIC_DRAM: 6143.69 MiB]
Tactic Sources | [CUBLAS, CUBLAS_LT, CUDNN, EDGE_MASK_CONVOLUTIONS, JIT_CONVOLUTIONS]
Profiling Verbosity | ProfilingVerbosity.DETAILED
Preview Features | [FASTER_DYNAMIC_SHAPES_0805, DISABLE_EXTERNAL_TACTIC_SOURCES_FOR_CORE_0805]
[I] Finished engine building in 39.848 seconds
[I] trt-runner-N0-01/19/24-12:30:50
---- Inference Input(s) ----
{sample_x [dtype=float32, shape=(1, 1024, 18, 18)],
train_samples [dtype=float32, shape=(1, 1024, 18, 18)],
target_labels [dtype=float32, shape=(1, 1, 18, 18)],
train_ltrb [dtype=float32, shape=(1, 4, 18, 18)]}
[I] trt-runner-N0-01/19/24-12:30:50
---- Inference Output(s) ----
{bbreg_test_feat_enc [dtype=float32, shape=(1, 1, 256, 18, 18)],
bbreg_weights [dtype=float32, shape=(1, 256, 1, 1)],
target_scores [dtype=float32, shape=(1, 1, 18, 18)]}
[I] trt-runner-N0-01/19/24-12:30:50 | Completed 1 iteration(s) in 185.7 ms | Average inference time: 185.7 ms.
[I] onnxrt-runner-N0-01/19/24-12:30:50 | Activating and starting inference
[I] Creating ONNX-Runtime Inference Session with providers: ['CPUExecutionProvider']
[I] onnxrt-runner-N0-01/19/24-12:30:50
---- Inference Input(s) ----
{sample_x [dtype=float32, shape=(1, 1024, 18, 18)],
train_samples [dtype=float32, shape=(1, 1024, 18, 18)],
target_labels [dtype=float32, shape=(1, 1, 18, 18)],
train_ltrb [dtype=float32, shape=(1, 4, 18, 18)]}
[I] onnxrt-runner-N0-01/19/24-12:30:50
---- Inference Output(s) ----
{target_scores [dtype=float32, shape=(1, 1, 18, 18)],
bbreg_test_feat_enc [dtype=float32, shape=(1, 1, 256, 18, 18)],
bbreg_weights [dtype=float32, shape=(1, 256, 1, 1)]}
[I] onnxrt-runner-N0-01/19/24-12:30:50 | Completed 1 iteration(s) in 242.9 ms | Average inference time: 242.9 ms.
[I] Accuracy Comparison | trt-runner-N0-01/19/24-12:30:50 vs. onnxrt-runner-N0-01/19/24-12:30:50
[I] Comparing Output: 'bbreg_test_feat_enc' (dtype=float32, shape=(1, 1, 256, 18, 18)) with 'bbreg_test_feat_enc' (dtype=float32, shape=(1, 1, 256, 18, 18))
[I] Tolerance: [abs=1e-05, rel=1e-05] | Checking elemwise error
[I] trt-runner-N0-01/19/24-12:30:50: bbreg_test_feat_enc | Stats: mean=0.012191, std-dev=0.45428, var=0.20637, median=-0.011814, min=-3.197 at (0, 0, 92, 0, 17), max=5.5255 at (0, 0, 125, 14, 9), avg-magnitude=0.11202
[I] onnxrt-runner-N0-01/19/24-12:30:50: bbreg_test_feat_enc | Stats: mean=0.012191, std-dev=0.45428, var=0.20637, median=-0.011814, min=-3.197 at (0, 0, 92, 0, 17), max=5.5255 at (0, 0, 125, 14, 9), avg-magnitude=0.11202
[I] Error Metrics: bbreg_test_feat_enc
[I] Minimum Required Tolerance: elemwise error | [abs=2.861e-06] OR [rel=1.0297] (requirements may be lower if both abs/rel tolerances are set)
[I] Absolute Difference | Stats: mean=7.755e-08, std-dev=1.0897e-07, var=1.1875e-14, median=5.5879e-08, min=0 at (0, 0, 0, 0, 0), max=2.861e-06 at (0, 0, 125, 6, 1), avg-magnitude=7.755e-08
[I] Relative Difference | Stats: mean=2.8524e-05, std-dev=0.0041397, var=1.7137e-05, median=1.1497e-06, min=0 at (0, 0, 0, 0, 0), max=1.0297 at (0, 0, 163, 14, 12), avg-magnitude=2.8524e-05
[I] PASSED | Output: 'bbreg_test_feat_enc' | Difference is within tolerance (rel=1e-05, abs=1e-05)
[I] Comparing Output: 'bbreg_weights' (dtype=float32, shape=(1, 256, 1, 1)) with 'bbreg_weights' (dtype=float32, shape=(1, 256, 1, 1))
[I] Tolerance: [abs=1e-05, rel=1e-05] | Checking elemwise error
[I] trt-runner-N0-01/19/24-12:30:50: bbreg_weights | Stats: mean=0.016638, std-dev=0.27469, var=0.075457, median=-6.1898e-05, min=-0.029308 at (0, 75, 0, 0), max=4.4027 at (0, 232, 0, 0), avg-magnitude=0.019857
[I] onnxrt-runner-N0-01/19/24-12:30:50: bbreg_weights | Stats: mean=0.016638, std-dev=0.27469, var=0.075457, median=-6.1894e-05, min=-0.029308 at (0, 75, 0, 0), max=4.4027 at (0, 232, 0, 0), avg-magnitude=0.019857
[I] Error Metrics: bbreg_weights
[I] Minimum Required Tolerance: elemwise error | [abs=2.7765e-08] OR [rel=0.00084576] (requirements may be lower if both abs/rel tolerances are set)
[I] Absolute Difference | Stats: mean=1.0353e-08, std-dev=5.4175e-09, var=2.9349e-17, median=1.071e-08, min=0 at (0, 232, 0, 0), max=2.7765e-08 at (0, 58, 0, 0), avg-magnitude=1.0353e-08
[I] Relative Difference | Stats: mean=1.8881e-05, std-dev=5.9611e-05, var=3.5535e-09, median=5.0967e-06, min=0 at (0, 232, 0, 0), max=0.00084576 at (0, 228, 0, 0), avg-magnitude=1.8881e-05
[I] PASSED | Output: 'bbreg_weights' | Difference is within tolerance (rel=1e-05, abs=1e-05)
[I] Comparing Output: 'target_scores' (dtype=float32, shape=(1, 1, 18, 18)) with 'target_scores' (dtype=float32, shape=(1, 1, 18, 18))
[I] Tolerance: [abs=1e-05, rel=1e-05] | Checking elemwise error
[I] trt-runner-N0-01/19/24-12:30:50: target_scores | Stats: mean=0.016897, std-dev=0.016446, var=0.00027048, median=0.010701, min=0.00054304 at (0, 0, 13, 9), max=0.092922 at (0, 0, 2, 15), avg-magnitude=0.016897
[I] onnxrt-runner-N0-01/19/24-12:30:50: target_scores | Stats: mean=0.016897, std-dev=0.016446, var=0.00027048, median=0.010701, min=0.00054308 at (0, 0, 13, 9), max=0.092922 at (0, 0, 2, 15), avg-magnitude=0.016897
[I] Error Metrics: target_scores
[I] Minimum Required Tolerance: elemwise error | [abs=1.9372e-07] OR [rel=0.00015655] (requirements may be lower if both abs/rel tolerances are set)
[I] Absolute Difference | Stats: mean=6.085e-08, std-dev=3.7659e-08, var=1.4182e-15, median=5.9372e-08, min=0 at (0, 0, 4, 0), max=1.9372e-07 at (0, 0, 2, 13), avg-magnitude=6.085e-08
[I] Relative Difference | Stats: mean=9.9251e-06, std-dev=1.6575e-05, var=2.7473e-10, median=4.1957e-06, min=0 at (0, 0, 4, 0), max=0.00015655 at (0, 0, 17, 10), avg-magnitude=9.9251e-06
[I] PASSED | Output: 'target_scores' | Difference is within tolerance (rel=1e-05, abs=1e-05)
[I] PASSED | All outputs matched | Outputs: ['bbreg_test_feat_enc', 'bbreg_weights', 'target_scores']
[I] Accuracy Summary | trt-runner-N0-01/19/24-12:30:50 vs. onnxrt-runner-N0-01/19/24-12:30:50 | Passed: 1/1 iterations | Pass Rate: 100.0%
[I] PASSED | Runtime: 60.464s | Command: C:\Users\Tomas\AppData\Local\Programs\Python\Python310\Scripts\polygraphy run modified_latest3_sanitized2.onnx --trt --onnxrt

Accuracy seems to pass the tests, but when I insert the TensorRT model into my code, the tracking box is all over the place on the webcam feed.
I assume that this particular model needs to be extremely accurate to function...

When I run the model with onnxruntime inside my code, the tracking seems fine, but when I switch to running the engine, everything falls apart... If you want, I can record a video sample of how the model runs on PyTorch, on onnxruntime, and on TensorRT.

I ran polygraphy inspect model modified_latest3_sanitized2.onnx --show layers attrs weights and noticed that many, many layers use int64, BUT those layers have 0 tensors; all layers that have weights are float32. Can this be an accuracy issue?

Polygraphy inspect log file modified_latest_3_sanitized2_inspect.txt

@zerollzeng (Collaborator)

Accuracy seems to pass the tests, but when I insert the TensorRT model into my code, the tracking box is all over the place on the webcam feed.
I assume that this particular model needs to be extremely accurate to function...

Did you synchronize the CUDA stream after each inference? Or could there be a bug in the pre- or post-processing?
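
(For context: TensorRT's Python API runs inference asynchronously, so outputs must not be read before the stream is synchronized. A minimal sketch of where the sync belongs, assuming a pycuda-based loop with hypothetical buffer names:)

import pycuda.driver as cuda

# Hypothetical TensorRT 8.x inference step; `context`, `bindings`,
# `host_output`, `device_output`, and `stream` are assumed to be set up already.
context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)
cuda.memcpy_dtoh_async(host_output, device_output, stream)
stream.synchronize()  # without this, host_output may still hold stale data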

I ran polygraphy inspect model modified_latest3_sanitized2.onnx --show layers attrs weights and noticed that many, many layers use int64, BUT those layers have 0 tensors; all layers that have weights are float32. Can this be an accuracy issue?

int64 weights should be fine; as the warnings in the log above show, TensorRT casts them down to int32 during parsing.

@ninono12345 (Author)

@zerollzeng What do you mean by this question?

Did you synchronize the CUDA stream after each inference? Or could there be a bug in the pre- or post-processing?

@ninono12345 (Author)

I was using Polygraphy's inference. Everything was fixed when I transferred the tensors from GPU to CPU before inference!
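
(A minimal sketch of that fix, assuming the inputs are a dict of CUDA torch tensors named `inputs` and a Polygraphy TrtRunner is used for inference; the variable names are illustrative:)

from polygraphy.backend.trt import EngineFromNetwork, NetworkFromOnnxPath, TrtRunner

# Move CUDA tensors to host memory before handing them to the runner.
feed_dict = {name: t.detach().cpu().numpy() for name, t in inputs.items()}

with TrtRunner(EngineFromNetwork(NetworkFromOnnxPath("modified_latest3_sanitized2.onnx"))) as runner:
    outputs = runner.infer(feed_dict)  # numpy inputs avoid feeding stale GPU data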
