ONNX Runtime error 6 when using ORT 1.16.1 with TRT and CUDA EP #18065
Labels
ep:CUDA (issues related to the CUDA execution provider)
ep:TensorRT (issues related to TensorRT execution provider)
Describe the issue
When running inferences through Triton's ORT backend built with ORT 1.16.1 and the CUDA and TensorRT EPs, we have run into the issue below:
Note that the ORT backend works fine when using ORT 1.16.0.
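For reference, the same EP combination can be exercised outside Triton with a minimal standalone session. This is only a sketch under assumptions: it uses the Python onnxruntime-gpu 1.16.1 package rather than the C++ backend, and the model path, provider options, and input handling are placeholders rather than the backend's actual configuration.

```python
import numpy as np
import onnxruntime as ort

# Register TensorRT first and CUDA as fallback, mirroring the EP priority
# used when both accelerators are enabled. Options are placeholders.
providers = [
    ("TensorrtExecutionProvider", {"device_id": 0}),
    ("CUDAExecutionProvider", {"device_id": 0}),
]

# "model.onnx" is a placeholder for the model served by Triton.
sess = ort.InferenceSession("model.onnx", providers=providers)

# Feed random data shaped like the first input; dynamic dims default to 1.
meta = sess.get_inputs()[0]
shape = [d if isinstance(d, int) else 1 for d in meta.shape]
outputs = sess.run(None, {meta.name: np.random.rand(*shape).astype(np.float32)})
print([o.shape for o in outputs])
```

If this sketch also fails on 1.16.1 but passes on 1.16.0, the regression is likely in ORT itself rather than in the Triton backend.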
To reproduce
1. Run an inference using the image_client example and the vulture.jpg image (a rough client-side sketch is given after these steps).
2. Observe the error.
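For reference, the request that image_client sends can be approximated with the Python tritonclient API. This is only a sketch: the server URL, model name, tensor names, and shape follow the standard Triton image-classification example (densenet_onnx) and are assumptions, since the deployed model is not shown in this report.

```python
import numpy as np
import tritonclient.http as httpclient

# Server URL and model/tensor names are assumptions based on the standard
# Triton image-classification example (densenet_onnx); adjust to the model
# actually deployed.
client = httpclient.InferenceServerClient(url="localhost:8000")

# Random data stands in for the decoded vulture.jpg; image_client would
# normally decode and scale the JPEG before sending it.
image = np.random.rand(3, 224, 224).astype(np.float32)

infer_input = httpclient.InferInput("data_0", list(image.shape), "FP32")
infer_input.set_data_from_numpy(image)
infer_output = httpclient.InferRequestedOutput("fc6_1", class_count=3)

response = client.infer(
    model_name="densenet_onnx",
    inputs=[infer_input],
    outputs=[infer_output],
)
print(response.as_numpy("fc6_1"))
```

Random input is enough to exercise the CUDA/TensorRT EPs on the server side; the real image_client additionally decodes and preprocesses the JPEG before sending it.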
Urgency
No response
Platform
Linux
OS Version
Ubuntu 22.04
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
1.16.1
ONNX Runtime API
C++
Architecture
X64
Execution Provider
CUDA, TensorRT
Execution Provider Library Version
CUDA 12.2