
[Build] TRT EP cannot be built without CUDA EP #18542

Open
gedoensmax opened this issue Nov 21, 2023 · 2 comments
Labels
build (build issues; typically submitted using template), ep:CUDA (issues related to the CUDA execution provider), ep:TensorRT (issues related to TensorRT execution provider), platform:windows (issues related to the Windows platform)

Comments

@gedoensmax
Contributor

Describe the issue

I am trying to reduce binary size by compiling only the TRT EP, for a shipment in which I know my models are fully eligible for TRT execution. For memory allocation, however, the TRT EP relies on the CUDA EP being part of the library:

#ifdef USE_TENSORRT
std::unique_ptr<IAllocator> CreateCUDAAllocator(int16_t device_id, const char* name) {
  return g_host->CreateCUDAAllocator(device_id, name);
}
std::unique_ptr<IAllocator> CreateCUDAPinnedAllocator(const char* name) {
  return g_host->CreateCUDAPinnedAllocator(name);
}
std::unique_ptr<IDataTransfer> CreateGPUDataTransfer() {
  return g_host->CreateGPUDataTransfer();
}
#endif

@chilo-ms for visibility.

Urgency

No response

Target platform

Windows

Build script

cmake -G Ninja ^
  -Donnxruntime_BUILD_UNIT_TESTS=ON ^
  -Donnxruntime_ENABLE_NVTX_PROFILE=ON ^
  -Donnxruntime_USE_CUDA=OFF ^
  -Donnxruntime_CUDA_HOME="C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.2" ^
  -Donnxruntime_USE_TENSORRT=ON ^
  -Donnxruntime_USE_DML=ON ^
  -Donnxruntime_CUDNN_HOME=C:\CUDNN\8.9.99.55 ^
  -Donnxruntime_TENSORRT_HOME=C:\TRT\8.6.1.6 ^
  -Donnxruntime_USE_TENSORRT_BUILTIN_PARSER=OFF ^
  -DCMAKE_INSTALL_PREFIX=C:\Users\admin\CLionProjects\sensei-on-device-sdk\external\onnxruntime\windows\gpu ^
  -Donnxruntime_BUILD_SHARED_LIB=ON ^
  -DCMAKE_CXX_COMPILER_LAUNCHER=ccache ^
  -DCMAKE_CUDA_COMPILER_LAUNCHER=ccache ^
  -Donnxruntime_USE_OPENVINO=ON ^
  -DCMAKE_C_COMPILER_LAUNCHER=ccache ^
  -Donnxruntime_NVCC_THREADS=1 ^
  -DONNX_USE_MSVC_STATIC_RUNTIME=ON ^
  -Dprotobuf_MSVC_STATIC_RUNTIME=OFF ^
  -Donnxruntime_USE_CUDA_NHWC_OPS:BOOL=ON

Error / output

C:\Users\admin\CLionProjects\onnxruntime\onnxruntime\core\providers\shared_library\provider_bridge_provider.cc(338): error C2039: 'CreateCUDAAllocator': is not a member of 'onnxruntime::ProviderHost'
C:\Users\admin\CLionProjects\onnxruntime\onnxruntime\core/providers/shared/common.h(5): note: see declaration of 'onnxruntime::ProviderHost'
C:\Users\admin\CLionProjects\onnxruntime\onnxruntime\core\providers\shared_library\provider_bridge_provider.cc(342): error C2039: 'CreateCUDAPinnedAllocator': is not a member of 'onnxruntime::ProviderHost'
C:\Users\admin\CLionProjects\onnxruntime\onnxruntime\core/providers/shared/common.h(5): note: see declaration of 'onnxruntime::ProviderHost'
C:\Users\admin\CLionProjects\onnxruntime\onnxruntime\core\providers\shared_library\provider_bridge_provider.cc(346): error C2039: 'CreateGPUDataTransfer': is not a member of 'onnxruntime::ProviderHost'
C:\Users\admin\CLionProjects\onnxruntime\onnxruntime\core/providers/shared/common.h(5): note: see declaration of 'onnxruntime::ProviderHost'

Visual Studio Version

No response

GCC / Compiler Version

No response

@gedoensmax gedoensmax added the build build issues; typically submitted using template label Nov 21, 2023
@github-actions github-actions bot added ep:CUDA issues related to the CUDA execution provider ep:TensorRT issues related to TensorRT execution provider platform:windows issues related to the Windows platform labels Nov 21, 2023
@jywu-msft
Member

This will be difficult to separate. By design, the TensorRT EP is intended to be an extension of, and fallback for, the CUDA EP; that is why the ORT GPU package includes both the CUDA and TensorRT EPs.
One thing you can try to reduce the size of the CUDA EP library is building with --disable_contrib_ops.
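A hedged sketch of how that suggestion might look with ORT's build.bat driver (paths are placeholders taken from the build script above; adapt the remaining flags to your configuration):

```shell
REM Illustrative only: ORT Windows build keeping the CUDA EP but dropping
REM contrib ops to shrink the library.
.\build.bat --config Release ^
  --build_shared_lib ^
  --use_tensorrt --tensorrt_home C:\TRT\8.6.1.6 ^
  --use_cuda --cuda_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2" ^
  --cudnn_home C:\CUDNN\8.9.99.55 ^
  --disable_contrib_ops
```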

@gedoensmax
Contributor Author

This would be a major issue, as it heavily increases the shipment size of TRT through ONNX Runtime. Especially with the new embedded TRT engines (#18217), a standalone TRT EP would be a great thing! It should be possible to split the CUDA allocations, and maybe stream management, into a separate library somehow, right?
