
[Build] TRT EP cannot be built without CUDA EP #18542

Open
gedoensmax opened this issue Nov 21, 2023 · 2 comments
Labels
build (build issues; typically submitted using template), ep:CUDA (issues related to the CUDA execution provider), ep:TensorRT (issues related to TensorRT execution provider), platform:windows (issues related to the Windows platform)

Comments

@gedoensmax
Contributor

Describe the issue

I am trying to reduce binary size by compiling only the TRT EP, for a shipment in which I know my models are fully eligible for TRT execution. For memory allocation, however, the TRT EP relies on the CUDA EP being part of the library:

#ifdef USE_TENSORRT
std::unique_ptr<IAllocator> CreateCUDAAllocator(int16_t device_id, const char* name) {
  return g_host->CreateCUDAAllocator(device_id, name);
}
std::unique_ptr<IAllocator> CreateCUDAPinnedAllocator(const char* name) {
  return g_host->CreateCUDAPinnedAllocator(name);
}
std::unique_ptr<IDataTransfer> CreateGPUDataTransfer() {
  return g_host->CreateGPUDataTransfer();
}
#endif

@chilo-ms for visibility.

Urgency

No response

Target platform

Windows

Build script

cmake -G Ninja ^
  -Donnxruntime_BUILD_UNIT_TESTS=ON ^
  -Donnxruntime_ENABLE_NVTX_PROFILE=ON ^
  -Donnxruntime_USE_CUDA=OFF ^
  -Donnxruntime_CUDA_HOME="C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.2" ^
  -Donnxruntime_USE_TENSORRT=ON ^
  -Donnxruntime_USE_DML=ON ^
  -Donnxruntime_CUDNN_HOME=C:\CUDNN\8.9.99.55 ^
  -Donnxruntime_TENSORRT_HOME=C:\TRT\8.6.1.6 ^
  -Donnxruntime_USE_TENSORRT_BUILTIN_PARSER=OFF ^
  -DCMAKE_INSTALL_PREFIX=C:\Users\admin\CLionProjects\sensei-on-device-sdk\external\onnxruntime\windows\gpu ^
  -Donnxruntime_BUILD_SHARED_LIB=ON ^
  -DCMAKE_CXX_COMPILER_LAUNCHER=ccache ^
  -DCMAKE_CUDA_COMPILER_LAUNCHER=ccache ^
  -Donnxruntime_USE_OPENVINO=ON ^
  -DCMAKE_C_COMPILER_LAUNCHER=ccache ^
  -Donnxruntime_NVCC_THREADS=1 ^
  -DONNX_USE_MSVC_STATIC_RUNTIME=ON ^
  -Dprotobuf_MSVC_STATIC_RUNTIME=OFF ^
  -Donnxruntime_USE_CUDA_NHWC_OPS:BOOL=ON

Error / output

C:\Users\admin\CLionProjects\onnxruntime\onnxruntime\core\providers\shared_library\provider_bridge_provider.cc(338): error C2039: 'CreateCUDAAllocator': is not a member of 'onnxruntime::ProviderHost'
C:\Users\admin\CLionProjects\onnxruntime\onnxruntime\core/providers/shared/common.h(5): note: see declaration of 'onnxruntime::ProviderHost'
C:\Users\admin\CLionProjects\onnxruntime\onnxruntime\core\providers\shared_library\provider_bridge_provider.cc(342): error C2039: 'CreateCUDAPinnedAllocator': is not a member of 'onnxruntime::ProviderHost'
C:\Users\admin\CLionProjects\onnxruntime\onnxruntime\core/providers/shared/common.h(5): note: see declaration of 'onnxruntime::ProviderHost'
C:\Users\admin\CLionProjects\onnxruntime\onnxruntime\core\providers\shared_library\provider_bridge_provider.cc(346): error C2039: 'CreateGPUDataTransfer': is not a member of 'onnxruntime::ProviderHost'
C:\Users\admin\CLionProjects\onnxruntime\onnxruntime\core/providers/shared/common.h(5): note: see declaration of 'onnxruntime::ProviderHost'

Visual Studio Version

No response

GCC / Compiler Version

No response

@gedoensmax gedoensmax added the build build issues; typically submitted using template label Nov 21, 2023
@github-actions github-actions bot added ep:CUDA issues related to the CUDA execution provider ep:TensorRT issues related to TensorRT execution provider platform:windows issues related to the Windows platform labels Nov 21, 2023
@jywu-msft
Member

This will be difficult to separate. By design, the TensorRT EP is intended to be an extension of, and fallback for, the CUDA EP; that is why the ORT GPU package includes both the CUDA and TensorRT EPs.
One thing you can try to reduce the size of the CUDA EP library is building with --disable_contrib_ops.
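A hedged sketch of how that suggestion might look with ORT's build.bat driver (paths are placeholders taken from the build script above; adapt the remaining flags to your configuration):

```shell
REM Illustrative only: ORT Windows build keeping the CUDA EP but dropping
REM contrib ops to shrink the library.
.\build.bat --config Release ^
  --build_shared_lib ^
  --use_tensorrt --tensorrt_home C:\TRT\8.6.1.6 ^
  --use_cuda --cuda_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.2" ^
  --cudnn_home C:\CUDNN\8.9.99.55 ^
  --disable_contrib_ops
```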

@gedoensmax
Contributor Author

This would be a major issue, as it heavily increases the shipment size of TRT through ONNX Runtime. Especially with the new embedded TRT engines (#18217), a standalone TRT EP would be a great thing! It should be possible to split the CUDA allocations, and maybe stream management, into a separate library somehow, right?
