[Build] Windows 10 + CUDA 12.1 #15242
Comments
We do not build with CUDA 12. Not yet anyway. Feel free to submit a PR with a fix if you want this earlier. |
Based upon yesterday's meeting with Pranav, I believe the plan was for Microsoft to fix ORT build for cuda 12 and possibly have a CI. |
Is there a timeline for supporting the onnxruntime-gpu CUDAExecutionProvider on CUDA 12? I tried the nightly build and can see the correct devices and execution providers, but I still get an error when actually running inference. Triton Inference Server, which also uses ONNX Runtime, seems to work well with CUDA 12.1; I wonder why onnxruntime-gpu cannot support it? |
CUDA is notorious for changing the set of DLLs that needs to load and the places where to look for them version by version. Check your DLL search order and their location. |
Thanks @yuslepukhin, is there a tutorial on how to do this? For example, copy all DLLs to a folder and manually add it to PATH? #11826 (comment) |
See #13658 first |
@junwang-wish ORT does not officially support CUDA 12 yet. In the case of Triton, they build ORT from source and made it work with CUDA 12 on Linux (see this and this), but the windows build still fails which is why this issue was filed. Feel free to contribute fixes for the Windows build. |
Thanks @pranavsharma , I just tried on Linux Ubuntu 20.04.5 LTS, but the same error #15242 (comment) appears on Linux. |
We're not producing nightly builds for CUDA 12. You'll need to build it yourself. We only support CUDA 11.x for all nightlies and release packages. As I said, Triton is completely different; they don't use any of our packages. |
Gotcha thanks, I will downgrade to CUDA 11.x for now. |
Just spoke to the customer. Deadline for this is May 5th. |
I just checked again: our prebuilt binaries should work fine with any CUDA version from 11.0 to 11.8 (both inclusive). Though in the upcoming release we will build our code with CUDA 11.8, the binaries should continue to work fine with CUDA 11.6 or below. After the upcoming release, we will consider upgrading the CUDA version to 12.x. |
@snnn there is no expectation to release CUDA 12.x binaries in the upcoming release (1.15). The only ask is to ensure that ORT builds with CUDA 12.x. |
### Description

Fix the CUDA 12.1 Windows build error caused by the ambiguous `cuda` namespace: use a new namespace for attention softmax. Tested with VS 2019 and VS 2022 with the following settings:

- OS: Microsoft Windows 11 Enterprise (Version 10.0.22621 Build 22621)
- CUDA: cuda_12.1.0_531.14_windows
- TensorRT: TensorRT-8.6.0.12.Windows10.x86_64.cuda-12.0
- cuDNN: 8.8.1.3 for CUDA 12
- Visual Studio Enterprise 2019, version 16.11.26 (MSVC v142) or Visual Studio Enterprise 2022 (64-bit), version 17.5.4
- Python: 3.10
- CMake: 3.25.2

VS 2019:

```
build.bat --cmake_generator "Visual Studio 16 2019" --config Release --cmake_extra_defines "CMAKE_CUDA_ARCHITECTURES=52;60;61;70;75;80;86" --skip_submodule_sync --parallel --build_shared_lib --update --build --build_dir .\build\trt --use_cuda --cuda_version "12.1" --cuda_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1" --cudnn_home "C:\CuDNN\8.8.1.3_cuda12" --use_tensorrt --tensorrt_home "C:\TensorRT-8.6.0.12.Windows10.x86_64.cuda-12.0\TensorRT-8.6.0.12"
```

VS 2022:

```
build.bat --cmake_generator "Visual Studio 17 2022" --config Release --cmake_extra_defines "CMAKE_CUDA_ARCHITECTURES=52;60;61;70;75;80;86" --skip_submodule_sync --parallel --build_shared_lib --update --build --build_dir .\build\trt_2022 --use_cuda --cuda_version "12.1" --cuda_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1" --cudnn_home "C:\CuDNN\8.8.1.3_cuda12" --use_tensorrt --tensorrt_home "C:\TensorRT-8.6.0.12.Windows10.x86_64.cuda-12.0\TensorRT-8.6.0.12"
```

### Motivation and Context

#15242
@tanmayv25, please try out the fix. |
@tianleiwu Thanks a lot for all the assistance. I can confirm that I was able to successfully build onnxruntime with cuda 12.1. |
@tianleiwu @pranavsharma Can you share when the fix will be available in a release branch? |
23rd May. That's the tentative release date. |
@pranavsharma so I can use onnxruntime on any machines with CUDA12.1 tentatively on 23rd May 2023? |
Yes. |
I have a PR that is related to this: #15781. In the PR I updated one of our docker files to use CUDA 12.1, and it works fine, so I believe our code is compatible with CUDA 12.1. However, the upcoming ORT release, 1.15, will use CUDA 11.8 for all prebuilt packages. If you'd like to use CUDA 12.1 instead, please build from source; you can try it now. We have created the rel-1.15.0 release branch. Though the release tag is not created yet, most things are already in place. |
@snnn By chance, are nightlies now built against CUDA 12 (it's been some time now)? |
No. They are not. |
@snnn Is it planned to use CUDA 12 in the next release, and is there any estimate for when we can expect the next release? The upcoming-release roadmap is outdated. |
No, the next release's official packages will still be built with CUDA 11.x. However, you can build from source with CUDA 12.x. |
Describe the issue
We are facing the following errors when trying to build onnxruntime on Windows with CUDA 12.1.
The build command and errors are reproduced under "Build script" and "Error / output" below.
Similar namespacing errors on the Linux build with CUDA 12.1 were resolved by this commit.
Urgency
No response
Target platform
Windows 10
Build script
```
build.bat --cmake_generator "Visual Studio 16 2019" --config Release --cmake_extra_defines "CMAKE_CUDA_ARCHITECTURES=52;60;61;70;75;80;86" --skip_submodule_sync --parallel --build_shared_lib --update --build --build_dir /workspace/build --use_cuda --cuda_version "12.1" --cuda_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1" --cudnn_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1" --use_tensorrt --tensorrt_home "/tensorrt"
```
Error / output
```
C:\tmp\tritonbuild\onnxruntime\build\install.vcxproj" (default target) (1) ->
"C:\tmp\tritonbuild\onnxruntime\build\ALL_BUILD.vcxproj" (default target) (3) ->
"C:\tmp\tritonbuild\onnxruntime\build\triton-onnxruntime-backend.vcxproj" (default target) (12) ->
"C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj" (default target) (13) ->
(CustomBuild target) ->
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include\cuda\std\detail/libcxx/include/type_traits(2101): error : "cuda" is ambiguous [C:\workspace\build\Release\onnxruntime_providers_cuda.vcxproj] [C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj]
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include\cuda\std\detail/libcxx/include/type_traits(2101): error : expected an expression [C:\workspace\build\Release\onnxruntime_providers_cuda.vcxproj] [C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj]
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include\cuda\std\detail/libcxx/include/type_traits(2101): error : "cuda" is ambiguous [C:\workspace\build\Release\onnxruntime_providers_cuda.vcxproj] [C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj]
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include\cuda\std\detail/libcxx/include/type_traits(2101): error : expected an expression [C:\workspace\build\Release\onnxruntime_providers_cuda.vcxproj] [C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj]
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include\cuda\std\detail/libcxx/include/type_traits(2101): error : "cuda" is ambiguous [C:\workspace\build\Release\onnxruntime_providers_cuda.vcxproj] [C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj]
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include\cuda\std\detail/libcxx/include/type_traits(2101): error : expected an expression [C:\workspace\build\Release\onnxruntime_providers_cuda.vcxproj] [C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj]
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include\cuda\std\detail/libcxx/include/type_traits(2101): error : "cuda" is ambiguous [C:\workspace\build\Release\onnxruntime_providers_cuda.vcxproj] [C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj]
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include\cuda\std\detail/libcxx/include/type_traits(2101): error : expected an expression [C:\workspace\build\Release\onnxruntime_providers_cuda.vcxproj] [C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj]
C:\BuildTools\MSBuild\Microsoft\VC\v160\BuildCustomizations\CUDA 12.1.targets(799,9): error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\bin\nvcc.exe" --use-local-env -ccbin "C:\BuildTools\VC\Tools\MSVC\14.29.30133\bin\HostX64\x64" -x cu -IC:\workspace\onnxruntime\include\onnxruntime -IC:\workspace\onnxruntime\include\onnxruntime\core\session -I"C:\workspace\build\Release_deps\pytorch_cpuinfo-src\include" -IC:\workspace\build\Release -IC:\workspace\onnxruntime\onnxruntime -I"C:\workspace\build\Release_deps\abseil_cpp-src" -I"C:\workspace\build\Release_deps\safeint-src" -I"C:\workspace\build\Release_deps\gsl-src\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include" -I"C:\workspace\build\Release_deps\onnx-src" -I"C:\workspace\build\Release_deps\onnx-build" -I"C:\workspace\build\Release_deps\protobuf-src\src" -I"C:\workspace\build\Release_deps\flatbuffers-src\include" -I"C:\workspace\build\Release_deps\cutlass-src\include" -I"C:\workspace\build\Release_deps\cutlass-src\examples" -I"C:\workspace\build\Release_deps\eigen-src" -I"C:\workspace\build\Release_deps\onnx_tensorrt-src" -I\TensorRT\include -I"C:\workspace\build\Release_deps\mp11-src\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include" --keep-dir x64\Release -maxrregcount=0 --machine 64 --compile -cudart shared --expt-relaxed-constexpr --Werror default-stream-launch -Xcudafe --diag_suppress=bad_friend_decl -Xcudafe --diag_suppress=unsigned_compare_with_zero -Xcudafe --diag_suppress=expr_has_no_effect --generate-code=arch=compute_52,code=[compute_52,sm_52] --generate-code=arch=compute_60,code=[compute_60,sm_60] --generate-code=arch=compute_61,code=[compute_61,sm_61] --generate-code=arch=compute_70,code=[compute_70,sm_70] --generate-code=arch=compute_75,code=[compute_75,sm_75] --generate-code=arch=compute_80,code=[compute_80,sm_80] --generate-code=arch=compute_86,code=[compute_86,sm_86] --diag-suppress 554 --threads 0 -std=c++17 -Werror all-warnings -Xcompiler="/EHsc -Ob2 /utf-8 /sdl /wd4251 /wd4201 /wd5054 /w15038 /guard:cf /wd4251 /wd4201 /wd5054 /w15038 /wd4834 /wd4127" -D_WINDOWS -DNDEBUG -DCPUINFO_SUPPORTED_PLATFORM=1 -DEIGEN_USE_THREADS -DPLATFORM_WINDOWS -DNOGDI -DNOMINMAX -D_USE_MATH_DEFINES -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -DUSE_CUDA=1 -DUSE_FLASH_ATTENTION=1 -DUSE_TENSORRT=1 -DONNX_NAMESPACE=onnx -DONNX_ML=1 -DWIN32_LEAN_AND_MEAN -DORT_ENABLE_STREAM -DEIGEN_MPL2_ONLY -DEIGEN_HAS_CONSTEXPR -DEIGEN_HAS_VARIADIC_TEMPLATES -DEIGEN_HAS_CXX11_MATH -DEIGEN_HAS_CXX11_ATOMIC -DEIGEN_STRONG_INLINE=inline -DWINAPI_FAMILY=100 -DWINVER=0x0601 -D_WIN32_WINNT=0x0601 -DNTDDI_VERSION=0x06010000 -D_SILENCE_EXPERIMENTAL_FILESYSTEM_DEPRECATION_WARNING=1 -D"CMAKE_INTDIR="Release"" -Donnxruntime_providers_cuda_EXPORTS -D_WINDLL -D_MBCS -DWIN32 -D_WINDOWS -DEIGEN_HAS_C99_MATH -DCPUINFO_SUPPORTED -DNDEBUG -DCPUINFO_SUPPORTED_PLATFORM=1 -DEIGEN_USE_THREADS -DPLATFORM_WINDOWS -DNOGDI -DNOMINMAX -D_USE_MATH_DEFINES -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -DUSE_CUDA=1 -DUSE_FLASH_ATTENTION=1 -DUSE_TENSORRT=1 -DONNX_NAMESPACE=onnx -DONNX_ML=1 -DWIN32_LEAN_AND_MEAN -DORT_ENABLE_STREAM -DEIGEN_MPL2_ONLY -DEIGEN_HAS_CONSTEXPR -DEIGEN_HAS_VARIADIC_TEMPLATES -DEIGEN_HAS_CXX11_MATH -DEIGEN_HAS_CXX11_ATOMIC -DEIGEN_STRONG_INLINE=inline -DWINAPI_FAMILY=100 -DWINVER=0x0601 -D_WIN32_WINNT=0x0601 -DNTDDI_VERSION=0x06010000 -D_SILENCE_EXPERIMENTAL_FILESYSTEM_DEPRECATION_WARNING=1 -D"CMAKE_INTDIR="Release"" -Donnxruntime_providers_cuda_EXPORTS -Xcompiler "/EHsc /W3 /nologo /O2 /FS /MD /GR" -Xcompiler "/Fdonnxruntime_providers_cuda.dir\Release\vc142.pdb" -o onnxruntime_providers_cuda.dir\Release\attention_impl.obj "C:\workspace\onnxruntime\onnxruntime\contrib_ops\cuda\bert\attention_impl.cu"" exited with code -1. [C:\workspace\build\Release\onnxruntime_providers_cuda.vcxproj] [C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj]
C:\BuildTools\MSBuild\Microsoft\VC\v160\Microsoft.CppCommon.targets(241,5): error MSB8066: Custom build for 'C:\tmp\tritonbuild\onnxruntime\build\CMakeFiles\1391fbda87be57075fb5bba7a38c2954\onnxruntime.rule;C:\tmp\tritonbuild\onnxruntime\build\CMakeFiles\c0b7ec8ce4dc22ca22ac8622f7a49e15\ort_target.rule;C:\tmp\tritonbuild\onnxruntime\CMakeLists.txt' exited with code 1. [C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj]
```
Visual Studio Version
Visual Studio 16 2019
GCC / Compiler Version
MSVC 19.29.30147.0