
[Build] Windows 10 + CUDA 12.1 #15242

Closed
tanmayv25 opened this issue Mar 27, 2023 · 25 comments
Labels
build: build issues; typically submitted using template
ep:CUDA: issues related to the CUDA execution provider
ep:TensorRT: issues related to TensorRT execution provider
platform:windows: issues related to the Windows platform

Comments

@tanmayv25

Describe the issue

We are facing the following errors when trying to build onnxruntime on Windows with CUDA 12.1.

Build command:

 RUN build.bat --cmake_generator "Visual Studio 16 2019" --config Release --cmake_extra_defines "CMAKE_CUDA_ARCHITECTURES=52;60;61;70;75;80;86" --skip_submodule_sync --parallel --build_shared_lib --update --build --build_dir /workspace/build --use_cuda --cuda_version "12.1" --cuda_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1" --cudnn_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1" --use_tensorrt --tensorrt_home "/tensorrt"

Errors:

"C:\tmp\tritonbuild\onnxruntime\build\install.vcxproj" (default target) (1) ->
       "C:\tmp\tritonbuild\onnxruntime\build\ALL_BUILD.vcxproj" (default target) (3) ->
       "C:\tmp\tritonbuild\onnxruntime\build\triton-onnxruntime-backend.vcxproj" (default target) (12) ->
       "C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj" (default target) (13) ->
       (CustomBuild target) -> 
         C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include\cuda\std\detail/libcxx/include/type_traits(2101): error : "cuda" is ambiguous [C:\workspace\build\Release\onnxruntime_providers_cuda.vcxproj] [C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj]
         C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include\cuda\std\detail/libcxx/include/type_traits(2101): error : expected an expression [C:\workspace\build\Release\onnxruntime_providers_cuda.vcxproj] [C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj]
         C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include\cuda\std\detail/libcxx/include/type_traits(2101): error : "cuda" is ambiguous [C:\workspace\build\Release\onnxruntime_providers_cuda.vcxproj] [C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj]
         C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include\cuda\std\detail/libcxx/include/type_traits(2101): error : expected an expression [C:\workspace\build\Release\onnxruntime_providers_cuda.vcxproj] [C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj]
         C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include\cuda\std\detail/libcxx/include/type_traits(2101): error : "cuda" is ambiguous [C:\workspace\build\Release\onnxruntime_providers_cuda.vcxproj] [C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj]
         C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include\cuda\std\detail/libcxx/include/type_traits(2101): error : expected an expression [C:\workspace\build\Release\onnxruntime_providers_cuda.vcxproj] [C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj]
         C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include\cuda\std\detail/libcxx/include/type_traits(2101): error : "cuda" is ambiguous [C:\workspace\build\Release\onnxruntime_providers_cuda.vcxproj] [C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj]
         C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include\cuda\std\detail/libcxx/include/type_traits(2101): error : expected an expression [C:\workspace\build\Release\onnxruntime_providers_cuda.vcxproj] [C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj]

Similar namespacing errors in the Linux build with CUDA 12.1 were resolved by this commit.

Urgency

No response

Target platform

Windows 10

Build script

build.bat --cmake_generator "Visual Studio 16 2019" --config Release --cmake_extra_defines "CMAKE_CUDA_ARCHITECTURES=52;60;61;70;75;80;86" --skip_submodule_sync --parallel --build_shared_lib --update --build --build_dir /workspace/build --use_cuda --cuda_version "12.1" --cuda_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1" --cudnn_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1" --use_tensorrt --tensorrt_home "/tensorrt"

Error / output

"C:\tmp\tritonbuild\onnxruntime\build\install.vcxproj" (default target) (1) ->
"C:\tmp\tritonbuild\onnxruntime\build\ALL_BUILD.vcxproj" (default target) (3) ->
"C:\tmp\tritonbuild\onnxruntime\build\triton-onnxruntime-backend.vcxproj" (default target) (12) ->
"C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj" (default target) (13) ->
(CustomBuild target) ->
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include\cuda\std\detail/libcxx/include/type_traits(2101): error : "cuda" is ambiguous [C:\workspace\build\Release\onnxruntime_providers_cuda.vcxproj] [C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj]
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include\cuda\std\detail/libcxx/include/type_traits(2101): error : expected an expression [C:\workspace\build\Release\onnxruntime_providers_cuda.vcxproj] [C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj]
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include\cuda\std\detail/libcxx/include/type_traits(2101): error : "cuda" is ambiguous [C:\workspace\build\Release\onnxruntime_providers_cuda.vcxproj] [C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj]
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include\cuda\std\detail/libcxx/include/type_traits(2101): error : expected an expression [C:\workspace\build\Release\onnxruntime_providers_cuda.vcxproj] [C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj]
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include\cuda\std\detail/libcxx/include/type_traits(2101): error : "cuda" is ambiguous [C:\workspace\build\Release\onnxruntime_providers_cuda.vcxproj] [C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj]
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include\cuda\std\detail/libcxx/include/type_traits(2101): error : expected an expression [C:\workspace\build\Release\onnxruntime_providers_cuda.vcxproj] [C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj]
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include\cuda\std\detail/libcxx/include/type_traits(2101): error : "cuda" is ambiguous [C:\workspace\build\Release\onnxruntime_providers_cuda.vcxproj] [C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj]
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include\cuda\std\detail/libcxx/include/type_traits(2101): error : expected an expression [C:\workspace\build\Release\onnxruntime_providers_cuda.vcxproj] [C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj]
C:\BuildTools\MSBuild\Microsoft\VC\v160\BuildCustomizations\CUDA 12.1.targets(799,9): error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\bin\nvcc.exe" --use-local-env -ccbin "C:\BuildTools\VC\Tools\MSVC\14.29.30133\bin\HostX64\x64" -x cu -IC:\workspace\onnxruntime\include\onnxruntime -IC:\workspace\onnxruntime\include\onnxruntime\core\session -I"C:\workspace\build\Release_deps\pytorch_cpuinfo-src\include" -IC:\workspace\build\Release -IC:\workspace\onnxruntime\onnxruntime -I"C:\workspace\build\Release_deps\abseil_cpp-src" -I"C:\workspace\build\Release_deps\safeint-src" -I"C:\workspace\build\Release_deps\gsl-src\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include" -I"C:\workspace\build\Release_deps\onnx-src" -I"C:\workspace\build\Release_deps\onnx-build" -I"C:\workspace\build\Release_deps\protobuf-src\src" -I"C:\workspace\build\Release_deps\flatbuffers-src\include" -I"C:\workspace\build\Release_deps\cutlass-src\include" -I"C:\workspace\build\Release_deps\cutlass-src\examples" -I"C:\workspace\build\Release_deps\eigen-src" -I"C:\workspace\build\Release_deps\onnx_tensorrt-src" -I\TensorRT\include -I"C:\workspace\build\Release_deps\mp11-src\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include" --keep-dir x64\Release -maxrregcount=0 --machine 64 --compile -cudart shared --expt-relaxed-constexpr --Werror default-stream-launch -Xcudafe --diag_suppress=bad_friend_decl -Xcudafe --diag_suppress=unsigned_compare_with_zero -Xcudafe --diag_suppress=expr_has_no_effect --generate-code=arch=compute_52,code=[compute_52,sm_52] --generate-code=arch=compute_60,code=[compute_60,sm_60] --generate-code=arch=compute_61,code=[compute_61,sm_61] --generate-code=arch=compute_70,code=[compute_70,sm_70] --generate-code=arch=compute_75,code=[compute_75,sm_75] --generate-code=arch=compute_80,code=[compute_80,sm_80] --generate-code=arch=compute_86,code=[compute_86,sm_86] --diag-suppress 554 --threads 0 -std=c++17 -Werror all-warnings -Xcompiler="/EHsc -Ob2 /utf-8 /sdl /wd4251 /wd4201 /wd5054 /w15038 /guard:cf /wd4251 /wd4201 /wd5054 /w15038 /wd4834 /wd4127" -D_WINDOWS -DNDEBUG -DCPUINFO_SUPPORTED_PLATFORM=1 -DEIGEN_USE_THREADS -DPLATFORM_WINDOWS -DNOGDI -DNOMINMAX -D_USE_MATH_DEFINES -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -DUSE_CUDA=1 -DUSE_FLASH_ATTENTION=1 -DUSE_TENSORRT=1 -DONNX_NAMESPACE=onnx -DONNX_ML=1 -DWIN32_LEAN_AND_MEAN -DORT_ENABLE_STREAM -DEIGEN_MPL2_ONLY -DEIGEN_HAS_CONSTEXPR -DEIGEN_HAS_VARIADIC_TEMPLATES -DEIGEN_HAS_CXX11_MATH -DEIGEN_HAS_CXX11_ATOMIC -DEIGEN_STRONG_INLINE=inline -DWINAPI_FAMILY=100 -DWINVER=0x0601 -D_WIN32_WINNT=0x0601 -DNTDDI_VERSION=0x06010000 -D_SILENCE_EXPERIMENTAL_FILESYSTEM_DEPRECATION_WARNING=1 -D"CMAKE_INTDIR="Release"" -Donnxruntime_providers_cuda_EXPORTS -D_WINDLL -D_MBCS -DWIN32 -D_WINDOWS -DEIGEN_HAS_C99_MATH -DCPUINFO_SUPPORTED -DNDEBUG -DCPUINFO_SUPPORTED_PLATFORM=1 -DEIGEN_USE_THREADS -DPLATFORM_WINDOWS -DNOGDI -DNOMINMAX -D_USE_MATH_DEFINES -D_SILENCE_ALL_CXX17_DEPRECATION_WARNINGS -DUSE_CUDA=1 -DUSE_FLASH_ATTENTION=1 -DUSE_TENSORRT=1 -DONNX_NAMESPACE=onnx -DONNX_ML=1 -DWIN32_LEAN_AND_MEAN -DORT_ENABLE_STREAM -DEIGEN_MPL2_ONLY -DEIGEN_HAS_CONSTEXPR -DEIGEN_HAS_VARIADIC_TEMPLATES -DEIGEN_HAS_CXX11_MATH -DEIGEN_HAS_CXX11_ATOMIC -DEIGEN_STRONG_INLINE=inline -DWINAPI_FAMILY=100 -DWINVER=0x0601 -D_WIN32_WINNT=0x0601 -DNTDDI_VERSION=0x06010000 -D_SILENCE_EXPERIMENTAL_FILESYSTEM_DEPRECATION_WARNING=1 -D"CMAKE_INTDIR="Release"" -Donnxruntime_providers_cuda_EXPORTS -Xcompiler "/EHsc /W3 /nologo /O2 
/FS /MD /GR" -Xcompiler "/Fdonnxruntime_providers_cuda.dir\Release\vc142.pdb" -o onnxruntime_providers_cuda.dir\Release\attention_impl.obj "C:\workspace\onnxruntime\onnxruntime\contrib_ops\cuda\bert\attention_impl.cu"" exited with code -1. [C:\workspace\build\Release\onnxruntime_providers_cuda.vcxproj] [C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj]
C:\BuildTools\MSBuild\Microsoft\VC\v160\Microsoft.CppCommon.targets(241,5): error MSB8066: Custom build for 'C:\tmp\tritonbuild\onnxruntime\build\CMakeFiles\1391fbda87be57075fb5bba7a38c2954\onnxruntime.rule;C:\tmp\tritonbuild\onnxruntime\build\CMakeFiles\c0b7ec8ce4dc22ca22ac8622f7a49e15\ort_target.rule;C:\tmp\tritonbuild\onnxruntime\CMakeLists.txt' exited with code 1. [C:\tmp\tritonbuild\onnxruntime\build\ort_target.vcxproj]

Visual Studio Version

Visual Studio 16 2019

GCC / Compiler Version

MSVC 19.29.30147.0

@tanmayv25 tanmayv25 added the build build issues; typically submitted using template label Mar 27, 2023
@tanmayv25 tanmayv25 changed the title from [Build] Windows + CUDA 12.1 to [Build] Windows 10 + CUDA 12.1 Mar 27, 2023
@github-actions github-actions bot added ep:CUDA issues related to the CUDA execution provider ep:TensorRT issues related to TensorRT execution provider model:transformer issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc. platform:windows issues related to the Windows platform labels Mar 27, 2023
@tanmayv25
Author

@pranavsharma

@yuslepukhin
Member

We do not build with CUDA 12. Not yet anyway. Feel free to submit a PR with a fix if you want this earlier.

@tanmayv25
Author

Based upon yesterday's meeting with Pranav, I believe the plan was for Microsoft to fix the ORT build for CUDA 12 and possibly add a CI.
@pranavsharma for more clarification.

@pranavsharma pranavsharma removed the model:transformer issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc. label Mar 28, 2023
@junwang-wish

junwang-wish commented Mar 31, 2023

Is there a timeline for onnxruntime-gpu to support the CUDAExecutionProvider on CUDA 12?
This is my env:

Fri Mar 31 16:38:20 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:00:1E.0 Off |                    0 |
| N/A   61C    P0    37W /  70W |    168MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

I tried the nightly build:

pip3.8 install -U ort-nightly-gpu==1.15.0.dev20230330002 -i https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/pypi/simple/

and it reports the expected device and execution providers:

>>> import onnxruntime as ort
>>> ort.__version__, ort.get_device()
('1.15.0', 'GPU')
>>> ort.get_available_providers()
['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']

but I still get an error when actually creating an inference session:

>>> session = ort.InferenceSession(
...     '/workspaces/model.onnx',
...     providers=['CUDAExecutionProvider']
... )
2023-03-31 16:24:26.783840208 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:546 CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Please reference https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements to ensure all dependencies are met.
>>> session._providers
['CPUExecutionProvider']

Triton Inference Server, which also uses ONNX Runtime, seems to work well with CUDA 12.1; I wonder why onnxruntime-gpu cannot support it?
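
A minimal sketch of how to make this fallback loud instead of silent, using only the public onnxruntime Python API (the model path is the illustrative one from the snippet above):

```
import onnxruntime as ort

# Request CUDA first, with CPU as an explicit fallback.
session = ort.InferenceSession(
    '/workspaces/model.onnx',
    providers=['CUDAExecutionProvider', 'CPUExecutionProvider'],
)

# get_providers() lists the providers actually in use; if the CUDA EP
# failed to initialize, it will be missing here.
if 'CUDAExecutionProvider' not in session.get_providers():
    raise RuntimeError(
        'CUDAExecutionProvider was requested but not loaded; '
        'check CUDA/cuDNN versions against the ORT requirements.'
    )
```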

@yuslepukhin
Member

yuslepukhin commented Mar 31, 2023

2023-03-31 16:24:26.783840208 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:546 CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Please reference https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements to ensure all dependencies are met.

CUDA is notorious for changing, from version to version, the set of DLLs that need to be loaded and the places to look for them. Check your DLL search order and the DLLs' locations.
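
On Windows, for example, Python 3.8+ no longer resolves DLL dependencies from PATH by default, so one common approach is to register the CUDA and cuDNN bin directories explicitly before importing onnxruntime. A minimal sketch; the directory paths are assumptions for a default CUDA 12.x install and should be adjusted to the actual machine (on Linux, LD_LIBRARY_PATH plays the analogous role):

```
import os

# Assumed install locations; adjust to your machine.
dll_dirs = [
    r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1\bin",
    r"C:\Program Files\NVIDIA\CUDNN\bin",
]

for d in dll_dirs:
    if os.path.isdir(d):
        # Python 3.8+ on Windows: explicitly add to the DLL search path.
        os.add_dll_directory(d)

import onnxruntime as ort  # import only after registering the directories
print(ort.get_available_providers())
```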

@junwang-wish

junwang-wish commented Mar 31, 2023

Thanks @yuslepukhin. Is there a tutorial on how to check the DLL search order and the DLLs' locations? Does it mean I need to add a config to guide onnxruntime to find the proper DLLs, or do I need to copy the DLLs to the folders onnxruntime searches?

For example, copying all DLLs to a folder and manually adding it to PATH? #11826 (comment)

@yuslepukhin
Member

See #13658 first

@pranavsharma
Contributor

@junwang-wish ORT does not officially support CUDA 12 yet. In the case of Triton, they build ORT from source and have made it work with CUDA 12 on Linux (see this and this), but the Windows build still fails, which is why this issue was filed. Feel free to contribute fixes for the Windows build.

@junwang-wish

Thanks @pranavsharma. I just tried on Linux (Ubuntu 20.04.5 LTS), but the same error #15242 (comment) appears on Linux.

@pranavsharma
Contributor

We're not producing nightly builds for CUDA 12; you'll need to build it yourself. We only support CUDA 11.x for all nightlies and release packages. As I said, Triton is completely different; they don't use any of our packages.

@junwang-wish

Gotcha, thanks. I will downgrade to CUDA 11.x for now.

@pranavsharma
Contributor

Just spoke to the customer. Deadline for this is May 5th.

@snnn
Member

snnn commented Apr 10, 2023

I just checked again: our prebuilt binaries should work fine with any CUDA version from 11.0 to 11.8 (both inclusive). Though in the upcoming release we will build our code with CUDA 11.8, the binaries should continue to work fine with CUDA 11.6 or below.

After the upcoming release, we will consider upgrading CUDA version to 12.x.

@pranavsharma
Contributor

@snnn there is no expectation to release CUDA 12.x binaries in the upcoming release (1.15). The only ask is to ensure that ORT builds with CUDA 12.x.

tianleiwu added a commit that referenced this issue Apr 24, 2023
### Description
Fix the CUDA 12.1 Windows build error caused by an ambiguous `cuda` namespace. Use a new namespace for attention softmax.

Tested with VS 2019 and VS 2022 with the following settings:
- OS: Microsoft Windows 11 Enterprise (Version 10.0.22621 Build 22621)
- CUDA: cuda_12.1.0_531.14_windows
- TensorRT: TensorRT-8.6.0.12.Windows10.x86_64.cuda-12.0
- CUDNN: 8.8.1.3 for cuda 12
- Visual Studio Enterprise 2019, version 16.11.26 (MSVC v142) or
  Visual Studio Enterprise 2022 (64-bit), version 17.5.4
- Python: 3.10
- CMake: 3.25.2

VS 2019:
```
build.bat --cmake_generator "Visual Studio 16 2019" --config Release --cmake_extra_defines "CMAKE_CUDA_ARCHITECTURES=52;60;61;70;75;80;86" --skip_submodule_sync --parallel --build_shared_lib --update --build --build_dir .\build\trt --use_cuda --cuda_version "12.1" --cuda_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1" --cudnn_home "C:\CuDNN\8.8.1.3_cuda12" --use_tensorrt --tensorrt_home "C:\TensorRT-8.6.0.12.Windows10.x86_64.cuda-12.0\TensorRT-8.6.0.12"
```

VS 2022:
```
build.bat --cmake_generator "Visual Studio 17 2022" --config Release --cmake_extra_defines "CMAKE_CUDA_ARCHITECTURES=52;60;61;70;75;80;86" --skip_submodule_sync --parallel --build_shared_lib --update --build --build_dir .\build\trt_2022 --use_cuda --cuda_version "12.1" --cuda_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1" --cudnn_home "C:\CuDNN\8.8.1.3_cuda12" --use_tensorrt --tensorrt_home "C:\TensorRT-8.6.0.12.Windows10.x86_64.cuda-12.0\TensorRT-8.6.0.12"
```


### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->

#15242
@tianleiwu
Contributor

@tanmayv25, please try out the fix.

@tanmayv25
Author

@tianleiwu Thanks a lot for all the assistance. I can confirm that I was able to successfully build onnxruntime with CUDA 12.1.

@tanmayv25
Author

@tianleiwu @pranavsharma Can you share when the fix will be available in a release branch?

@pranavsharma
Contributor

23rd May. That's the tentative release date.

@junwang-wish

@pranavsharma so I can use onnxruntime on any machine with CUDA 12.1, tentatively from 23rd May 2023?

@pranavsharma
Contributor

@pranavsharma so I can use onnxruntime on any machine with CUDA 12.1, tentatively from 23rd May 2023?

Yes.

@snnn
Member

snnn commented May 4, 2023

I have a PR that is related to this: #15781. In that PR I updated one of our docker files to use CUDA 12.1, and it works fine, so I believe our code is compatible with CUDA 12.1. However, the upcoming ORT release, 1.15, will use CUDA 11.8 for all prebuilt packages. If you'd like to use CUDA 12.1 instead, please build it from source; you can try that now. We have created the rel-1.15.0 release branch, and though the release tag is not created yet, most things are already in place.

ShukantPal pushed a commit to ShukantPal/onnxruntime that referenced this issue May 7, 2023
(Same commit message as above.)
@vadimkantorov

vadimkantorov commented Sep 1, 2023

@snnn By chance, are nightlies now built against CUDA 12 (it's been some time now)?

@snnn
Member

snnn commented Sep 1, 2023

No. They are not.

@j4ro

j4ro commented Sep 18, 2023

@snnn Is it planned to use CUDA 12 in the next release, and is there any estimate of when we can expect the next release? The upcoming-release roadmap is outdated.

@snnn
Member

snnn commented Sep 18, 2023

No, the next release's official packages will still be built with CUDA 11.x. However, you can build it from source with CUDA 12.x.
