Describe the issue
I have multiple threads calling session.run on a single session. I recently switched to pinned memory and asynchronous memory copies, which works well. To do this, I use separate CUDA streams for the memory copies. However, session.run does not work with these streams: I can link one CUDA stream to a session via the options, but I want to use multiple CUDA streams, a different one for every run call. How can I achieve this? Or should I use multiple sessions instead? That would put multiple instances of the same model in memory, which doesn't seem great.
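To make the multiple-sessions alternative concrete, here is a minimal sketch of how its memory cost could be bounded: a small fixed-size pool of sessions, each bound to its own stream, shared by however many worker threads there are. Everything in it is an assumption for illustration. `SessionSlot` and `SessionPool` are hypothetical names, and `SessionSlot` only carries an id as a stand-in for a real session plus its CUDA stream, so the sketch compiles without ONNX Runtime or CUDA. It is not an ONNX Runtime API.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for "one inference session bound to one CUDA stream".
// Real code would hold the session and its stream; here it is only an id.
struct SessionSlot {
    int stream_id;
};

// Fixed-size pool: the number of sessions (and model copies in memory) is
// capped at the number of slots passed in, regardless of the thread count.
class SessionPool {
public:
    explicit SessionPool(std::vector<SessionSlot> slots)
        : free_(std::move(slots)) {}

    // Blocks until a slot is free, then hands it to the caller.
    SessionSlot Acquire() {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return !free_.empty(); });
        SessionSlot slot = free_.back();
        free_.pop_back();
        return slot;
    }

    // Returns a slot to the pool and wakes one waiting thread.
    void Release(SessionSlot slot) {
        {
            std::lock_guard<std::mutex> lock(mu_);
            free_.push_back(slot);
        }
        cv_.notify_one();
    }

    // Number of slots currently not in use.
    std::size_t Available() {
        std::lock_guard<std::mutex> lock(mu_);
        return free_.size();
    }

private:
    std::mutex mu_;
    std::condition_variable cv_;
    std::vector<SessionSlot> free_;
};
```

A worker thread would call `Acquire()`, do its asynchronous copies and its run call on that slot's stream, then call `Release()`. With, say, two slots, only two copies of the model exist even if eight threads submit work. Whether that trade-off is acceptable depends on model size and how much concurrency is needed.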
To reproduce
--
Urgency
No response
Platform
Windows
OS Version
10
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
16.2
ONNX Runtime API
C++
Architecture
X64
Execution Provider
CUDA
Execution Provider Library Version
No response
I have the same issue with multiple threads calling session.run on one session. When calling Ort::MemoryInfo memoryInfo = Ort::MemoryInfo::CreateCpu(OrtAllocatorType::OrtArenaAllocator, OrtMemType::OrtMemTypeDefault);
it throws an error like this: CUDA failure 700: an illegal memory access was encountered ; GPU=0 ; hostname=Windows11 ; file=D:\a\_work\1\s\onnxruntime\core\providers\cuda\gpu_data_transfer.cc ; line=63 ; expr=cudaMemcpyAsync(dst_data, src_data, bytes, cudaMemcpyHostToDevice, static_cast<cudaStream_t>(stream.GetHandle()));