
multiple tests fail on Windows due to ORT_ENABLE_STREAM define logic error #20180

Open
diablodale opened this issue Apr 2, 2024 · 9 comments
Labels
build build issues; typically submitted using template ep:CUDA issues related to the CUDA execution provider ep:TensorRT issues related to TensorRT execution provider platform:windows issues related to the Windows platform

Comments

@diablodale
Contributor

Describe the issue

On v1.17.1 on Windows, building and testing with build.bat, multiple tests fail with:

Exception during initialization: C:\repos-nobackup\onnxruntime\onnxruntime\core\framework\allocator_utils.cc:61 onnxruntime::CreateAllocator StreamAwareArena should be transparent to minimal build.

That error is a hard-coded ORT_THROW at line 61:

#ifdef ORT_ENABLE_STREAM
  return AllocatorPtr(
      std::make_unique<StreamAwareArena>(std::move(device_allocator),
                                         max_mem,
                                         info.enable_cross_stream_reusing,
                                         arena_extend_str,
                                         initial_chunk_size_bytes,
                                         max_dead_bytes_per_chunk,
                                         initial_growth_chunk_size_bytes));
#else
  ORT_THROW("StreamAwareArena should be transparent to minimal build.");
#endif

Is the above code correct? When ORT_ENABLE_STREAM is not defined (as with my build.bat options), shouldn't the code calling CreateAllocator(info) avoid setting use_stream_aware_arena = true?

Or is it a logic error in setting up the cmake defines, which cascade into compiler definitions?

# Enable stream for all the non-minimal build, except for DML. There's currently a bug
# in the allocation planner when reusing buffers and more than one streams are used that
# make it possible (although rarely) to reach a reference count of 0 for a buffer that is
# still being used. Since DML doesn't benefit from multiple streams, disabling it is the
# safest option for now.
# https://github.com/microsoft/onnxruntime/issues/19480
if (NOT onnxruntime_MINIMAL_BUILD AND NOT onnxruntime_USE_DML)
add_compile_definitions(ORT_ENABLE_STREAM)
endif()

Urgency

No response

Target platform

Windows

Build script

.\build.bat --update --build --test ^
--use_cuda --cuda_home "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8" --cuda_version 11.8 --cudnn_home "C:\repos-nobackup\cudnn-windows-x86_64-8.9.3.28_cuda11-archive" --use_tensorrt --use_tensorrt_builtin_parser --tensorrt_home "C:\repos-nobackup\TensorRT-8.6.1.6" ^
--cmake_generator "Visual Studio 17 2022" --config Release --build_shared_lib --parallel --enable_lto --use_dml --cmake_extra_defines CMAKE_INSTALL_PREFIX=C:/repos-nobackup/onnxruntime/.install/Release onnxruntime_USE_AVX=ON

Error / output

Several of the standard ORT tests fail. Here are a few:

1: [ RUN      ] InferenceSessionTests.CheckRunProfilerWithSessionOptions
1: 2024-04-02 23:43:24.7854084 [E:onnxruntime:CheckRunProfiler, inference_session.cc:1981 onnxruntime::InferenceSession::Initialize::<lambda_43e79d08ab5e11029a762dd34e0e0eed>::operator ()] Exception during initialization: C:\repos-nobackup\onnxruntime\onnxruntime\core\framework\allocator_utils.cc:61 onnxruntime::CreateAllocator StreamAwareArena should be transparent to minimal build.
1: 
1: C:\repos-nobackup\onnxruntime\onnxruntime\test\framework\inference_session_test.cc(644): error: Value of: _tmp_status.IsOK()
1:   Actual: false
1: Expected: true
1: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: C:\repos-nobackup\onnxruntime\onnxruntime\core\framework\allocator_utils.cc:61 onnxruntime::CreateAllocator StreamAwareArena should be transparent to minimal build.
1:
1: Stack trace:
1:   00007FF792294D25: testing::TestInfo::Run
1:   00007FF7922953C8: testing::TestSuite::Run
1:   00007FF7922A209A: testing::internal::UnitTestImpl::RunAllTests
1:   00007FF7922A7199: testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl,bool>
1: ... Google Test internal frames ...
1:
1: [  FAILED  ] InferenceSessionTests.CheckRunProfilerWithSessionOptions (3 ms)
1: [ RUN      ] InferenceSessionTests.CheckRunProfilerWithSessionOptions2
1: 2024-04-02 23:43:24.7939507 [E:onnxruntime:CheckRunProfiler, inference_session.cc:1981 onnxruntime::InferenceSession::Initialize::<lambda_43e79d08ab5e11029a762dd34e0e0eed>::operator ()] Exception during initialization: C:\repos-nobackup\onnxruntime\onnxruntime\core\framework\allocator_utils.cc:61 onnxruntime::CreateAllocator StreamAwareArena should be transparent to minimal build.
1: 
1: C:\repos-nobackup\onnxruntime\onnxruntime\test\framework\inference_session_test.cc(698): error: Value of: _tmp_status.IsOK()
1:   Actual: false
1: Expected: true
1: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: C:\repos-nobackup\onnxruntime\onnxruntime\core\framework\allocator_utils.cc:61 onnxruntime::CreateAllocator StreamAwareArena should be transparent to minimal build.
1:
1: Stack trace:
1:   00007FF792294D25: testing::TestInfo::Run
1:   00007FF7922953C8: testing::TestSuite::Run
1:   00007FF7922A209A: testing::internal::UnitTestImpl::RunAllTests
1:   00007FF7922A7199: testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl,bool>
1: ... Google Test internal frames ...
1:
1: [  FAILED  ] InferenceSessionTests.CheckRunProfilerWithSessionOptions2 (8 ms)
1: [ RUN      ] InferenceSessionTests.TestBindCuda
1: 2024-04-02 23:43:24.8419108 [E:onnxruntime:InferenceSessionTests.TestBindCuda, inference_session.cc:1981 onnxruntime::InferenceSession::Initialize::<lambda_43e79d08ab5e11029a762dd34e0e0eed>::operator ()] Exception during initialization: C:\repos-nobackup\onnxruntime\onnxruntime\core\framework\allocator_utils.cc:61 onnxruntime::CreateAllocator StreamAwareArena should be transparent to minimal build.
1: 
1: C:\repos-nobackup\onnxruntime\onnxruntime\test\framework\inference_session_test.cc(1016): error: Value of: _tmp_status.IsOK()
1:   Actual: false
1: Expected: true
1: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: C:\repos-nobackup\onnxruntime\onnxruntime\core\framework\allocator_utils.cc:61 onnxruntime::CreateAllocator StreamAwareArena should be transparent to minimal build.
1:
1: Stack trace:
1:   00007FF7922A6076: testing::internal::HandleExceptionsInMethodIfSupported<testing::Test,void>
1:   00007FF792294D25: testing::TestInfo::Run
1:   00007FF7922953C8: testing::TestSuite::Run
1:   00007FF7922A209A: testing::internal::UnitTestImpl::RunAllTests
1:   00007FF7922A7199: testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl,bool>
1: ... Google Test internal frames ...
1:
1: [  FAILED  ] InferenceSessionTests.TestBindCuda (2 ms)
5: [       OK ] CApiTestGlobalThreadPoolsWithProviders/CApiTestGlobalThreadPoolsWithProvider.simpleAsync/0 (224 ms)
5: [ RUN      ] CApiTestGlobalThreadPoolsWithProviders/CApiTestGlobalThreadPoolsWithProvider.simpleAsync/1
5: Running simple inference with cuda provider
5: 2024-04-02 23:43:52.7171382 [I:onnxruntime:, inference_session.cc:514 onnxruntime::InferenceSession::TraceSessionOptions] Session Options {  execution_mode:0 execution_order:DEFAULT enable_profiling:0 optimized_model_filepath: enable_mem_pattern:1 enable_mem_reuse:1 enable_cpu_mem_arena:1 profile_file_prefix:onnxruntime_profile_ session_logid: session_log_severity_level:-1 session_log_verbosity_level:0 max_num_graph_transformation_steps:10 graph_optimization_level:3 intra_op_param:OrtThreadPoolParams { thread_pool_size: 0 auto_set_affinity: 0 allow_spinning: 1 dynamic_block_base_: 0 stack_size: 0 affinity_str:  set_denormal_as_zero: 0 } inter_op_param:OrtThreadPoolParams { thread_pool_size: 0 auto_set_affinity: 0 allow_spinning: 1 dynamic_block_base_: 0 stack_size: 0 affinity_str:  set_denormal_as_zero: 0 } use_per_session_threads:0 thread_pool_allow_spinning:1 use_deterministic_compute:0 config_options: {  } }
5: 2024-04-02 23:43:52.7179308 [I:onnxruntime:, inference_session.cc:495 onnxruntime::InferenceSession::ConstructorCommon] Using global/env threadpools since use_per_session_threads_ is false
5: 2024-04-02 23:43:52.7237903 [I:onnxruntime:, inference_session.cc:1583 onnxruntime::InferenceSession::Initialize] Initializing session.
5: 2024-04-02 23:43:52.7239252 [I:onnxruntime:, inference_session.cc:1620 onnxruntime::InferenceSession::Initialize] Adding default CPU execution provider.
5: 2024-04-02 23:43:52.7241814 [E:onnxruntime:, inference_session.cc:1981 onnxruntime::InferenceSession::Initialize::<lambda_43e79d08ab5e11029a762dd34e0e0eed>::operator ()] Exception during initialization: C:\repos-nobackup\onnxruntime\onnxruntime\core\framework\allocator_utils.cc:61 onnxruntime::CreateAllocator StreamAwareArena should be transparent to minimal build.
5: 
5: unknown file: error: C++ exception with description "Exception during initialization: C:\repos-nobackup\onnxruntime\onnxruntime\core\framework\allocator_utils.cc:61 onnxruntime::CreateAllocator StreamAwareArena should be transparent to minimal build.
5: " thrown in the test body.
5:
5: [  FAILED  ] CApiTestGlobalThreadPoolsWithProviders/CApiTestGlobalThreadPoolsWithProvider.simpleAsync/1, where GetParam() = 1 (8 ms)

Visual Studio Version

VS2022 v17.9.5

GCC / Compiler Version

No response

@diablodale diablodale added the build build issues; typically submitted using template label Apr 2, 2024
@github-actions github-actions bot added ep:CUDA issues related to the CUDA execution provider ep:TensorRT issues related to TensorRT execution provider platform:windows issues related to the Windows platform labels Apr 2, 2024
@diablodale
Contributor Author

diablodale commented Apr 2, 2024

This looks suspicious: both the CUDA and ROCm EPs assume streams are always supported, which is a false assumption given the cmake logic above.

If the cmake logic is correct, then an #ifdef ORT_ENABLE_STREAM is probably needed to toggle true/false in these constructions.

Or perhaps the cmake logic is wrong, and the ORT_ENABLE_STREAM define needs to be per-EP, so that the CUDA EP is compiled with streams but the DML EP is compiled without them. I don't have a holistic enough view of the streams feature to offer educated suggestions.

AllocatorCreationInfo default_memory_info(
    [](OrtDevice::DeviceId id) {
      return std::make_unique<CUDAAllocator>(id, CUDA);
    },
    device_id,
    true,
    {default_memory_arena_cfg ? *default_memory_arena_cfg
                              : OrtArenaCfg(gpu_mem_limit, static_cast<int>(arena_extend_strategy), -1, -1, -1, -1L)},
    // make it stream aware
    true,
    // enable cross stream sharing?
    false);

AllocatorCreationInfo default_memory_info(
    [](OrtDevice::DeviceId id) {
      return std::make_unique<ROCMAllocator>(id, HIP);
    },
    device_id,
    true,
    {default_memory_arena_cfg ? *default_memory_arena_cfg
                              : OrtArenaCfg(gpu_mem_limit, static_cast<int>(arena_extend_strategy), -1, -1, -1, -1L)},
    // make it stream aware
    true,
    // enable cross stream sharing?
    false);

@diablodale
Contributor Author

Idea 1 failed. I added an #ifdef ORT_ENABLE_STREAM in the CUDA and ROCm EPs to toggle true/false in those constructions. The CUDA and TensorRT EPs then failed constantly in tests, with lots of SEH exceptions:

1: [ RUN      ] PackedAttentionTest.PackedWithRelativePositionBias
1: unknown file: error: SEH exception with code 0xc0000005 thrown in the test body.
1: Stack trace:
1:   00007FFA9F59340E: KiUserExceptionDispatcher
1:   00007FF9C4C6C839: (unknown)
1:   00007FF9C4AD5254: (unknown)
1:   00007FF64B2CB9AC: onnxruntime::ExecuteKernel
1:   00007FF64B2C9D54: onnxruntime::LaunchKernelStep::Execute
1:   00007FF64B2CD134: onnxruntime::RunSince
1:   00007FF64B2CC932: std::_Func_impl_no_alloc<<lambda_c992b183ba75cc416ad0135fbbf0b9d3>,void>::_Do_call
1:   00007FF64A4610DC: onnxruntime::concurrency::ThreadPool::Schedule
1:   00007FF64B2CC0B9: onnxruntime::ExecuteThePlan
1:   00007FF64B2B8E3E: onnxruntime::utils::ExecuteGraphImpl
1:   00007FF64ABD86C4: onnxruntime::InferenceSession::Run
1:   00007FF64ABDA695: onnxruntime::InferenceSession::Run
1:   00007FF64A5D5E2B: onnxruntime::test::BaseTester::ExecuteModel<onnxruntime::InferenceSession>
1:   00007FF64A5D3444: onnxruntime::test::BaseTester::ExecuteModelForEps
1:   00007FF64A5D1ED3: onnxruntime::test::BaseTester::RunWithConfig
1:   00007FF64A5D154D: onnxruntime::test::BaseTester::Run
1:   00007FF64A543875: onnxruntime::test::RunPackedAttentionTest
1:   00007FF64A5448D5: onnxruntime::test::PackedAttentionTest_PackedWithRelativePositionBias_Test::TestBody
1:   00007FF64B416F02: testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test,void>
1:   00007FF64B416076: testing::internal::HandleExceptionsInMethodIfSupported<testing::Test,void>
1:   00007FF64B404D25: testing::TestInfo::Run
1:   00007FF64B4053C8: testing::TestSuite::Run
1:   00007FF64B41209A: testing::internal::UnitTestImpl::RunAllTests
1:   00007FF64B417199: testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl,bool>
1: ... Google Test internal frames ...
1:
1: Google Test trace:
1: C:\repos-nobackup\onnxruntime\onnxruntime\test\providers\base_tester.cc(791): registered execution providers: CUDAExecutionProvider
1: C:\repos-nobackup\onnxruntime\onnxruntime\test\providers\base_tester.cc(791): registered execution providers: CUDAExecutionProvider
1: C:\repos-nobackup\onnxruntime\onnxruntime\test\providers\base_tester.cc(791): registered execution providers: CUDAExecutionProvider
1: C:\repos-nobackup\onnxruntime\onnxruntime\test\providers\base_tester.cc(791): registered execution providers: CUDAExecutionProvider:DmlExecutionProvider
1: C:\repos-nobackup\onnxruntime\onnxruntime\test\providers\base_tester.cc(791): registered execution providers: CUDAExecutionProvider:DmlExecutionProvider
1: C:\repos-nobackup\onnxruntime\onnxruntime\test\providers\base_tester.cc(791): registered execution providers: CUDAExecutionProvider:DmlExecutionProvider
1: C:\repos-nobackup\onnxruntime\onnxruntime\test\providers\base_tester.cc(791): registered execution providers: CUDAExecutionProvider:DmlExecutionProvider
1: C:\repos-nobackup\onnxruntime\onnxruntime\test\providers\base_tester.cc(791): registered execution providers: CUDAExecutionProvider:DmlExecutionProvider
1: C:\repos-nobackup\onnxruntime\onnxruntime\test\providers\base_tester.cc(791): registered execution providers: CUDAExecutionProvider:DmlExecutionProvider
1: C:\repos-nobackup\onnxruntime\onnxruntime\test\providers\base_tester.cc(791): registered execution providers: CUDAExecutionProvider

@tianleiwu
Contributor

tianleiwu commented Apr 3, 2024

Could you try changing

if (NOT onnxruntime_MINIMAL_BUILD AND NOT onnxruntime_USE_DML)
  add_compile_definitions(ORT_ENABLE_STREAM)
endif()

to

if (NOT onnxruntime_MINIMAL_BUILD)
  add_compile_definitions(ORT_ENABLE_STREAM)
endif()

You can cherry-pick this commit

@diablodale
Contributor Author

diablodale commented Apr 3, 2024

Are you sure that is OK? I'm double-checking because that commit is part of the PR which says...

Enable streams for DML EP. This change is to revert PR 19481 since the bug 19480 is fixed by PR 19515

PR #19515 is not part of v1.17.1. That would put ORT in a state expecting PR #19515 code that isn't available.

@diablodale
Contributor Author

diablodale commented Apr 4, 2024

@tianleiwu I applied the same PRs to v1.17.1 and all tests pass except for one: CApiTest.custom_op_set_input_memory_type.
Only with those PRs cherry-picked could I get a test run worth evaluating. It's unclear to me whether those PRs also hide a bug related to that test case, or whether they fixed enough that a bug elsewhere is now exposed.

To troubleshoot further, I manually ran that single test. The test passed.
But running the entire onnxruntime_shared_lib_test suite, it fails. 🤔 Easy to reproduce.
I further isolated it: if I run the failing test together with the one immediately before it, it fails.
I suspect an interaction between these two tests causes the failure.

C:\repos-nobackup\onnxruntime\build\Windows\Release>onnxruntime_shared_lib_test.exe --gtest_filter=*custom_op_set_input_memory_type*
Note: Google Test filter = *custom_op_set_input_memory_type*
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from CApiTest
[ RUN      ] CApiTest.custom_op_set_input_memory_type
Running custom op inference
Running simple inference with cuda provider
[       OK ] CApiTest.custom_op_set_input_memory_type (134 ms)
[----------] 1 test from CApiTest (135 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (137 ms total)
[  PASSED  ] 1 test.

////////////////// ...compared to... ///////////////////////

C:\repos-nobackup\onnxruntime\build\Windows\Release>onnxruntime_shared_lib_test.exe --gtest_filter=*CApiTest.custom_op_*:-CApiTest.custom_op_with_attributes_handler
Note: Google Test filter = *CApiTest.custom_op_*:-CApiTest.custom_op_with_attributes_handler
[==========] Running 2 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 2 tests from CApiTest
[ RUN      ] CApiTest.custom_op_handler
Running custom op inference
Running simple inference with cuda provider
[       OK ] CApiTest.custom_op_handler (128 ms)
[ RUN      ] CApiTest.custom_op_set_input_memory_type
Running custom op inference
Running simple inference with cuda provider
C:\repos-nobackup\onnxruntime\onnxruntime\test\shared_lib\test_inference.cc(87): error: The difference between values_y[i] and f[i] is 1, which exceeds 1e-3, where
values_y[i] evaluates to 2,
f[i] evaluates to 1, and
1e-3 evaluates to 0.001.
Stack trace:
  00007FF6BA92B892: testing::internal::HandleSehExceptionsInMethodIfSupported<testing::TestSuite,void>
  00007FF6BA929EE6: testing::internal::HandleExceptionsInMethodIfSupported<testing::TestSuite,void>
  00007FF6BA912E39: testing::TestInfo::Run
  00007FF6BA91360D: testing::TestSuite::Run
  00007FF6BA924F17: testing::internal::UnitTestImpl::RunAllTests
  00007FF6BA92BB29: testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl,bool>
... Google Test internal frames ...

[  FAILED  ] CApiTest.custom_op_set_input_memory_type (41 ms)
[----------] 2 tests from CApiTest (173 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 1 test suite ran. (177 ms total)
[  PASSED  ] 1 test.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] CApiTest.custom_op_set_input_memory_type

 1 FAILED TEST

In VSCode MSVC debugger...the tests pass. 🤔 Same exe, args, etc.

Loaded 'C:\repos-nobackup\onnxruntime\build\Windows\Release\nvinfer_builder_resource.dll'. Module was built without symbols.
Unloaded 'C:\repos-nobackup\onnxruntime\build\Windows\Release\nvinfer_builder_resource.dll'.
Loaded 'C:\Windows\System32\bcryptprimitives.dll'. Symbol loading disabled by Include/Exclude setting.
Note: Google Test filter = *CApiTest.custom_op_*:-CApiTest.custom_op_with_attributes_handler
[==========] Running 2 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 2 tests from CApiTest
[ RUN      ] CApiTest.custom_op_handler
Running custom op inference
Running simple inference with cuda provider
Loaded 'C:\repos-nobackup\onnxruntime\build\Windows\Release\onnxruntime_providers_shared.dll'. Symbols loaded.
Loaded 'C:\repos-nobackup\onnxruntime\build\Windows\Release\onnxruntime_providers_cuda.dll'. Symbols loaded.
Loaded 'C:\repos-nobackup\onnxruntime\build\Windows\Release\cublas64_11.dll'. Module was built without symbols.
Loaded 'C:\repos-nobackup\onnxruntime\build\Windows\Release\cudnn64_8.dll'. Module was built without symbols.
Loaded 'C:\repos-nobackup\onnxruntime\build\Windows\Release\cublasLt64_11.dll'. Module was built without symbols.
Loaded 'C:\repos-nobackup\onnxruntime\build\Windows\Release\cufft64_10.dll'. Module was built without symbols.
Loaded 'C:\repos-nobackup\onnxruntime\build\Windows\Release\nvrtc64_112_0.dll'. Module was built without symbols.
Loaded 'C:\repos-nobackup\onnxruntime\build\Windows\Release\cudnn_ops_infer64_8.dll'. Module was built without symbols.
[       OK ] CApiTest.custom_op_handler (237 ms)
[ RUN      ] CApiTest.custom_op_set_input_memory_type
Running custom op inference
Running simple inference with cuda provider
[       OK ] CApiTest.custom_op_set_input_memory_type (13 ms)
[----------] 2 tests from CApiTest (251 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 1 test suite ran. (251 ms total)
[  PASSED  ] 2 tests.
Unloaded 'C:\repos-nobackup\onnxruntime\build\Windows\Release\cudnn64_8.dll'.
Unloaded 'C:\repos-nobackup\onnxruntime\build\Windows\Release\cufft64_10.dll'.
Unloaded 'C:\repos-nobackup\onnxruntime\build\Windows\Release\onnxruntime_providers_cuda.dll'.
Unloaded 'C:\repos-nobackup\onnxruntime\build\Windows\Release\onnxruntime_providers_shared.dll'.
The program '[5016] onnxruntime_shared_lib_test.exe' has exited with code 0 (0x0).

@diablodale
Contributor Author

@yuslepukhin, the two CUDA (on Windows) tests immediately above fail, but if only one test is run, it passes.
Could this be related to your PR #20039?

YUNQIUGUO pushed a commit that referenced this issue Apr 5, 2024
### Description
Bring the fix for DML to 1.17.3 to resolve an issue
#20180

### Motivation and Context

---------

Co-authored-by: cao lei <[email protected]>
Co-authored-by: Lei Cao <[email protected]>
Contributor

github-actions bot commented May 5, 2024

This issue has been automatically marked as stale due to inactivity and will be closed in 30 days if no further activity occurs. If further support is needed, please provide an update and/or more details.

@github-actions github-actions bot added the stale issues that have not been addressed in a while; categorized by a bot label May 5, 2024
@diablodale
Contributor Author

ping. keep alive. This issue is still relevant and has PRs pending as a fix.

@github-actions github-actions bot removed the stale issues that have not been addressed in a while; categorized by a bot label May 6, 2024
@kavunkaua

I hit a similar problem (Ubuntu 24.04 + RTX 3090):

[ RUN ] CApiTest.custom_op_set_input_memory_type
Running custom op inference
Running simple inference with cuda provider
...../onnxruntime/onnxruntime/test/shared_lib/custom_op_utils.cc:85: Failure
Expected equality of these values:
y_mem_type
Which is: -1
OrtMemType::OrtMemTypeCPUInput
Which is: -2

unknown file: Failure
C++ exception with description "Unsupported OrtValue type." thrown in the test body.

[ FAILED ] CApiTest.custom_op_set_input_memory_type (11 ms)
