Adding ortvalue features support for MGX EP #81
base: rocm6.3_internal_testing
Conversation
if (!IsRocmDeviceIdValid(logging::LoggingManager::DefaultLogger(), device.Id())) {
  throw std::runtime_error("The provided device id doesn't match any available GPUs on the machine.");
}
allocator = GetRocmAllocator(device.Id());
We might be in an odd situation here, as our offering includes both the MIGraphX and ROCm EPs, so should we get both allocators? Did you test this when building both the MIGraphX and ROCm EPs? How does the allocator work in that case?
#elif USE_MIGRAPHX
// InputDeflist is null because OrtValue creation is not tied to a specific model
// Likewise, there is no need to specify the name (as the name was previously used to lookup the def list)
// TODO: Add check to ensure that string arrays are not passed - we currently don't support string tensors in CUDA
Update this comment to reference MIGraphX rather than CUDA.
AllocatorPtr GetMIGraphXAllocator(OrtDevice::DeviceId id) {
  // Current approach is not thread-safe, but there are some bigger infra pieces to put together in order to make
  // multi-threaded MIGraphX allocation work; we need to maintain a per-thread MIGraphX allocator
Make this an issue and attach it to the ticket if we want to move to per-thread allocation. We should roadmap this out so we can tackle these pieces in the new year.
// make it stream aware
true,
// enable cross stream sharing?
false);
Is this something we want to make controllable from the API later?
// The function will return once the pageable buffer has been copied to the staging memory for DMA transfer
// to device memory, but the DMA to final destination may not have completed.
HIP_CALL_THROW(hipStreamSynchronize(0));
Do we always want to be using HIP stream 0 (the default stream) for this?
(static_cast<size_t>(info.model_cache_enable) << 21) ^
(static_cast<size_t>(info.save_compiled_model) << 22) ^
(static_cast<size_t>(info.load_compiled_model) << 23) ^
(static_cast<size_t>(info.exhaustive_tune) << 24);
Going forward, is the intent to add the other flags (fp16/int8) and the other quantize modes here as well?
Thanks for the contribution!
A few questions about this; overall it looks good.
I've added questions/comments. One detail concerns combined ROCm/MIGraphX EP builds and whether you've tested with both.
Also, if you can, download and use lintrunner in your environment to fix the lint issue. It'll make upstreaming easier.
Created a PR with an implementation of the ortvalue_from_numpy() and ortvalue_from_shape_and_type() features for the MGX EP on Windows and Linux, in order to get better performance for llama2 int4 model execution. Some methods have been overridden and some implemented similarly to how it was done in the ROCm EP. With these features we significantly reduced the time needed to create and copy tensors; almost all of the time is now spent on the GPU, which gives much better tok/s performance on our GPUs. A similar option was added for the ROCm EP.