
Enable TRT provider option configuration for C# #7179

Closed
wants to merge 17 commits

Conversation

@chilo-ms (Contributor)

Description: Enable TRT provider option configuration for C#

      Example for setting:
        SessionOptions.OrtTensorRTProviderOptions trt_options;
        trt_options.device_id = 0;
        trt_options.has_trt_options = 1;
        trt_options.trt_max_workspace_size = (UIntPtr) (1 << 30);
        trt_options.trt_fp16_enable = 1;
        trt_options.trt_int8_enable = 1;
        trt_options.trt_int8_calibration_table_name = @"C:\calibration.flatbuffers";
        trt_options.trt_int8_use_native_calibration_table = 0;


@chilo-ms requested review from stevenlix and jywu-msft March 30, 2021 12:22
@chilo-ms requested a review from a team as a code owner March 30, 2021 12:22
@jywu-msft (Member) commented Mar 30, 2021

The CI failure is due to a failing API doc check.

@jywu-msft (Member)

@hariharans29 can you also help take a look at this PR? Thanks!

@jywu-msft (Member) commented Mar 30, 2021

Is it possible to add test cases?

@chilo-ms (Contributor, Author) commented Apr 1, 2021

I've added some test cases.
Once From/ToProviderOptions() has been implemented on the TRT EP, we can read the options back directly to verify them, which will cover more test cases.

@chilo-ms closed this Apr 5, 2021
@chilo-ms reopened this Apr 5, 2021
@hariharans29 (Member) commented Apr 5, 2021

One of the open-ended questions that @pranavsharma and I were recently discussing is whether we can allow transparent structs in the C API or whether we should make them opaque and have creation of the struct go through an API. Some things to consider:

  1. Will the presence of transparent structs in the C API make it harder for language bindings to support similar config structs?
  2. Versioning of the structs (adding a new field).

This is something to discuss, weighing the pros and cons of each approach, before we merge and release this change, because if we decide to ship this approach it may be hard to switch to the other later (and it will also make things harder on ORT users if we keep switching things around too much).

FWIW- I had a PR for OrtCudaProviderOptions that explores the other approach (make provider struct opaque and expose an API for creation) - #6291.

I feel it is best to agree upon one approach now that we are enabling language bindings for provider config structs.
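
For illustration, here is a rough sketch of how the two approaches tend to surface on the C# side; every type and function name below is hypothetical, chosen only to show the shape of each option:

    using System;
    using System.Runtime.InteropServices;

    // Approach 1: transparent struct, mirrored field-for-field in C#.
    // Every language binding must keep this layout in sync with the C header,
    // and appending a field changes the struct size (a binary-compatibility break).
    [StructLayout(LayoutKind.Sequential)]
    public struct TransparentTrtOptionsSketch   // hypothetical
    {
        public int device_id;
        public UIntPtr trt_max_workspace_size;
        public int trt_fp16_enable;
    }

    // Approach 2: opaque struct. Bindings only hold a handle; fields are set
    // through C API calls, so the library can evolve the layout freely.
    public sealed class OpaqueTrtOptionsHandleSketch : SafeHandle   // hypothetical
    {
        public OpaqueTrtOptionsHandleSketch() : base(IntPtr.Zero, ownsHandle: true) { }

        public override bool IsInvalid => handle == IntPtr.Zero;

        protected override bool ReleaseHandle()
        {
            NativeMethodsSketch.ReleaseTrtOptions(handle);   // hypothetical release call
            return true;
        }
    }

    internal static class NativeMethodsSketch   // stub to keep the sketch self-contained
    {
        [DllImport("onnxruntime")]
        public static extern void ReleaseTrtOptions(IntPtr options);
    }

The transparent struct is trivial to mirror but freezes the layout; the opaque handle keeps the layout private at the cost of per-field setter APIs and a mandatory release call.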

@chilo-ms (Contributor, Author) commented Apr 9, 2021

Thanks @hariharans29 for the explanation. Based on what we discussed offline, we decided to use transparent structs in the C API for provider options. I will create another PR for adding the CUDA options, so that this PR does not involve too many modifications.

int trt_int8_enable; // enable TensorRT INT8 precision. Default 0 = false, nonzero = true
const char* trt_int8_calibration_table_name; // TensorRT INT8 calibration table name.
int trt_int8_use_native_calibration_table; // use native TensorRT generated calibration table. Default 0 = false, nonzero = true
int trt_max_partition_iterations; // maximum number of iterations allowed in model partitioning for TensorRT.
Member:

To add these new options, don't you also need to update the TensorrtExecutionProvider constructor to check these options (alongside the code that checks the environment variables)?
Please verify that all of these options work properly.

Contributor Author:

Yes, I previously planned to do that in another PR, but I will include the changes in this PR: updating the TensorrtExecutionProvider constructor and checking that these options take effect.

Contributor Author:

I have tested, by building the NuGet package and running a C# application, that the internal provider options (e.g. max_workspace_size_, int8_use_native_tensorrt_calibration_table_, ...) can be configured via the C# provider options.

Member:

thanks!

jywu-msft previously approved these changes Apr 13, 2021
@jywu-msft dismissed their stale review April 15, 2021 15:57

revisit

public int trt_min_subgraph_size; // minimum node size in a subgraph after partitioning.
public int trt_dump_subgraphs; // dump the subgraphs that are transformed into TRT engines in onnx format to the filesystem. Default 0 = false, nonzero = true
public int trt_engine_cache_enable; // enable TensorRT engine caching. Default 0 = false, nonzero = true
public IntPtr trt_cache_path; // specify path for TensorRT engine and profile files if engine_cache_enable is enabled, or INT8 calibration table file if trt_int8_enable is enabled.
Member:

trt_cache_path

Need to clarify how to initialize this, since paths on Windows are UTF-16 while on Linux they are UTF-8.
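
One common way bindings sidestep this difference (a sketch of the general pattern, not necessarily what this PR does) is to always hand the C API a null-terminated UTF-8 buffer and leave any platform-specific conversion to the native layer; this is essentially what the NativeOnnxValueHelper.StringToZeroTerminatedUtf8 helper quoted later in this thread is used for:

    using System;
    using System.Text;

    // Sketch: build a null-terminated UTF-8 buffer for the native side,
    // regardless of the platform's native path encoding.
    static byte[] ToNullTerminatedUtf8(string s)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(s);    // no terminator included
        byte[] result = new byte[utf8.Length + 1];  // zero-initialized, so the
        Buffer.BlockCopy(utf8, 0, result, 0, utf8.Length);  // last byte stays 0
        return result;
    }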

/// <summary>
/// Holds provider options configuration for creating an InferenceSession.
/// </summary>
public class ProviderOptions : SafeHandle
Member:

SafeHandle

Similar comment. Need to decide if we want to keep this as a SafeHandle. If there is little chance that this will ever hold a native resource, then it would be a burden on the user to always dispose of it.

public static SessionOptions MakeSessionOptionWithTensorrtProvider(int deviceId = 0)
{
CheckTensorrtExecutionProviderDLLs();
SessionOptions options = new SessionOptions();
Member:

SessionOptions options = new SessionOptions();

Same comment on handling the leaked options when the code below throws.
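
A minimal sketch of the dispose-on-exception pattern being requested here; the appended-provider call is illustrative:

    public static SessionOptions MakeSessionOptionWithTensorrtProvider(int deviceId = 0)
    {
        CheckTensorrtExecutionProviderDLLs();
        SessionOptions options = new SessionOptions();
        try
        {
            // Anything below may throw; without the catch, 'options' leaks.
            options.AppendExecutionProvider_Tensorrt(deviceId);   // illustrative
            return options;
        }
        catch (Exception)
        {
            options.Dispose();
            throw;
        }
    }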

public static SessionOptions MakeSessionOptionWithTensorrtProvider(OrtTensorRTProviderOptions trt_options)
{
CheckTensorrtExecutionProviderDLLs();
SessionOptions options = new SessionOptions();
Member:

SessionOptions

Ditto. Needs to be disposed of on exception.

@yuslepukhin (Member)

        SessionOptions options = new SessionOptions();

There is a leak here too, although this is not a part of your changes.


Refers to: csharp/src/Microsoft.ML.OnnxRuntime/SessionOptions.cs:152 in 532c899.

var tableNamePinned = GCHandle.Alloc(NativeOnnxValueHelper.StringToZeroTerminatedUtf8(trt_options.trt_int8_calibration_table_name), GCHandleType.Pinned);
using (var pinnedSettingsName = new PinnedGCHandle(tableNamePinned))
{
trt_options_native.trt_int8_calibration_table_name = pinnedSettingsName.Pointer;
@yuslepukhin (Member) commented Apr 15, 2021

pinnedSettingsName

But the using clause will unpin this and invalidate the Pointer value.

var cachePathPinned = GCHandle.Alloc(NativeOnnxValueHelper.StringToZeroTerminatedUtf8(trt_options.trt_cache_path), GCHandleType.Pinned);
using (var pinnedSettingsName2 = new PinnedGCHandle(cachePathPinned))
{
trt_options_native.trt_cache_path = pinnedSettingsName2.Pointer;
@yuslepukhin (Member) commented Apr 15, 2021

pinnedSettingsName2

Same thing here. Memory should stay pinned for the duration of the native call. This is done so the GC does not move the memory while it is accessed from native code. Thus, pinning should take place just prior to the native call that uses it, with unpinning right after the call.
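
A minimal sketch of the corrected scoping, reusing the helper names quoted above; the native call itself is illustrative:

    // Pin the UTF-8 buffers only while native code can read them: the native
    // call happens *inside* the using blocks, so the GC cannot move the memory
    // while the pointers are in use.
    var tableNamePinned = GCHandle.Alloc(
        NativeOnnxValueHelper.StringToZeroTerminatedUtf8(trt_options.trt_int8_calibration_table_name),
        GCHandleType.Pinned);
    var cachePathPinned = GCHandle.Alloc(
        NativeOnnxValueHelper.StringToZeroTerminatedUtf8(trt_options.trt_cache_path),
        GCHandleType.Pinned);
    using (var pinnedTableName = new PinnedGCHandle(tableNamePinned))
    using (var pinnedCachePath = new PinnedGCHandle(cachePathPinned))
    {
        trt_options_native.trt_int8_calibration_table_name = pinnedTableName.Pointer;
        trt_options_native.trt_cache_path = pinnedCachePath.Pointer;

        // Illustrative native call; it must execute before the using blocks
        // above unpin the buffers ('handle' stands in for the session
        // options' native handle).
        NativeApiStatus.VerifySuccess(
            NativeMethods.OrtSessionOptionsAppendExecutionProvider_TensorRT(handle, ref trt_options_native));
    }
    // Past this point the pointers are invalid; do not reuse them.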

trt_options.trt_int8_enable = 1;
trt_options.trt_int8_use_native_calibration_table = 0;

var session = new InferenceSession(modelPath, SessionOptions.MakeSessionOptionWithTensorrtProvider(trt_options));
Member:

session

Session is a disposable class, so this should either be wrapped in a using clause or a try/finally. Or, if you have trouble managing so many disposables, you can add them to a disposable list, so that only the list requires disposal. See examples in this file. The issue here is that people copy this code as examples and then complain about leaks.
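
A hedged sketch of the using-based form being suggested (modelPath is a placeholder):

    trt_options.trt_int8_enable = 1;
    trt_options.trt_int8_use_native_calibration_table = 0;

    // 'using' guarantees both disposables are released even on exception.
    using (var options = SessionOptions.MakeSessionOptionWithTensorrtProvider(trt_options))
    using (var session = new InferenceSession(modelPath, options))
    {
        // ... run the test against 'session' ...
    }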

int trt_min_subgraph_size; // minimum node size in a subgraph after partitioning.
int trt_dump_subgraphs; // dump the subgraphs that are transformed into TRT engines in onnx format to the filesystem. Default 0 = false, nonzero = true
int trt_engine_cache_enable; // enable TensorRT engine caching. Default 0 = false, nonzero = true
const char* trt_cache_path; // specify path for TensorRT engine and profile files if engine_cache_enable is enabled, or INT8 calibration table file if trt_int8_enable is enabled.
@yuslepukhin (Member) commented Apr 15, 2021

specify path for

Need to document the encoding of the path. We seem to require UTF-16 on Windows and UTF-8 on Linux, so how does this fit into that pattern?

int trt_min_subgraph_size; // minimum node size in a subgraph after partitioning.
int trt_dump_subgraphs; // dump the subgraphs that are transformed into TRT engines in onnx format to the filesystem. Default 0 = false, nonzero = true
int trt_engine_cache_enable; // enable TensorRT engine caching. Default 0 = false, nonzero = true
const char* trt_cache_path; // specify path for TensorRT engine and profile files if engine_cache_enable is enabled, or INT8 calibration table file if trt_int8_enable is enabled.
} OrtTensorRTProviderOptions;
@yuslepukhin (Member) commented Apr 15, 2021

We should watch the extensibility of such structs and modify all languages that make use of this C API at the same time.
When a field is added, the size of the struct changes, and if the struct is used in compiled applications it is no longer binary compatible. We can't extend the structs without versioning the API, or else we shouldn't use structs at all.
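
For context, one common mitigation in C-style APIs (a sketch only, not something this PR ships) is a leading size field that the library checks at runtime, so new fields can be appended without breaking older callers:

    using System;
    using System.Runtime.InteropServices;

    // Sketch of a self-versioning options struct: the first field records the
    // struct size the caller was compiled against, so a newer library can tell
    // which trailing fields the caller actually knows about.
    [StructLayout(LayoutKind.Sequential)]
    public struct VersionedTrtOptionsSketch   // hypothetical
    {
        public UIntPtr struct_size;   // caller sets the size of its own copy
        public int device_id;
        public int trt_fp16_enable;
        // New fields are appended here; older callers report a smaller
        // struct_size and the library falls back to defaults for the rest.
    }

    // Caller side:
    //   var opts = new VersionedTrtOptionsSketch();
    //   opts.struct_size = (UIntPtr)Marshal.SizeOf<VersionedTrtOptionsSketch>();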

int trt_min_subgraph_size = 1;
bool trt_dump_subgraphs = false;
bool trt_engine_cache_enable = false;
std::string trt_cache_path = "";
Member:

= ""

Why?

// trt_options.trt_int8_enable = 1;
// trt_options.trt_int8_calibration_table_name = "calibration.flatbuffers";
// trt_options.trt_int8_use_native_calibration_table = 0;
public struct OrtTensorRTProviderOptions
Member:

OrtTensorRTProviderOptions

Here we have two duplicate structures for essentially the same purpose. Can we unify them?

size_t trt_max_workspace_size; // maximum workspace size for TensorRT.
int trt_fp16_enable; // enable TensorRT FP16 precision. Default 0 = false, nonzero = true
int trt_int8_enable; // enable TensorRT INT8 precision. Default 0 = false, nonzero = true
const char* trt_int8_calibration_table_name; // TensorRT INT8 calibration table name.
Member:

TensorRT INT8 calibration table name

Need to document the string encoding.

@chilo-ms mentioned this pull request Apr 28, 2021
@chilo-ms (Contributor, Author) commented May 6, 2021

This PR is currently pending, since we had some concerns about exposing the provider options struct.

@jywu-msft (Member)

Closing; an updated version replaced this one.

@jywu-msft closed this Dec 17, 2021