Enable TRT provider option configuration for C# #7179
Conversation
CI failure is due to a failing API doc check.
@hariharans29 can you also help take a look at this PR? Thanks!
Is it possible to add test cases?
I've added some test cases.
One of the open-ended questions that @pranavsharma and I were recently discussing is whether we can allow transparent structs in the C API, or whether we should make them opaque and have the creation of the struct go through an API. Some things to consider:
It is something to discuss and weigh the pros/cons of each approach before we merge and release this change, because if we decide to ship this approach, it might be harder to switch to the other later (it will also make it harder on the users of ORT if we keep switching things around too much). FWIW, I had a PR for ... I feel it is best to agree upon one approach now that we are enabling language bindings for provider config structs.
Thanks @hariharans29 for the explanation. Based on what we discussed offline, we decided to use transparent structs in the C API for provider options. I will create another PR for adding CUDA options so as not to make this PR involve too many modifications.
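For concreteness, a minimal usage sketch of the transparent-struct flow decided on here. The field names, values, and factory method are taken from the snippets later in this thread; assigning a C# string to trt_int8_calibration_table_name assumes the managed struct exposes string fields, as the marshaling code below suggests:

// Sketch: configure TRT provider options via the transparent struct from this PR.
var trt_options = new OrtTensorRTProviderOptions();
trt_options.trt_int8_enable = 1;  // nonzero = true
trt_options.trt_int8_calibration_table_name = "calibration.flatbuffers";
trt_options.trt_int8_use_native_calibration_table = 0;
SessionOptions options = SessionOptions.MakeSessionOptionWithTensorrtProvider(trt_options);

Note that SessionOptions is disposable; see the review comment about disposal further down in this thread.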
int trt_int8_enable; // enable TensorRT INT8 precision. Default 0 = false, nonzero = true
const char* trt_int8_calibration_table_name; // TensorRT INT8 calibration table name.
int trt_int8_use_native_calibration_table; // use native TensorRT generated calibration table. Default 0 = false, nonzero = true
int trt_max_partition_iterations; // maximum number of iterations allowed in model partitioning for TensorRT.
To add these new options, don't you also need to update the TensorrtExecutionProvider constructor to check them (alongside the code that checks the environment variables)?
Please test that all these options are working properly.
Yes, previously I planned to do that in another PR, but I will include the changes in this PR: updating the TensorrtExecutionProvider constructor and checking that these options take effect.
I have tested that the internal provider options (e.g. max_workspace_size_, int8_use_native_tensorrt_calibration_table_, ...) can be configured via the C# provider options, by building the NuGet package and running a C# application.
thanks!
public int trt_min_subgraph_size; // minimum node size in a subgraph after partitioning.
public int trt_dump_subgraphs; // dump the subgraphs that are transformed into TRT engines in onnx format to the filesystem. Default 0 = false, nonzero = true
public int trt_engine_cache_enable; // enable TensorRT engine caching. Default 0 = false, nonzero = true
public IntPtr trt_cache_path; // specify path for TensorRT engine and profile files if engine_cache_enable is enabled, or INT8 calibration table file if trt_int8_enable is enabled.
/// <summary>
/// Holds provider options configuration for creating an InferenceSession.
/// </summary>
public class ProviderOptions : SafeHandle
public static SessionOptions MakeSessionOptionWithTensorrtProvider(int deviceId = 0)
{
    CheckTensorrtExecutionProviderDLLs();
    SessionOptions options = new SessionOptions();
public static SessionOptions MakeSessionOptionWithTensorrtProvider(OrtTensorRTProviderOptions trt_options)
{
    CheckTensorrtExecutionProviderDLLs();
    SessionOptions options = new SessionOptions();
var tableNamePinned = GCHandle.Alloc(NativeOnnxValueHelper.StringToZeroTerminatedUtf8(trt_options.trt_int8_calibration_table_name), GCHandleType.Pinned);
using (var pinnedSettingsName = new PinnedGCHandle(tableNamePinned))
{
    trt_options_native.trt_int8_calibration_table_name = pinnedSettingsName.Pointer;
var cachePathPinned = GCHandle.Alloc(NativeOnnxValueHelper.StringToZeroTerminatedUtf8(trt_options.trt_cache_path), GCHandleType.Pinned);
using (var pinnedSettingsName2 = new PinnedGCHandle(cachePathPinned))
{
    trt_options_native.trt_cache_path = pinnedSettingsName2.Pointer;
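For readers unfamiliar with the pattern in these two snippets: the managed string is converted to a zero-terminated UTF-8 buffer, pinned so the GC cannot move it, and its address is written into the native struct. A self-contained sketch of the same idea using only the standard GCHandle API (ORT's PinnedGCHandle and NativeOnnxValueHelper appear to be wrappers around it):

using System;
using System.Runtime.InteropServices;
using System.Text;

// Pin a zero-terminated UTF-8 copy of the string for the duration of the native call.
byte[] utf8 = Encoding.UTF8.GetBytes("calibration.flatbuffers\0");
GCHandle pin = GCHandle.Alloc(utf8, GCHandleType.Pinned);
try
{
    IntPtr ptr = pin.AddrOfPinnedObject(); // stable pointer while pinned
    // ... write ptr into the native options struct and make the native call ...
}
finally
{
    pin.Free(); // unpin; ptr must not be used after this point
}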
trt_options.trt_int8_enable = 1;
trt_options.trt_int8_use_native_calibration_table = 0;

var session = new InferenceSession(modelPath, SessionOptions.MakeSessionOptionWithTensorrtProvider(trt_options));
Session is a disposable class, so this should either be wrapped in a using clause or a try/finally. Or, if you have trouble managing so many disposables, you can add them to a disposable list, which alone would then require disposal. See examples in this file. The issue here is that people copy this code as an example and then complain about leaks.
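A sketch of the requested pattern, using the same names as the snippet above:

// Both SessionOptions and InferenceSession implement IDisposable, so scope
// them with using blocks to release native resources deterministically.
using (var options = SessionOptions.MakeSessionOptionWithTensorrtProvider(trt_options))
using (var session = new InferenceSession(modelPath, options))
{
    // ... run the test ...
} // options and session are disposed here, even if an exception is thrown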
int trt_min_subgraph_size; // minimum node size in a subgraph after partitioning.
int trt_dump_subgraphs; // dump the subgraphs that are transformed into TRT engines in onnx format to the filesystem. Default 0 = false, nonzero = true
int trt_engine_cache_enable; // enable TensorRT engine caching. Default 0 = false, nonzero = true
const char* trt_cache_path; // specify path for TensorRT engine and profile files if engine_cache_enable is enabled, or INT8 calibration table file if trt_int8_enable is enabled.
} OrtTensorRTProviderOptions;
We should watch the extensibility of such structs and modify all languages that make use of this C API at the same time.
The size of the struct changes, and if the struct is used in compiled applications, then it is not binary compatible. We can't extend the structs without either versioning the API or not using structs at all.
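One common mitigation, sketched below (not something this PR implements), is to make the struct's first field record the size the caller compiled against, the Win32 cbSize idiom, so the native side can tell which fields are present. The struct name here is hypothetical:

using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct VersionedTrtOptionsSketch // hypothetical name, for illustration only
{
    public uint size; // caller sets Marshal.SizeOf<VersionedTrtOptionsSketch>();
                      // the native side reads only the fields that fit within 'size'
    public int trt_fp16_enable;
    public int trt_int8_enable;
    // New fields may only be appended, never reordered or removed, so binaries
    // compiled against an older layout stay compatible.
}

The alternative raised earlier in this thread, opaque structs created and mutated through API calls, sidesteps the problem entirely at the cost of more API surface.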
int trt_min_subgraph_size = 1;
bool trt_dump_subgraphs = false;
bool trt_engine_cache_enable = false;
std::string trt_cache_path = "";
// trt_options.trt_int8_enable = 1;
// trt_options.trt_int8_calibration_table_name = "calibration.flatbuffers";
// trt_options.trt_int8_use_native_calibration_table = 0;
public struct OrtTensorRTProviderOptions
size_t trt_max_workspace_size; // maximum workspace size for TensorRT.
int trt_fp16_enable; // enable TensorRT FP16 precision. Default 0 = false, nonzero = true
int trt_int8_enable; // enable TensorRT INT8 precision. Default 0 = false, nonzero = true
const char* trt_int8_calibration_table_name; // TensorRT INT8 calibration table name.
This PR is currently pending since we had some concerns regarding exposing the provider options struct.
Closing. An updated version replaced this one.
Description: Enable TRT provider option configuration for C#
Motivation and Context