
Add TensorRT timing cache feature #10297

Closed (wanted to merge 26 commits)
Conversation

@chilo-ms (Contributor) commented Jan 15, 2022

TensorRT provides a timing cache feature that reduces builder time by preserving layer profiling information gathered during the builder phase. This PR adds the timing cache feature to ORT-TRT.

Also, please note that ORT will no longer use the OrtTensorRTProviderOptions struct for the TRT EP when adding additional provider options. Instead, it uses the opaque OrtTensorRTProviderOptionsV2 struct internally for setting provider options, which can be converted to a string.
Please see #7808 and #10188 for more details and context.
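As a hedged illustration (not code from this PR): with the V2 options path, string-convertible provider options such as `trt_timing_cache_enable` can be passed from the Python API as a plain key/value dict. The helper below is a sketch; the exact option keys and accepted values are assumptions based on this PR and may differ across onnxruntime versions.

```python
# Sketch: building TensorRT EP provider options as string-convertible
# key/value pairs, in the spirit of the OrtTensorRTProviderOptionsV2
# approach described above. Option keys are assumptions and may differ
# between onnxruntime versions.

def trt_provider_options(enable_timing_cache=True, enable_engine_cache=False):
    """Return the (provider name, options dict) pair for InferenceSession."""
    options = {
        # All values are strings, since V2 options are string-convertible.
        "trt_timing_cache_enable": str(enable_timing_cache).lower(),
        "trt_engine_cache_enable": str(enable_engine_cache).lower(),
    }
    return ("TensorrtExecutionProvider", options)

providers = [trt_provider_options(), "CPUExecutionProvider"]

# With onnxruntime installed and a TRT-capable build, this would be used as:
# import onnxruntime as ort
# sess = ort.InferenceSession("model.onnx", providers=providers)
```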

@chilo-ms (Contributor, Author) commented Jan 15, 2022

Will add unit test for testing this feature.

@jywu-msft (Member) commented Jan 19, 2022

> Will add unit test for testing this feature.

Yes, we definitely need some test cases here. We also need to test in conjunction with the engine cache enabled/disabled.

@chilo-ms (Contributor, Author) commented Feb 5, 2022

> Will add unit test for testing this feature.
>
> yes, we definitely need some test cases here. need to test in conjunction with engine cache enabled/disabled as well.

Test cases for the timing cache have been added.

@stale stale bot commented Apr 16, 2022

This issue has been automatically marked as stale due to inactivity and will be closed in 7 days if no further activity occurs. If further support is needed, please provide an update and/or more details.

@stale stale bot added the stale issues that have not been addressed in a while; categorized by a bot label Apr 16, 2022
chilo-ms added a commit that referenced this pull request Mar 10, 2023
### Description

This will enable a user to use a TensorRT timing cache, based on #10297, to accelerate build times on a device with the same compute capability. It works across models, as it simply stores kernel runtimes for specific configurations. These files are usually very small (only a few MB), which makes them easy to ship with an application to accelerate build time on the user's end.

### Motivation and Context
Especially for workstation use cases, TRT build times can be a roadblock. With a few models from the ONNX model zoo, I evaluated the speedups when a timing cache is present:
`./build/onnxruntime_perf_test -e tensorrt -I -t 5 -i "trt_timing_cache_enable|true" <onnx_path>`

|Model | No cache | With cache|
| ------------- | ------------- | ------------- |
|efficientnet-lite4-11 | 34.6 s | 7.7 s|
|yolov4 | 108.62 s | 9.4 s|

To capture this, I had to modify onnxruntime_perf_test. The time is sometimes not captured within "Session creation time cost:", which is why I introduced "First inference time cost:".

---------

Co-authored-by: Chi Lo <[email protected]>
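The "First inference time cost:" idea can be sketched as follows. This is an illustrative helper, not the actual onnxruntime_perf_test change: since TensorRT may defer the engine build past session creation, the clock spans both session creation and the first run.

```python
# Sketch: measure "first inference time cost" -- session creation plus the
# first run together -- so that a deferred TensorRT engine build is always
# included. The helper and stub callables are illustrative assumptions,
# not onnxruntime_perf_test's real code.
import time

def first_inference_seconds(create_session, run_once):
    """Time session creation and the first inference as one interval."""
    start = time.perf_counter()
    session = create_session()  # may trigger the TensorRT engine build
    run_once(session)           # or the build may happen here instead
    return time.perf_counter() - start

# Stub usage in place of a real InferenceSession and run:
elapsed = first_inference_seconds(lambda: object(), lambda s: None)
```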
@gedoensmax (Contributor)

I think we can close this due to #14767, right?

@stale stale bot removed the stale issues that have not been addressed in a while; categorized by a bot label Mar 16, 2023
@chilo-ms (Contributor, Author)

> I think we can close this due to #14767, right?

Yes, we can.

@chilo-ms chilo-ms closed this Mar 16, 2023