TensorRT EP - timing cache #14767
Conversation
…vider options struct
# Conflicts:
#	include/onnxruntime/core/providers/tensorrt/tensorrt_provider_options.h
#	include/onnxruntime/core/session/onnxruntime_cxx_api.h
#	onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.cc
#	onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.h
#	onnxruntime/core/providers/tensorrt/tensorrt_execution_provider_info.cc
#	onnxruntime/core/providers/tensorrt/tensorrt_execution_provider_info.h
#	onnxruntime/core/providers/tensorrt/tensorrt_provider_factory.cc
#	onnxruntime/core/session/provider_bridge_ort.cc
#	onnxruntime/python/onnxruntime_pybind_state.cc
#	onnxruntime/test/perftest/ort_test_session.cc
#	onnxruntime/test/providers/cpu/model_tests.cc
#	onnxruntime/test/providers/tensorrt/tensorrt_basic_test.cc
/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux Nuphar CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, Windows CPU CI Pipeline, Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, onnxruntime-python-checks-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed
You have several pipelines (over 10) configured to build pull requests in this repository. Specify which pipelines you would like to run by using /azp run [pipelines] command. You can specify multiple pipelines using a comma separated list.
/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux Nuphar CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, Windows CPU CI Pipeline, Windows GPU CI Pipeline
/azp run Windows GPU TensorRT CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, onnxruntime-python-checks-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed
Azure Pipelines successfully started running 6 pipeline(s).
Azure Pipelines successfully started running 9 pipeline(s).
The CIs failed.
@chilo-ms could you run the pipeline again?
/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux Nuphar CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, Windows CPU CI Pipeline, Windows GPU CI Pipeline
/azp run Windows GPU TensorRT CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, onnxruntime-python-checks-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed
Azure Pipelines successfully started running 6 pipeline(s).
Azure Pipelines successfully started running 9 pipeline(s).
/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux Nuphar CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, Windows CPU CI Pipeline, Windows GPU CI Pipeline
/azp run Windows GPU TensorRT CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, onnxruntime-python-checks-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed
Azure Pipelines successfully started running 6 pipeline(s).
Azure Pipelines successfully started running 9 pipeline(s).
/azp run Linux QNN CI Pipeline
Azure Pipelines successfully started running 1 pipeline(s).
### Description
Patch #14767 so that the two provider options `force_timing_cache` and `detailed_build_log` can actually be updated; previously they always fell back to their default values. `timing_cache_enable` already works correctly.
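As a quick illustration, here is a minimal Python sketch of passing all three options as TensorRT EP provider options. The option spellings and the behavior noted in the comments follow this PR's discussion and should be verified against your onnxruntime release; `model.onnx` is a placeholder path.

```python
import onnxruntime as ort

# Hedged sketch: with the patch, all three options below are honored when
# passed as TensorRT EP provider options (previously force_timing_cache and
# detailed_build_log silently kept their defaults).
trt_options = {
    "trt_timing_cache_enable": True,  # load/save a timing cache (worked before the patch)
    "trt_force_timing_cache": True,   # now updatable: use the cache even if the device profile differs
    "trt_detailed_build_log": True,   # now updatable: verbose engine-build logging
}

sess = ort.InferenceSession(
    "model.onnx",  # placeholder path
    providers=[("TensorrtExecutionProvider", trt_options)],
)
```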
Description
This will enable a user to use a TensorRT timing cache, based on #10297, to accelerate build times on a device with the same compute capability. This works across models, as it simply stores kernel runtimes for specific configurations. Those files are usually very small (only a few MB), which makes them easy to ship with an application to accelerate build times on the user's end.
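As a rough illustration of the cross-model reuse, here is a hedged Python sketch. The model paths are placeholders, and using `trt_engine_cache_path` as the location for the timing cache file is an assumption that may differ by release.

```python
import time
import onnxruntime as ort

# Hedged sketch: reuse one timing cache across two different models on the
# same GPU. The second engine build should be faster once the cache exists.
trt_options = {
    "trt_timing_cache_enable": True,         # persist kernel timing results
    "trt_engine_cache_path": "./trt_cache",  # assumed location of the cache file
}

for model in ("model_a.onnx", "model_b.onnx"):  # placeholder models
    start = time.perf_counter()
    ort.InferenceSession(model, providers=[("TensorrtExecutionProvider", trt_options)])
    print(f"{model}: engine build took {time.perf_counter() - start:.2f}s")
```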
Motivation and Context
Especially for workstation use cases, TRT build times can be a roadblock. With a few models from the ONNX Model Zoo, I evaluated the speedup when a timing cache is present.
./build/onnxruntime_perf_test -e tensorrt -I -t 5 -i "trt_timing_cache_enable|true" <onnx_path>
To capture this, I had to modify onnxruntime_perf_test. The build time is sometimes not captured within "Session creation time cost:", which is why I introduced "First inference time cost:".
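For reference, a hedged Python sketch of the same measurement outside the perf test: since TensorRT engine building may not finish during session construction, the first run is timed separately, mirroring the new "First inference time cost:" metric. The model path and the fixed-shape float32 input are assumptions for illustration.

```python
import time
import numpy as np
import onnxruntime as ort

providers = [("TensorrtExecutionProvider", {"trt_timing_cache_enable": True})]

t0 = time.perf_counter()
sess = ort.InferenceSession("model.onnx", providers=providers)  # placeholder model
t1 = time.perf_counter()

# Build a dummy feed; assumes a float32 input, with dynamic dims set to 1.
inp = sess.get_inputs()[0]
shape = [d if isinstance(d, int) else 1 for d in inp.shape]
feed = {inp.name: np.zeros(shape, dtype=np.float32)}

sess.run(None, feed)  # engine build may happen here rather than above
t2 = time.perf_counter()

print(f"Session creation time cost: {t1 - t0:.2f}s")
print(f"First inference time cost: {t2 - t1:.2f}s")
```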