
TensorRT EP - timing cache #14767

Merged 35 commits on Mar 10, 2023

Conversation

gedoensmax (Contributor) commented Feb 22, 2023

Description

This enables a user to use a TensorRT timing cache, based on #10297, to accelerate build times on a device with the same compute capability. It works across models, as it simply stores kernel runtimes for specific configurations. These files are usually very small (only a few MB), which makes them very easy to ship with an application to accelerate build times on the user's end.
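As a minimal sketch of how the new option could be passed through the Python API (the option name `trt_timing_cache_enable` is from this PR; using `trt_engine_cache_path` as the directory holding the cache is an assumption for illustration, and a TensorRT-enabled onnxruntime-gpu build is required to actually create the session):

```python
# Hedged sketch: passing the timing-cache option as TensorRT EP provider
# options. "trt_timing_cache_enable" is the option added by this PR;
# "trt_engine_cache_path" as the cache directory is an assumption.
provider_options = {
    "trt_timing_cache_enable": True,
    "trt_engine_cache_path": "./trt_cache",  # hypothetical cache directory
}
providers = [("TensorrtExecutionProvider", provider_options)]

# Session creation needs a TensorRT-enabled build, so it is left commented
# out to keep the snippet self-contained:
# import onnxruntime as ort
# sess = ort.InferenceSession("model.onnx", providers=providers)
```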

Motivation and Context

Especially for workstation use cases, TRT build times can be a roadblock. Using a few models from the ONNX Model Zoo, I evaluated the speedup when a timing cache is present:
`./build/onnxruntime_perf_test -e tensorrt -I -t 5 -i "trt_timing_cache_enable|true" <onnx_path>`

| Model | No cache | With cache |
| --- | --- | --- |
| efficientnet-lite4-11 | 34.6 s | 7.7 s |
| yolov4 | 108.62 s | 9.4 s |

To capture this, I had to modify onnxruntime_perf_test. The build time is sometimes not captured within "Session creation time cost:", which is why I introduced "First inference time cost:".
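The distinction can be sketched as follows (a hedged toy illustration, not the actual perf_test code: `ToySession` stands in for a session whose expensive engine build happens lazily on the first run rather than in the constructor):

```python
import time

def timed(fn):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start

class ToySession:
    """Toy stand-in: the 'engine build' is deferred to the first run,
    mimicking why build time can show up under first inference rather
    than under session creation."""
    def __init__(self):
        self.built = False  # cheap constructor: build is deferred
    def run(self):
        if not self.built:
            self.built = True  # the expensive build would happen here
        return "output"

sess, creation_cost = timed(ToySession)       # "Session creation time cost"
_, first_inference_cost = timed(sess.run)     # "First inference time cost"
_, second_inference_cost = timed(sess.run)    # steady-state inference
```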

chilo-ms and others added 30 commits December 23, 2021 19:19
# Conflicts:
#	include/onnxruntime/core/providers/tensorrt/tensorrt_provider_options.h
#	include/onnxruntime/core/session/onnxruntime_cxx_api.h
#	onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.cc
#	onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.h
#	onnxruntime/core/providers/tensorrt/tensorrt_execution_provider_info.cc
#	onnxruntime/core/providers/tensorrt/tensorrt_execution_provider_info.h
#	onnxruntime/core/providers/tensorrt/tensorrt_provider_factory.cc
#	onnxruntime/core/session/provider_bridge_ort.cc
#	onnxruntime/python/onnxruntime_pybind_state.cc
#	onnxruntime/test/perftest/ort_test_session.cc
#	onnxruntime/test/providers/cpu/model_tests.cc
#	onnxruntime/test/providers/tensorrt/tensorrt_basic_test.cc
chilo-ms (Contributor) commented Mar 1, 2023

/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux Nuphar CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, Windows CPU CI Pipeline, Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, onnxruntime-python-checks-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed

@azure-pipelines

You have several pipelines (over 10) configured to build pull requests in this repository. Specify which pipelines you would like to run by using /azp run [pipelines] command. You can specify multiple pipelines using a comma separated list.

chilo-ms (Contributor) commented Mar 1, 2023

/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux Nuphar CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, Windows CPU CI Pipeline, Windows GPU CI Pipeline

chilo-ms (Contributor) commented Mar 1, 2023

/azp run Windows GPU TensorRT CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, onnxruntime-python-checks-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed

@azure-pipelines

Azure Pipelines successfully started running 6 pipeline(s).

@azure-pipelines

Azure Pipelines successfully started running 9 pipeline(s).

chilo-ms (Contributor) commented Mar 2, 2023

The CIs failed.
We should also add additional fields for OrtTensorRTProviderOptionsV2.
https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/python/onnxruntime_pybind_state.cc#L349

gedoensmax (Contributor, Author) commented:

@chilo-ms could you run the pipeline again?

chilo-ms (Contributor) commented Mar 6, 2023

/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux Nuphar CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, Windows CPU CI Pipeline, Windows GPU CI Pipeline

chilo-ms (Contributor) commented Mar 6, 2023

/azp run Windows GPU TensorRT CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, onnxruntime-python-checks-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed

@azure-pipelines

Azure Pipelines successfully started running 6 pipeline(s).

@azure-pipelines

Azure Pipelines successfully started running 9 pipeline(s).

chilo-ms (Contributor) commented Mar 9, 2023

/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux Nuphar CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, Windows CPU CI Pipeline, Windows GPU CI Pipeline

chilo-ms (Contributor) commented Mar 9, 2023

/azp run Windows GPU TensorRT CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, onnxruntime-python-checks-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed

@azure-pipelines

Azure Pipelines successfully started running 6 pipeline(s).

@azure-pipelines

Azure Pipelines successfully started running 9 pipeline(s).

microsoft deleted a comment from azure-pipelines bot Mar 9, 2023
chilo-ms (Contributor) commented Mar 9, 2023

/azp run Linux QNN CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

chilo-ms merged commit ad4db12 into microsoft:main Mar 10, 2023
chilo-ms added a commit that referenced this pull request Mar 21, 2023
### Description
Patch #14767 so that the two provider options `force_timing_cache` and `detailed_build_log` can be updated; otherwise they only use their default values. `timing_cache_enable` already works correctly.
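As a hedged sketch of what "can be updated" means here: provider options arrive as strings and must be parsed into boolean fields. The `trt_`-prefixed spellings below and the parsing helper are assumptions for illustration, not the actual onnxruntime code:

```python
def parse_trt_bool(value: str) -> bool:
    """Illustrative parser for string-valued provider options."""
    return value.strip().lower() in ("1", "true")

# Hypothetical user-supplied options (string-valued, as in provider options):
user_options = {
    "trt_timing_cache_enable": "true",
    "trt_force_timing_cache": "true",   # updatable only after this patch
    "trt_detailed_build_log": "false",  # updatable only after this patch
}
parsed = {k: parse_trt_bool(v) for k, v in user_options.items()}
```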
3 participants