From 2d68dbd2743ed1d0a31da43fc8ab227123f9c396 Mon Sep 17 00:00:00 2001 From: Dheeraj Peri Date: Tue, 10 Oct 2023 21:07:35 -0700 Subject: [PATCH 1/4] chore: update dynamo export doc Signed-off-by: Dheeraj Peri chore: fix bulleting Signed-off-by: Dheeraj Peri chore: Fix formatting chore: fix formatting Signed-off-by: Dheeraj Peri chore: fix formatting Signed-off-by: Dheeraj Peri chore: fix reference Signed-off-by: Dheeraj Peri --- docsrc/index.rst | 2 + docsrc/user_guide/dynamo_export.rst | 72 +++++++++++++++++++++++++++++ 2 files changed, 74 insertions(+) create mode 100644 docsrc/user_guide/dynamo_export.rst diff --git a/docsrc/index.rst b/docsrc/index.rst index d2f6c54b9d..97580541ea 100644 --- a/docsrc/index.rst +++ b/docsrc/index.rst @@ -41,6 +41,7 @@ User Guide * :ref:`creating_a_ts_mod` * :ref:`getting_started_with_fx` * :ref:`torch_compile` +* :ref:`dynamo_export` * :ref:`ptq` * :ref:`runtime` * :ref:`saving_models` @@ -56,6 +57,7 @@ User Guide user_guide/creating_torchscript_module_in_python user_guide/getting_started_with_fx_path user_guide/torch_compile + user_guide/dynamo_export user_guide/ptq user_guide/runtime user_guide/saving_models diff --git a/docsrc/user_guide/dynamo_export.rst b/docsrc/user_guide/dynamo_export.rst new file mode 100644 index 0000000000..587c2cd79c --- /dev/null +++ b/docsrc/user_guide/dynamo_export.rst @@ -0,0 +1,72 @@ +.. _dynamo_export: + +Torch-TensorRT (Dynamo) Backend +======================== +This guide presents Torch-TensorRT dynamo backend which compiles Pytorch programs +into TensorRT engines through torch dynamo. Pytorch 2.1 introduced ``torch.export`` APIs which +can export graphs from Pytorch programs using torch dynamo. Torch-TensorRT dynamo +backend compiles these exported graphs and optimizes them using TensorRT. Here's a simple +usage of the dynamo backend + +.. 
code-block:: python

    import torch
    import torch_tensorrt

    model = MyModel().eval().cuda()
    inputs = torch.randn((1, 3, 224, 224), dtype=torch.float32).cuda()
    exp_program = torch.export(model, inputs)
    trt_gm = torch_tensorrt.dynamo.compile(exp_program, inputs) # Output is a torch.fx.GraphModule
    trt_gm(inputs)

``torch_tensorrt.dynamo.compile`` is the main API for users to interact with Torch-TensorRT.
The input type of the model should be ``ExportedProgram`` (ideally the output of torch.export) and output types is a ``torch.fx.GraphModule`` object.

Customizations
---------------------------------------------

There are a lot of options for users to customize their settings for optimizing with TensorRT.
Some of the frequently used options are as follows:


* inputs - For static shapes, this can be a list of torch tensors or `torch_tensorrt.Input` objects. For dynamic shapes, this should be a list of ``torch_tensorrt.Input`` objects.
* enabled_precisions - Set of precisions that TensorRT builder can use during optimization.
* truncate_long_and_double - Truncates long and double values to int and floats respectively.
* torch_executed_ops - Operators which are forced to be executed by Torch.
* min_block_size - Minimum number of consecutive operators required to be executed as a TensorRT segment.

The complete list of options can be found `here `_
Note: We do not support INT precision currently in Dynamo. Support for this currently exists in
our Torchscript IR. We plan to implement similar support for dynamo in our next release.

Under the hood
--------------

Under the hood, ``torch_tensorrt.dynamo.compile`` performs the following on the graph.

* Lowering - Applies lowering passes to add/remove operators for optimal conversion.
* Partitioning - Partitions the graph into Pytorch and TensorRT segments based on the ``min_block_size`` and ``torch_executed_ops`` fields.
* Conversion - Pytorch ops get converted into TensorRT ops in this phase.
* Optimization - Post conversion, we build the TensorRT engine and embed this inside the Pytorch graph.

Tracing
-------

``torch_tensorrt.dynamo.trace`` can be used to trace Pytorch graphs and produce an ``ExportedProgram``.
This internally performs some decompositions of operators for downstream optimization.
The ``ExportedProgram`` can then be used with the ``torch_tensorrt.dynamo.compile`` API.
If you have dynamic input shapes in your model, you can use ``torch_tensorrt.dynamo.trace`` to export
the model with dynamic shapes. Alternatively, you can use ``torch.export`` `with constraints `_ directly as well.

.. code-block:: python

    import torch
    import torch_tensorrt

    inputs = torch_tensorrt.Input(min_shape=(1, 3, 224, 224),
                                  opt_shape=(4, 3, 224, 224),
                                  max_shape=(8, 3, 224, 224),
                                  dtype=torch.float32)
    model = MyModel().eval()
    exp_program = torch_tensorrt.dynamo.trace(model, inputs)
\ No newline at end of file

From 084856200f10b2dd8ebdc3ee24d9ee1a57add00b Mon Sep 17 00:00:00 2001
From: Dheeraj Peri
Date: Wed, 25 Oct 2023 15:39:58 -0700
Subject: [PATCH 2/4] chore: update formatting

Signed-off-by: Dheeraj Peri

---
 docsrc/user_guide/dynamo_export.rst | 50 +++++++++++++++++------------
 docsrc/user_guide/saving_models.rst | 17 ++++++++--
 2 files changed, 45 insertions(+), 22 deletions(-)

diff --git a/docsrc/user_guide/dynamo_export.rst b/docsrc/user_guide/dynamo_export.rst
index 587c2cd79c..c663525c04 100644
--- a/docsrc/user_guide/dynamo_export.rst
+++ b/docsrc/user_guide/dynamo_export.rst
@@ -1,11 +1,22 @@
.. _dynamo_export:

-Torch-TensorRT (Dynamo) Backend
-========================
-This guide presents Torch-TensorRT dynamo backend which compiles Pytorch programs
-into TensorRT engines through torch dynamo. Pytorch 2.1 introduced ``torch.export`` APIs which
-can export graphs from Pytorch programs using torch dynamo. 
Torch-TensorRT dynamo
-backend compiles these exported graphs and optimizes them using TensorRT. Here's a simple
+Torch-TensorRT Dynamo Backend
+=============================================
+.. currentmodule:: torch_tensorrt.dynamo
+
+.. automodule:: torch_tensorrt.dynamo
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+This guide presents the Torch-TensorRT dynamo backend, which optimizes Pytorch models
+using TensorRT in an Ahead-Of-Time fashion.
+
+Using the Dynamo backend
+----------------------------------------
+Pytorch 2.1 introduced ``torch.export`` APIs which
+can export graphs from Pytorch programs into ``ExportedProgram``s. Torch-TensorRT dynamo
+backend compiles these ``ExportedProgram``s and optimizes them using TensorRT. Here's a simple
usage of the dynamo backend

.. code-block:: python
@@ -14,26 +25,25 @@ usage of the dynamo backend

import torch
import torch_tensorrt

model = MyModel().eval().cuda()
-    inputs = torch.randn((1, 3, 224, 224), dtype=torch.float32).cuda()
-    exp_program = torch.export(model, inputs)
+    inputs = [torch.randn((1, 3, 224, 224), dtype=torch.float32).cuda()]
+    exp_program = torch.export.export(model, tuple(inputs))
trt_gm = torch_tensorrt.dynamo.compile(exp_program, inputs) # Output is a torch.fx.GraphModule
-    trt_gm(inputs)
+    trt_gm(*inputs)

-``torch_tensorrt.dynamo.compile`` is the main API for users to interact with Torch-TensorRT.
-The input type of the model should be ``ExportedProgram`` (ideally the output of torch.export) and output types is a ``torch.fx.GraphModule`` object.
+.. note:: ``torch_tensorrt.dynamo.compile`` is the main API for users to interact with the Torch-TensorRT dynamo backend. The input type of the model should be ``ExportedProgram`` (ideally the output of ``torch.export.export`` or ``torch_tensorrt.dynamo.trace`` (discussed in the section below)) and output type is a ``torch.fx.GraphModule`` object. 
-Customizations
----------------------------------------------
+Customizable Settings
+---------------------

There are a lot of options for users to customize their settings for optimizing with TensorRT.
Some of the frequently used options are as follows:


-* inputs - For static shapes, this can be a list of torch tensors or `torch_tensorrt.Input` objects. For dynamic shapes, this should be a list of ``torch_tensorrt.Input`` objects.
-* enabled_precisions - Set of precisions that TensorRT builder can use during optimization.
-* truncate_long_and_double - Truncates long and double values to int and floats respectively.
-* torch_executed_ops - Operators which are forced to be executed by Torch.
-* min_block_size - Minimum number of consecutive operators required to be executed as a TensorRT segment.
+* ``inputs`` - For static shapes, this can be a list of torch tensors or `torch_tensorrt.Input` objects. For dynamic shapes, this should be a list of ``torch_tensorrt.Input`` objects.
+* ``enabled_precisions`` - Set of precisions that TensorRT builder can use during optimization.
+* ``truncate_long_and_double`` - Truncates long and double values to int and floats respectively.
+* ``torch_executed_ops`` - Operators which are forced to be executed by Torch.
+* ``min_block_size`` - Minimum number of consecutive operators required to be executed as a TensorRT segment.

The complete list of options can be found `here `_
Note: We do not support INT precision currently in Dynamo. Support for this currently exists in
@@ -63,10 +73,10 @@ the model with dynamic shapes. 
Alternatively, you can use ``torch.export`` `with import torch import torch_tensorrt - inputs = torch_tensorrt.Input(min_shape=(1, 3, 224, 224), + inputs = [torch_tensorrt.Input(min_shape=(1, 3, 224, 224), opt_shape=(4, 3, 224, 224), max_shape=(8, 3, 224, 224), - dtype=torch.float32) + dtype=torch.float32)] model = MyModel().eval() exp_program = torch_tensorrt.dynamo.trace(model, inputs) \ No newline at end of file diff --git a/docsrc/user_guide/saving_models.rst b/docsrc/user_guide/saving_models.rst index 46fadcb905..00c3e45d7a 100644 --- a/docsrc/user_guide/saving_models.rst +++ b/docsrc/user_guide/saving_models.rst @@ -2,15 +2,24 @@ Saving models compiled with Torch-TensorRT ==================================== +.. currentmodule:: torch_tensorrt.dynamo +.. automodule:: torch_tensorrt.dynamo + :members: + :undoc-members: + :show-inheritance: + Saving models compiled with Torch-TensorRT varies slightly with the `ir` that has been used for compilation. -1) Dynamo IR +Dynamo IR +------------- Starting with 2.1 release of Torch-TensorRT, we are switching the default compilation to be dynamo based. The output of `ir=dynamo` compilation is a `torch.fx.GraphModule` object. There are two ways to save these objects a) Converting to Torchscript +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + `torch.fx.GraphModule` objects cannot be serialized directly. Hence we use `torch.jit.trace` to convert this into a `ScriptModule` object which can be saved to disk. The following code illustrates this approach. @@ -30,6 +39,8 @@ The following code illustrates this approach. model(inputs) b) ExportedProgram +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + `torch.export.ExportedProgram` is a new format introduced in Pytorch 2.1. After we compile a Pytorch module using Torch-TensorRT, the resultant `torch.fx.GraphModule` along with additional metadata can be used to create `ExportedProgram` which can be saved and loaded from disk. 
@@ -56,7 +67,9 @@ This is needed as `torch._export` serialization cannot handle serializing and de
NOTE: This way of saving the models using `ExportedProgram` is experimental. Here is a known issue: https://github.com/pytorch/TensorRT/issues/2341
-2) Torchscript IR
+
+Torchscript IR
+--------------
In Torch-TensorRT 1.X versions, the primary way to compile and run inference with Torch-TensorRT is using Torchscript IR.
This behavior stays the same in 2.X versions as well.

From c502a795c237eb2d56a925002ef7d3be3d6d3463 Mon Sep 17 00:00:00 2001
From: Dheeraj Peri
Date: Wed, 25 Oct 2023 16:33:11 -0700
Subject: [PATCH 3/4] chore: minor updates

Signed-off-by: Dheeraj Peri

---
 docsrc/user_guide/dynamo_export.rst | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/docsrc/user_guide/dynamo_export.rst b/docsrc/user_guide/dynamo_export.rst
index c663525c04..a5d430f8f2 100644
--- a/docsrc/user_guide/dynamo_export.rst
+++ b/docsrc/user_guide/dynamo_export.rst
@@ -15,9 +15,9 @@ Using the Dynamo backend
----------------------------------------
Pytorch 2.1 introduced ``torch.export`` APIs which
-can export graphs from Pytorch programs into ``ExportedProgram``s. Torch-TensorRT dynamo
-backend compiles these ``ExportedProgram``s and optimizes them using TensorRT. Here's a simple
-usage of the dynamo backend
+can export graphs from Pytorch programs into ``ExportedProgram`` objects. Torch-TensorRT dynamo
+backend compiles these ``ExportedProgram`` objects and optimizes them using TensorRT. Here's a simple
+usage of the dynamo backend

.. code-block:: python
@@ -38,7 +38,6 @@ Customizable Settings
There are a lot of options for users to customize their settings for optimizing with TensorRT.
Some of the frequently used options are as follows:
-
* ``inputs`` - For static shapes, this can be a list of torch tensors or `torch_tensorrt.Input` objects. 
For dynamic shapes, this should be a list of ``torch_tensorrt.Input`` objects. * ``enabled_precisions`` - Set of precisions that TensorRT builder can use during optimization. * ``truncate_long_and_double`` - Truncates long and double values to int and floats respectively. @@ -46,7 +45,8 @@ Some of the frequently used options are as follows: * ``min_block_size`` - Minimum number of consecutive operators required to be executed as a TensorRT segment. The complete list of options can be found `here `_ -Note: We do not support INT precision currently in Dynamo. Support for this currently exists in + +.. note:: We do not support INT precision currently in Dynamo. Support for this currently exists in our Torchscript IR. We plan to implement similar support for dynamo in our next release. Under the hood From cee625d9c5f90895bd08ac9f80a84c7bc3f279f9 Mon Sep 17 00:00:00 2001 From: Dheeraj Peri Date: Thu, 26 Oct 2023 10:26:25 -0700 Subject: [PATCH 4/4] chore: fix indexing Signed-off-by: Dheeraj Peri --- docsrc/user_guide/saving_models.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docsrc/user_guide/saving_models.rst b/docsrc/user_guide/saving_models.rst index 00c3e45d7a..3b50e7d761 100644 --- a/docsrc/user_guide/saving_models.rst +++ b/docsrc/user_guide/saving_models.rst @@ -1,4 +1,4 @@ -.. _runtime: +.. _saving_models: Saving models compiled with Torch-TensorRT ====================================
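A note on the dynamic-shape tracing example these patches document: the min/opt/max triple passed to ``torch_tensorrt.Input`` corresponds to a TensorRT optimization profile — any runtime shape between ``min_shape`` and ``max_shape`` is legal, and the engine is tuned for ``opt_shape``. As a rough, dependency-free sketch of that contract (the ``ShapeProfile`` class below is hypothetical, not part of the torch_tensorrt API):

```python
from dataclasses import dataclass

@dataclass
class ShapeProfile:
    """Hypothetical stand-in for the min/opt/max triple of torch_tensorrt.Input."""
    min_shape: tuple
    opt_shape: tuple
    max_shape: tuple

    def __post_init__(self):
        # The optimal shape must itself lie inside [min, max] in every dimension.
        assert self.accepts(self.opt_shape), "opt_shape must lie between min_shape and max_shape"

    def accepts(self, shape):
        # A runtime shape is valid when it has the same rank and every
        # dimension lies within the [min, max] range for that dimension.
        return len(shape) == len(self.min_shape) and all(
            lo <= dim <= hi
            for lo, dim, hi in zip(self.min_shape, shape, self.max_shape)
        )

# Mirrors the dynamic-shape example from the tracing section of the patch above.
profile = ShapeProfile(min_shape=(1, 3, 224, 224),
                       opt_shape=(4, 3, 224, 224),
                       max_shape=(8, 3, 224, 224))

print(profile.accepts((4, 3, 224, 224)))   # batch of 4 lies inside the profile
print(profile.accepts((16, 3, 224, 224)))  # batch of 16 exceeds max_shape
```

With real Torch-TensorRT, this bookkeeping is handled by the ``torch_tensorrt.Input`` objects passed to ``torch_tensorrt.dynamo.trace``; the sketch only illustrates why all three shapes are required.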