From 2d68dbd2743ed1d0a31da43fc8ab227123f9c396 Mon Sep 17 00:00:00 2001 From: Dheeraj Peri Date: Tue, 10 Oct 2023 21:07:35 -0700 Subject: [PATCH 1/4] chore: update dynamo export doc Signed-off-by: Dheeraj Peri chore: fix bulleting Signed-off-by: Dheeraj Peri chore: Fix formatting chore: fix formatting Signed-off-by: Dheeraj Peri chore: fix formatting Signed-off-by: Dheeraj Peri chore: fix reference Signed-off-by: Dheeraj Peri --- docsrc/index.rst | 2 + docsrc/user_guide/dynamo_export.rst | 72 +++++++++++++++++++++++++++++ 2 files changed, 74 insertions(+) create mode 100644 docsrc/user_guide/dynamo_export.rst diff --git a/docsrc/index.rst b/docsrc/index.rst index d2f6c54b9d..97580541ea 100644 --- a/docsrc/index.rst +++ b/docsrc/index.rst @@ -41,6 +41,7 @@ User Guide * :ref:`creating_a_ts_mod` * :ref:`getting_started_with_fx` * :ref:`torch_compile` +* :ref:`dynamo_export` * :ref:`ptq` * :ref:`runtime` * :ref:`saving_models` @@ -56,6 +57,7 @@ User Guide user_guide/creating_torchscript_module_in_python user_guide/getting_started_with_fx_path user_guide/torch_compile + user_guide/dynamo_export user_guide/ptq user_guide/runtime user_guide/saving_models diff --git a/docsrc/user_guide/dynamo_export.rst b/docsrc/user_guide/dynamo_export.rst new file mode 100644 index 0000000000..587c2cd79c --- /dev/null +++ b/docsrc/user_guide/dynamo_export.rst @@ -0,0 +1,72 @@ +.. _dynamo_export: + +Torch-TensorRT (Dynamo) Backend +======================== +This guide presents Torch-TensorRT dynamo backend which compiles Pytorch programs +into TensorRT engines through torch dynamo. Pytorch 2.1 introduced ``torch.export`` APIs which +can export graphs from Pytorch programs using torch dynamo. Torch-TensorRT dynamo +backend compiles these exported graphs and optimizes them using TensorRT. Here's a simple +usage of the dynamo backend + +.. 
code-block:: python

    import torch
    import torch_tensorrt

    model = MyModel().eval().cuda()
    inputs = torch.randn((1, 3, 224, 224), dtype=torch.float32).cuda()
    exp_program = torch.export(model, inputs)
    trt_gm = torch_tensorrt.dynamo.compile(exp_program, inputs) # Output is a torch.fx.GraphModule
    trt_gm(inputs)

``torch_tensorrt.dynamo.compile`` is the main API for users to interact with Torch-TensorRT.
The input type of the model should be ``ExportedProgram`` (ideally the output of torch.export) and output types is a ``torch.fx.GraphModule`` object.

Customizations
---------------------------------------------

There are a lot of options for users to customize their settings for optimizing with TensorRT.
Some of the frequently used options are as follows:


* inputs - For static shapes, this can be a list of torch tensors or `torch_tensorrt.Input` objects. For dynamic shapes, this should be a list of ``torch_tensorrt.Input`` objects.
* enabled_precisions - Set of precisions that TensorRT builder can use during optimization.
* truncate_long_and_double - Truncates long and double values to int and floats respectively.
* torch_executed_ops - Operators which are forced to be executed by Torch.
* min_block_size - Minimum number of consecutive operators required to be executed as a TensorRT segment.

The complete list of options can be found `here `_
Note: We do not support INT precision currently in Dynamo. Support for this currently exists in
our Torchscript IR. We plan to implement similar support for dynamo in our next release.

Under the hood
--------------

Under the hood, ``torch_tensorrt.dynamo.compile`` performs the following on the graph.

* Lowering - Applies lowering passes to add/remove operators for optimal conversion.
* Partitioning - Partitions the graph into Pytorch and TensorRT segments based on the ``min_block_size`` and ``torch_executed_ops`` fields.
* Conversion - Pytorch ops get converted into TensorRT ops in this phase.
* Optimization - Post conversion, we build the TensorRT engine and embed this inside the Pytorch graph.

Tracing
-------

``torch_tensorrt.dynamo.trace`` can be used to trace Pytorch graphs and produce an ``ExportedProgram``.
This internally performs some decompositions of operators for downstream optimization.
The ``ExportedProgram`` can then be used with the ``torch_tensorrt.dynamo.compile`` API.
If you have dynamic input shapes in your model, you can use ``torch_tensorrt.dynamo.trace`` to export
the model with dynamic shapes. Alternatively, you can use ``torch.export`` `with constraints `_ directly as well.

.. code-block:: python

    import torch
    import torch_tensorrt

    inputs = torch_tensorrt.Input(min_shape=(1, 3, 224, 224),
                                  opt_shape=(4, 3, 224, 224),
                                  max_shape=(8, 3, 224, 224),
                                  dtype=torch.float32)
    model = MyModel().eval()
    exp_program = torch_tensorrt.dynamo.trace(model, inputs)
\ No newline at end of file

From 084856200f10b2dd8ebdc3ee24d9ee1a57add00b Mon Sep 17 00:00:00 2001
From: Dheeraj Peri
Date: Wed, 25 Oct 2023 15:39:58 -0700
Subject: [PATCH 2/4] chore: update formatting

Signed-off-by: Dheeraj Peri

---
 docsrc/user_guide/dynamo_export.rst | 50 +++++++++++++++++------------
 docsrc/user_guide/saving_models.rst | 17 ++++++++--
 2 files changed, 45 insertions(+), 22 deletions(-)

diff --git a/docsrc/user_guide/dynamo_export.rst b/docsrc/user_guide/dynamo_export.rst
index 587c2cd79c..c663525c04 100644
--- a/docsrc/user_guide/dynamo_export.rst
+++ b/docsrc/user_guide/dynamo_export.rst
@@ -1,11 +1,22 @@
.. _dynamo_export:

-Torch-TensorRT (Dynamo) Backend
-========================
-This guide presents Torch-TensorRT dynamo backend which compiles Pytorch programs
-into TensorRT engines through torch dynamo. Pytorch 2.1 introduced ``torch.export`` APIs which
-can export graphs from Pytorch programs using torch dynamo. 
Torch-TensorRT dynamo
-backend compiles these exported graphs and optimizes them using TensorRT. Here's a simple
+Torch-TensorRT Dynamo Backend
+=============================================
+.. currentmodule:: torch_tensorrt.dynamo
+
+.. automodule:: torch_tensorrt.dynamo
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+This guide presents the Torch-TensorRT dynamo backend, which optimizes Pytorch models
+using TensorRT in an Ahead-Of-Time fashion.
+
+Using the Dynamo backend
+----------------------------------------
+Pytorch 2.1 introduced ``torch.export`` APIs which
+can export graphs from Pytorch programs into ``ExportedProgram``s. Torch-TensorRT dynamo
+backend compiles these ``ExportedProgram``s and optimizes them using TensorRT. Here's a simple
usage of the dynamo backend

.. code-block:: python
@@ -14,26 +25,25 @@ usage of the dynamo backend

import torch
import torch_tensorrt

model = MyModel().eval().cuda()
-    inputs = torch.randn((1, 3, 224, 224), dtype=torch.float32).cuda()
-    exp_program = torch.export(model, inputs)
+    inputs = [torch.randn((1, 3, 224, 224), dtype=torch.float32).cuda()]
+    exp_program = torch.export.export(model, tuple(inputs))
trt_gm = torch_tensorrt.dynamo.compile(exp_program, inputs) # Output is a torch.fx.GraphModule
-    trt_gm(inputs)
+    trt_gm(*inputs)

-``torch_tensorrt.dynamo.compile`` is the main API for users to interact with Torch-TensorRT.
-The input type of the model should be ``ExportedProgram`` (ideally the output of torch.export) and output types is a ``torch.fx.GraphModule`` object.
+.. note:: ``torch_tensorrt.dynamo.compile`` is the main API for users to interact with the Torch-TensorRT dynamo backend. The input type of the model should be ``ExportedProgram`` (ideally the output of ``torch.export.export`` or ``torch_tensorrt.dynamo.trace`` (discussed in the section below)) and output type is a ``torch.fx.GraphModule`` object. 
-Customizations
----------------------------------------------
+Customizable Settings
+---------------------

There are a lot of options for users to customize their settings for optimizing with TensorRT.
Some of the frequently used options are as follows:


-* inputs - For static shapes, this can be a list of torch tensors or `torch_tensorrt.Input` objects. For dynamic shapes, this should be a list of ``torch_tensorrt.Input`` objects.
-* enabled_precisions - Set of precisions that TensorRT builder can use during optimization.
-* truncate_long_and_double - Truncates long and double values to int and floats respectively.
-* torch_executed_ops - Operators which are forced to be executed by Torch.
-* min_block_size - Minimum number of consecutive operators required to be executed as a TensorRT segment.
+* ``inputs`` - For static shapes, this can be a list of torch tensors or `torch_tensorrt.Input` objects. For dynamic shapes, this should be a list of ``torch_tensorrt.Input`` objects.
+* ``enabled_precisions`` - Set of precisions that TensorRT builder can use during optimization.
+* ``truncate_long_and_double`` - Truncates long and double values to int and floats respectively.
+* ``torch_executed_ops`` - Operators which are forced to be executed by Torch.
+* ``min_block_size`` - Minimum number of consecutive operators required to be executed as a TensorRT segment.

The complete list of options can be found `here `_
Note: We do not support INT precision currently in Dynamo. Support for this currently exists in
@@ -63,10 +73,10 @@ the model with dynamic shapes. 
Alternatively, you can use ``torch.export`` `with import torch import torch_tensorrt - inputs = torch_tensorrt.Input(min_shape=(1, 3, 224, 224), + inputs = [torch_tensorrt.Input(min_shape=(1, 3, 224, 224), opt_shape=(4, 3, 224, 224), max_shape=(8, 3, 224, 224), - dtype=torch.float32) + dtype=torch.float32)] model = MyModel().eval() exp_program = torch_tensorrt.dynamo.trace(model, inputs) \ No newline at end of file diff --git a/docsrc/user_guide/saving_models.rst b/docsrc/user_guide/saving_models.rst index 46fadcb905..00c3e45d7a 100644 --- a/docsrc/user_guide/saving_models.rst +++ b/docsrc/user_guide/saving_models.rst @@ -2,15 +2,24 @@ Saving models compiled with Torch-TensorRT ==================================== +.. currentmodule:: torch_tensorrt.dynamo +.. automodule:: torch_tensorrt.dynamo + :members: + :undoc-members: + :show-inheritance: + Saving models compiled with Torch-TensorRT varies slightly with the `ir` that has been used for compilation. -1) Dynamo IR +Dynamo IR +------------- Starting with 2.1 release of Torch-TensorRT, we are switching the default compilation to be dynamo based. The output of `ir=dynamo` compilation is a `torch.fx.GraphModule` object. There are two ways to save these objects a) Converting to Torchscript +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + `torch.fx.GraphModule` objects cannot be serialized directly. Hence we use `torch.jit.trace` to convert this into a `ScriptModule` object which can be saved to disk. The following code illustrates this approach. @@ -30,6 +39,8 @@ The following code illustrates this approach. model(inputs) b) ExportedProgram +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + `torch.export.ExportedProgram` is a new format introduced in Pytorch 2.1. After we compile a Pytorch module using Torch-TensorRT, the resultant `torch.fx.GraphModule` along with additional metadata can be used to create `ExportedProgram` which can be saved and loaded from disk. 
@@ -56,7 +67,9 @@ This is needed as `torch._export` serialization cannot handle serializing and de
NOTE: This way of saving the models using `ExportedProgram` is experimental. Here is a known issue: https://github.com/pytorch/TensorRT/issues/2341
-2) Torchscript IR
+
+Torchscript IR
+--------------
In Torch-TensorRT 1.X versions, the primary way to compile and run inference with Torch-TensorRT is using Torchscript IR.
This behavior stays the same in 2.X versions as well.

From c502a795c237eb2d56a925002ef7d3be3d6d3463 Mon Sep 17 00:00:00 2001
From: Dheeraj Peri
Date: Wed, 25 Oct 2023 16:33:11 -0700
Subject: [PATCH 3/4] chore: minor updates

Signed-off-by: Dheeraj Peri

---
 docsrc/user_guide/dynamo_export.rst | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/docsrc/user_guide/dynamo_export.rst b/docsrc/user_guide/dynamo_export.rst
index c663525c04..a5d430f8f2 100644
--- a/docsrc/user_guide/dynamo_export.rst
+++ b/docsrc/user_guide/dynamo_export.rst
@@ -15,9 +15,9 @@ Using the Dynamo backend
----------------------------------------
Pytorch 2.1 introduced ``torch.export`` APIs which
-can export graphs from Pytorch programs into ``ExportedProgram``s. Torch-TensorRT dynamo
-backend compiles these ``ExportedProgram``s and optimizes them using TensorRT. Here's a simple
-usage of the dynamo backend
+can export graphs from Pytorch programs into ``ExportedProgram`` objects. Torch-TensorRT dynamo
+backend compiles these ``ExportedProgram`` objects and optimizes them using TensorRT. Here's a simple
+usage of the dynamo backend

.. code-block:: python
@@ -38,7 +38,6 @@ Customizable Settings
There are a lot of options for users to customize their settings for optimizing with TensorRT.
Some of the frequently used options are as follows:
-
* ``inputs`` - For static shapes, this can be a list of torch tensors or `torch_tensorrt.Input` objects. 
For dynamic shapes, this should be a list of ``torch_tensorrt.Input`` objects. * ``enabled_precisions`` - Set of precisions that TensorRT builder can use during optimization. * ``truncate_long_and_double`` - Truncates long and double values to int and floats respectively. @@ -46,7 +45,8 @@ Some of the frequently used options are as follows: * ``min_block_size`` - Minimum number of consecutive operators required to be executed as a TensorRT segment. The complete list of options can be found `here `_ -Note: We do not support INT precision currently in Dynamo. Support for this currently exists in + +.. note:: We do not support INT precision currently in Dynamo. Support for this currently exists in our Torchscript IR. We plan to implement similar support for dynamo in our next release. Under the hood From cee625d9c5f90895bd08ac9f80a84c7bc3f279f9 Mon Sep 17 00:00:00 2001 From: Dheeraj Peri Date: Thu, 26 Oct 2023 10:26:25 -0700 Subject: [PATCH 4/4] chore: fix indexing Signed-off-by: Dheeraj Peri --- docsrc/user_guide/saving_models.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docsrc/user_guide/saving_models.rst b/docsrc/user_guide/saving_models.rst index 00c3e45d7a..3b50e7d761 100644 --- a/docsrc/user_guide/saving_models.rst +++ b/docsrc/user_guide/saving_models.rst @@ -1,4 +1,4 @@ -.. _runtime: +.. _saving_models: Saving models compiled with Torch-TensorRT ====================================
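A note on the dynamic-shape tracing example these patches document: the min/opt/max triple passed to ``torch_tensorrt.Input`` corresponds to a TensorRT optimization profile — any runtime shape between ``min_shape`` and ``max_shape`` is legal, and the engine is tuned for ``opt_shape``. As a rough, dependency-free sketch of that contract (the ``ShapeProfile`` class below is hypothetical, not part of the torch_tensorrt API):

```python
from dataclasses import dataclass

@dataclass
class ShapeProfile:
    """Hypothetical stand-in for the min/opt/max triple of torch_tensorrt.Input."""
    min_shape: tuple
    opt_shape: tuple
    max_shape: tuple

    def __post_init__(self):
        # The optimal shape must itself lie inside [min, max] in every dimension.
        assert self.accepts(self.opt_shape), "opt_shape must lie between min_shape and max_shape"

    def accepts(self, shape):
        # A runtime shape is valid when it has the same rank and every
        # dimension lies within the [min, max] range for that dimension.
        return len(shape) == len(self.min_shape) and all(
            lo <= dim <= hi
            for lo, dim, hi in zip(self.min_shape, shape, self.max_shape)
        )

# Mirrors the dynamic-shape example from the tracing section of the patch above.
profile = ShapeProfile(min_shape=(1, 3, 224, 224),
                       opt_shape=(4, 3, 224, 224),
                       max_shape=(8, 3, 224, 224))

print(profile.accepts((4, 3, 224, 224)))   # batch of 4 lies inside the profile
print(profile.accepts((16, 3, 224, 224)))  # batch of 16 exceeds max_shape
```

With real Torch-TensorRT, this bookkeeping is handled by the ``torch_tensorrt.Input`` objects passed to ``torch_tensorrt.dynamo.trace``; the sketch only illustrates why all three shapes are required.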