In the tools/pytorch-quantization/examples/torchvision/classification_flow.py file, on line 42, custom (Nvidia-modified) PyTorch models are imported. However, the code defining these models was later nested further under a classification directory, so the correct import path is models.classification. In the same file, a check determines whether a custom implementation exists for a model (specified by name) or whether the script should fall back to the torchvision implementation. Because of the stale path, the script always falls back to the torchvision implementation, without any warning to the user. The modified models were authored precisely because they make minor changes that allow quantization of residual connections (during TRT export); by falling back to the torchvision implementation, the residual connections are left unquantized, resulting in a slower model.
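To make the failure mode concrete, here is a minimal sketch of the lookup-and-fallback pattern described above. The function and module names are illustrative assumptions, not the script's actual API; the point is that a stale custom-module path silently routes every lookup to the fallback, and that emitting a warning on the fallback path would have surfaced the bug:

```python
import importlib
import warnings


def load_model_fn(name, custom_module="models.classification",
                  fallback_module="torchvision.models"):
    """Return the custom implementation of `name` if one exists,
    otherwise warn and fall back to the stock implementation.

    Hypothetical helper illustrating the check described in the issue;
    the default module names are assumptions based on the description.
    """
    try:
        # With the stale path ("models" instead of "models.classification"),
        # this import fails and the custom branch is never taken.
        custom = importlib.import_module(custom_module)
        if hasattr(custom, name):
            return getattr(custom, name)
    except ImportError:
        pass
    # The original script falls through here silently; warning makes the
    # fallback visible to the user.
    warnings.warn(f"No custom implementation of '{name}' in {custom_module}; "
                  f"falling back to {fallback_module} (residuals unquantized)")
    return getattr(importlib.import_module(fallback_module), name)
```

With the corrected path, resnet50 would resolve to the Nvidia-modified class; with the broken one, every model name resolves to torchvision, which is exactly the behavior observed.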
The first figure below is a Netron visualization of a block in a ResNet-50 ONNX model generated by the current implementation of the classification_flow.py script.
The second figure, below, highlights the QuantizeLinear and DequantizeLinear layers that the custom implementation adds along the residual path:
Environment
N/A, this is a universal issue.
Relevant Files
In this Google Drive folder, I have included the above screenshots as well as the ONNX files used to generate them.
Steps To Reproduce
Run the Nvidia-authored classification_flow.py already included in the repository. Then either insert a pdb breakpoint at line 150 to verify that the custom-model path is never reached, or visualize the exported ONNX file and verify the absence of QuantizeLinear and DequantizeLinear layers along residual paths.
Thanks @Jeremalloch for the catch. Yes, we reorganized the folder structure, and classification_flow.py can no longer find the correct ResNet. I have just merged your code, thanks again!