🐛 [Bug] Encountered bug when using Torch-TensorRT with torchscript model Conformer Transducer #2197
Full debug log:
Updated the code for full reproduction. P.S.: not a single Conformer model from NeMo compiles to TensorRT.
@gs-olive can you take a look at this NeMo model?
I am able to reproduce this error in the TorchScript path on the latest
Thanks! Do I understand correctly that if I compile this model from native PyTorch to TensorRT, it might work? Or is the problem in the Conformer architecture itself?
The issue does not seem to be with the Conformer architecture itself, since inference in plain PyTorch works and the model scripts to TorchScript successfully. There is a possibility that PyTorch --> ONNX --> TensorRT might work, yes. I have verified that with #2228 and #2234, we are able to compile this model with
Regarding the TorchScript path, the bug occurs on this line, where the shape of
After further investigation on this issue, we may be able to compile this model via the following wrapper class:

```python
import nemo.collections.asr as nemo_asr
import torch


class ModelWrapper(torch.nn.Module):
    def __init__(self, *args, **kwargs) -> None:
        super().__init__(*args, **kwargs)
        self.nemo_model = nemo_asr.models.EncDecRNNTBPEModel.from_pretrained(
            model_name="stt_en_conformer_transducer_small"
        )
        self.nemo_model.freeze()
        self.nemo_model.eval().cuda()

    def forward(self, x, y):
        return self.nemo_model(processed_signal=x,
                               processed_signal_length=y)
```

I will follow up on this issue again as these PRs and improvements are merged.
Hello - the referenced PRs have been merged, and the model building/serialization is now functional for this model in the Dynamo path! The script I used to compile, serialize, and reload the model can be found below:

Code sample:

```python
import nemo.collections.asr as nemo_asr
import torch
import torch_tensorrt as torchtrt

batch_size = 1

inputs = [
    torchtrt.Input(shape=[batch_size, 80, 8269]),
    torchtrt.Input(shape=[batch_size]),
]
torch_inputs = [torch.rand([batch_size, 80, 8269]).cuda(),
                torch.rand([batch_size]).cuda()]


class Wrapper(torch.nn.Module):
    def __init__(self, *args, **kwargs) -> None:
        super().__init__(*args, **kwargs)
        self.nemo_model = nemo_asr.models.EncDecRNNTBPEModel.from_pretrained(
            model_name="stt_en_conformer_transducer_small"
        )
        self.nemo_model.freeze()
        self.nemo_model.eval().cuda()

    def forward(self, x, y):
        return self.nemo_model(processed_signal=x,
                               processed_signal_length=y)


# Trace the model through to FX
nemo_model = Wrapper().eval().cuda()
fx_graphmodule = torch.fx.experimental.proxy_tensor.make_fx(nemo_model)(*torch_inputs)

compile_settings = {
    "inputs": inputs,
    "enabled_precisions": {torch.float, torch.half},
    "truncate_long_and_double": True,
    "min_block_size": 85,
}

# Compile TRT-optimized model
trt_fx_module = torchtrt.compile(fx_graphmodule, ir="dynamo", **compile_settings)
trt_out = trt_fx_module(*torch_inputs)

# Trace through the output model with TorchScript
# Serialize and save the resultant graph
trt_script_model = torch.jit.trace(trt_fx_module, torch_inputs)
torch.jit.save(trt_script_model, "trt_model.ts")

# Reload model from save and perform inference
reloaded_model = torch.jit.load("trt_model.ts").cuda()
trt_reloaded_out = reloaded_model(*torch_inputs)
```

Not all of the operators in the graph have converters currently, and I believe there are roughly 16 TRT engines generated as a result. If having full graph support for this model is important, please either let me know or file new issues, so each of the missing operators can be implemented. Additionally, please let me know if the compilation is functional on your machine!
@gs-olive hello! Image: nvcr.io/nvidia/tensorrt:23.07-py3
Thanks for the follow-up. Based on the logs, it seems that compilation succeeded but model serialization did not. As suggested by @peri044 - could you add
It works, thanks! (For a [1, 80, 8269] tensor, the model weighs 1.4 GB.)
Hello - based on the logs, I believe the large model size is due to segmentation, since there seem to be some operators which we don't currently have converters for in this model. Could you specify
conf_trt_ts_debug.log

Is that enough? If you export this model to TorchScript or ONNX, it decomposes into two files: encoder.ts and decoder.ts.
Yes, this is very helpful, thank you - it looks like we are missing the
Bug Description

I get an error when converting a Conformer Transducer encoder to TensorRT (ASR task).

To Reproduce

requirements.txt

CODE:

CONSOLE:

Expected behavior

I expect a TensorRT file as the output.

Environment

PyTorch install method (conda, pip, libtorch, source): pip

Additional context

I want to export the encoder and decoder of the Conformer Transducer model from TorchScript to TensorRT.