
Convert a static-size ONNX model to TensorRT for multi-batch inference #3632

Closed
ninono12345 opened this issue Jan 24, 2024 · 4 comments

@ninono12345

Description

Hello, this is not a bug report, more like asking for advice.

I have an ONNX model that's used in a tracking algorithm, but the model doesn't accept batched input: if 4 objects are being tracked, the model processes the frame 4 times instead of receiving a single tensor with a batch of 4.

My question is: is it possible to convert that ONNX model to TensorRT in a way that makes it accept multiple batches at once? For example, if the expected input is 1x3x224x224, could I send in 4x3x224x224 and infer all 4 frames with a single call to the engine?

Thank you

@ninono12345
Author

I understand that to some of you this might sound like a stupid question, but I have had so many problems just converting a model to ONNX, then to TensorRT, and making it work that my head hurts. Now I have to dig deep again just to find an answer to this. Does anybody know how to do it?

Thank you

@zerollzeng
Collaborator

For some ONNX models it's doable, and it can be done quickly with Polygraphy. See https://github.com/NVIDIA/TensorRT/tree/main/tools/Polygraphy/examples/cli/surgeon/03_modifying_input_shapes
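
The linked example drives this through Polygraphy's `surgeon sanitize` subcommand. For readers who prefer scripting it, here is a minimal sketch of the same idea using the `onnx` Python package directly; the file names and the symbolic dimension name are illustrative assumptions, not something from this thread:

```python
# Rough sketch (assumed names): make the batch dimension of an ONNX model
# symbolic so TensorRT treats it as dynamic.
import onnx

model = onnx.load("model.onnx")

# Rename dim 0 of every graph input and output to a symbolic name.
for value in list(model.graph.input) + list(model.graph.output):
    dim0 = value.type.tensor_type.shape.dim[0]
    dim0.dim_param = "batch"  # setting dim_param clears any fixed dim_value

onnx.checker.check_model(model)
onnx.save(model, "model_dynamic_batch.onnx")
```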

Some models won't work, though. For example, if your model has a Reshape node that reshapes to a fixed shape, then you cannot change the input batch/shape; the engine build would fail otherwise.
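
Once the batch dimension is dynamic, the TensorRT engine also needs an optimization profile covering the batch range. A minimal sketch with the TensorRT Python API follows; the input name `"input"` and the shape ranges are assumptions for illustration (query `network.get_input(0).name` for the real name):

```python
# Rough sketch: build a TensorRT engine whose optimization profile covers
# batch sizes 1..8 for an ONNX model with a dynamic batch dimension.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("model_dynamic_batch.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise SystemExit("ONNX parse failed")

config = builder.create_builder_config()
profile = builder.create_optimization_profile()
# min / opt / max shapes for the (assumed) input "input".
profile.set_shape("input",
                  (1, 3, 224, 224),
                  (4, 3, 224, 224),
                  (8, 3, 224, 224))
config.add_optimization_profile(profile)

serialized = builder.build_serialized_network(network, config)
with open("model.engine", "wb") as f:
    f.write(serialized)
```

The same result can be had from the command line with trtexec's shape flags, e.g. `trtexec --onnx=model_dynamic_batch.onnx --minShapes=input:1x3x224x224 --optShapes=input:4x3x224x224 --maxShapes=input:8x3x224x224`.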

@zerollzeng zerollzeng self-assigned this Jan 27, 2024
@zerollzeng zerollzeng added the triaged Issue has been triaged by maintainers label Jan 27, 2024
@ninono12345
Author

@zerollzeng I found out that the model can be modified in a few places, and I managed to make it take and return multiple batches. But now I'm facing a different problem: when adding batches to a TensorRT engine, the engine's inference time slows down significantly... #3646

@zerollzeng
Collaborator

Let's discuss the new issue in #3646

I'm closing this.
