
Support UINT8 input / output and casting from UINT8 to FP16 and back #3026

Closed

MatthieuToulemont opened this issue May 31, 2023 · 16 comments

Labels: triaged (Issue has been triaged by maintainers)

@MatthieuToulemont commented May 31, 2023

Hello,

Thanks for the great work!

I am currently having trouble serving TensorRT models in Triton Inference Server when they take very large image inputs and return very large image outputs, because I need to send them as Float16.

Ideally I would like to send them as UINT8, since that is what my image decoder outputs.

The only workaround I have found is to use an INT8 input / output and do the conversion to Float16 inside the model.
To account for the sign wrap-around in the UINT8 -> INT8 reinterpretation, I need to run:

import torch

# Recover the original UINT8 values from the INT8-reinterpreted input
image = image.to(torch.float16)
image[image <= -1] = image[image <= -1] + 256  # undo the signed wrap-around

and for the output:

# Fold values back into the signed INT8 range before casting
result[result >= 128] = result[result >= 128] - 256
result = result.to(torch.int8)

But this is not compatible with TensorRT: the masked assignments export to ONNX as ScatterND, which errors out in TensorRT.

We've found a way around this by implementing a CUDA plugin that performs these transformations for us. However, we think it would make sense to support UINT8 inputs and outputs directly, since that is what image decoders usually produce.
It would also make sense to support casting UINT8 to FP16 and FP16 back to UINT8.

On a side note, it would be great to be able to use INT8 for inputs / outputs only, without expanding the tactic search to INT8-based tactics.
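
As a quick sanity check of the wrap-around arithmetic, here is a self-contained sketch; the uint8 -> int8 bit reinterpretation below stands in for whatever the serving boundary actually does:

import torch

# Simulate the serving boundary: the decoder's UINT8 buffer is
# reinterpreted bit-for-bit as INT8 before it reaches the model.
orig = torch.arange(256, dtype=torch.int64).to(torch.uint8)  # all 256 byte values
image = orig.view(torch.int8).to(torch.float16)

# Input fix-up from above: values that wrapped negative get +256.
image[image <= -1] = image[image <= -1] + 256
assert torch.equal(image.to(torch.uint8), orig)

# Output path: fold FP16 values back into the signed INT8 range.
result = image.clone()
result[result >= 128] = result[result >= 128] - 256
result = result.to(torch.int8)
assert torch.equal(result.view(torch.uint8), orig)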

zerollzeng self-assigned this May 31, 2023
zerollzeng added the triaged label May 31, 2023
@david-PHR
The trtexec program doesn't list any UINT8 datatype.

I've tried --inputIOFormats=uint8:chw --outputIOFormats=uint8:chw, but it returns this error: [05/31/2023-15:35:54] [E] Invalid DataType uint8

@MatthieuToulemont (Author)

I think it's just that NVIDIA didn't update trtexec after updating TensorRT. Even the trtexec documentation does not include uint8 among the valid input / output types.

@MatthieuToulemont (Author)

@zerollzeng could we make sure the trtexec CLI is updated to match the types TensorRT actually supports?
This is really not the best developer experience, to be honest.

@zerollzeng (Collaborator)

Let me check this internally and see if we can improve this :-)

@MatthieuToulemont (Author) commented Jun 1, 2023

Thank you!

Maybe we should label this issue as a bug? The CLI is clearly not behaving as it should, right?

@zerollzeng (Collaborator)

Filed internal bug 4146128 to support uint8 in trtexec.

As a workaround, you can try Polygraphy; uint8 should be supported in the Polygraphy tools.

@MatthieuToulemont (Author)

Thank you. To be more precise: you can use uint8 with trtexec if the ONNX graph has uint8 inputs and/or outputs, but adding the arguments --inputIOFormats=uint8:chw --outputIOFormats=uint8:chw to trtexec triggers the following error: [E] Invalid DataType uint8.

@zerollzeng (Collaborator)

We added support for --inputIOFormats=uint8:chw --outputIOFormats=uint8:chw; it will be shipped in the next release.

I'm closing this now. Feel free to reopen if you have any further questions.
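
For reference, once that release ships, an invocation along these lines should exercise the new formats (model.onnx and --fp16 are illustrative placeholders; only the two I/O-format flags come from this thread):

trtexec --onnx=model.onnx --fp16 --inputIOFormats=uint8:chw --outputIOFormats=uint8:chw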

@onurtore commented Oct 5, 2023

Has this shipped yet?

@zerollzeng (Collaborator)

Yes, it should be in the latest 8.6 or 9.0 release.

@onurtore

@zerollzeng Thanks

@hyperf0cus commented Jan 24, 2024

@zerollzeng How can I do this with the Python API?

@zerollzeng (Collaborator)

You can also take the Polygraphy source code as a reference, since it's open source.
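
For later readers, a minimal sketch of the Python-API side: it assumes TensorRT 8.6+, where trt.uint8 is available; names and shapes are placeholders, and the identity-layer cast follows the pattern the TensorRT developer guide describes for UINT8 I/O, so treat it as a sketch rather than a verbatim recipe:

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))

# UINT8 is only legal at the network boundary, so declare the input as
# uint8 and cast it to FP16 immediately via an identity layer.
image = network.add_input("image", trt.uint8, (1, 3, 1024, 1024))
to_fp16 = network.add_identity(image)
to_fp16.set_output_type(0, trt.float16)
x = to_fp16.get_output(0)

# ... build the rest of the network on x ...

# Mirror the pattern on the way out: an identity layer casting back to
# uint8 as the last layer before the output is marked.
to_uint8 = network.add_identity(x)
to_uint8.set_output_type(0, trt.uint8)
network.mark_output(to_uint8.get_output(0))

# Note: FP16 execution still has to be enabled in the builder config
# (e.g. config.set_flag(trt.BuilderFlag.FP16)) for the cast to pay off.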
