Support UINT8 input / output and casting from UINT8 to FP16 and back #3026
Comments
I've tried with
I think it's just that NVIDIA didn't update trtexec after updating TensorRT. Even the documentation of trtexec does not include uint8 among the valid input / output types.
@zerollzeng could we make sure the trtexec CLI is updated and conforms to the supported types in TensorRT?
Let me check this internally and see if we can improve this :-)
Thank you! Maybe we should label this issue as a bug? The CLI is clearly not behaving as it should, right?
Filed internal bug 4146128 to support uint8 in trtexec. To work around this you can try polygraphy; we should support it in the polygraphy tools.
Thank you. To be more precise, you can use uint8 with trtexec if the ONNX graph has uint8 inputs and/or outputs, by adding the following arguments to trtexec:
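Assuming the standard trtexec I/O-format options and a placeholder `model.onnx` (the exact flags from the comment are a reconstruction, not a quote), such an invocation looks roughly like:

```
trtexec --onnx=model.onnx \
        --inputIOFormats=uint8:chw \
        --outputIOFormats=uint8:chw \
        --fp16
```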
We added support for uint8 in trtexec. I'm closing this now, feel free to reopen if you have any further questions.
Is this shipped yet?
Yes, it should be in the latest 8.6 or 9.0 release.
@zerollzeng Thanks
@zerollzeng How can I do this with the Python API?
You can also take the polygraphy source code as a reference, since it's open-source.
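As a rough sketch (not the exact polygraphy code), the same can be done through the TensorRT Python API by setting the dtype of the network's I/O tensors before building the engine. TensorRT >= 8.6 and the `model.onnx` / `model.engine` paths are assumptions here:

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

# Parse the ONNX model (placeholder path).
with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)

# Mark the bound I/O tensors as UINT8, mirroring trtexec's
# --inputIOFormats=uint8:chw / --outputIOFormats=uint8:chw.
network.get_input(0).dtype = trt.DataType.UINT8
network.get_output(0).dtype = trt.DataType.UINT8

# Build and save the serialized engine (placeholder path).
engine_bytes = builder.build_serialized_network(network, config)
with open("model.engine", "wb") as f:
    f.write(engine_bytes)
```

Depending on the version and the model, the graph may also need an explicit cast from uint8 to the compute precision right after the input.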
Good job to the developers!
Hello,
Thanks for the great work,
I am currently having trouble serving TensorRT models that take very large image inputs and return very large image outputs in Triton Inference Server, because I need to send them as Float16.
Ideally I would like to send them as UINT8 as that is what my image decoder outputs.
The only workaround I have found is to have an INT8 input / output and do the conversion to Float16 inside the model.
To account for the UINT8 -> INT8 sign bit I need to run a conversion on the input, and the inverse conversion on the output (see the sketch below). But this is not compatible with TensorRT due to an error in ScatterND.
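As a hedged illustration of the conversion in question (not the exact code from the report), assuming a PyTorch model exported to ONNX: a masked in-place assignment such as `x[x < 0] += 256` is the pattern that typically exports to ScatterND, while an element-wise `torch.where` expresses the same reinterpretation:

```python
import torch

def int8_to_fp16(x_int8: torch.Tensor) -> torch.Tensor:
    # UINT8 pixels transported through an INT8 binding: values >= 128
    # arrive as negative INT8, so add 256 back after casting to FP16.
    x = x_int8.to(torch.float16)
    return torch.where(x < 0, x + 256.0, x)

def fp16_to_int8(y_fp16: torch.Tensor) -> torch.Tensor:
    # Inverse mapping: wrap 128..255 back into the negative INT8 range
    # before handing the tensor to the INT8 output binding.
    y = torch.clamp(torch.round(y_fp16), 0, 255)
    return torch.where(y > 127, y - 256.0, y).to(torch.int8)
```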
We've found a way around this by implementing a CUDA plugin that does these transformations for us; however, we think it would make sense to support UINT8 inputs and outputs, since that is what image decoders usually output.
It would also make sense to support casting UINT8 to FP16 and FP16 back to UINT8.
On a side note, it would be great to be able to use int8 for inputs / outputs only, without expanding the set of available tactics to int8-based tactics.