Simple fixes for ONNX export of MobileBERT.
Before the fix, a MatMul in the embedding section of MobileBERT was not being converted to MatMulInteger, even though its inputs are quantized.
In short, the DequantizeLinear node that is part of the embedding quantization must be propagated down through a few Slice and Concat nodes so that it sits directly next to the MatMul node. This allows the pattern matcher to convert that MatMul to MatMulInteger. A sketch of why this propagation is legal follows below.
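To illustrate why the propagation is safe, here is a minimal NumPy sketch (not the actual exporter code): with per-tensor quantization, DequantizeLinear commutes with data-movement ops such as Slice (and Concat), which is what allows the dequant node to be pushed down next to the MatMul.

```python
import numpy as np

# Per-tensor quantized embedding weights (int8, with scale and zero-point)
q = np.random.randint(-128, 128, size=(8, 4)).astype(np.int8)
scale = np.float32(0.05)
zero_point = np.int8(3)

def dequantize(x):
    # Same math as an ONNX DequantizeLinear node with per-tensor parameters
    return (x.astype(np.int32) - int(zero_point)).astype(np.float32) * scale

# DequantizeLinear followed by Slice ...
a = dequantize(q)[2:5, :]
# ... equals Slice followed by DequantizeLinear, so the dequant node can be
# propagated below Slice (and Concat) until it sits next to the MatMul,
# where the MatMul -> MatMulInteger pattern match can fire
b = dequantize(q[2:5, :])
assert np.allclose(a, b)
```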
This propagation was already present in the ONNX export logic, but a data-type check on the embedding weights expected uint8, while the weights are defined as int8 (the conversion to uint8 happens at a later step). This PR adds logic to support int8 weights and also accounts for a non-zero zero-point.
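A minimal sketch of the dtype handling (the helper name is hypothetical, not the SparseML API): int8 quantized weights can be mapped to the uint8 representation the existing check expects by shifting both the values and the zero-point by 128, which leaves the effective integer values, and hence the dequantized weights, unchanged.

```python
import numpy as np

def to_uint8_quant(weight: np.ndarray, zero_point: np.ndarray):
    """Hypothetical helper: accept int8 or uint8 quantized weights and
    return the uint8 equivalent.

    Shifting int8 values and the zero-point by +128 preserves
    (weight - zero_point), so dequantized values are unchanged.
    """
    if weight.dtype == np.uint8:
        return weight, zero_point
    if weight.dtype == np.int8:
        return (
            (weight.astype(np.int16) + 128).astype(np.uint8),
            (zero_point.astype(np.int16) + 128).astype(np.uint8),
        )
    raise ValueError(f"unsupported quantized dtype {weight.dtype}")

# (w - zp) is identical before and after the shift, so MatMulInteger sees
# the same effective integer values
w = np.array([[-128, -1, 0, 127]], dtype=np.int8)
zp = np.array(-3, dtype=np.int8)
w_u8, zp_u8 = to_uint8_quant(w, zp)
assert np.array_equal(
    w.astype(np.int32) - int(zp), w_u8.astype(np.int32) - int(zp_u8)
)
```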
Testing plan:
Ran the ONNX export for the model below and checked that the MatMul is converted to MatMulInteger. Verified that deepsparse now supports 99.35% of ops. Accuracy matches the value reported in the zoo.
zoo:nlp/question_answering/mobilebert-none/pytorch/huggingface/squad/14layer_pruned50_quant-none-vnni
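One way to spot-check the exported graph (a sketch using the onnx package; the model path is a placeholder for wherever the exported file lands):

```python
from collections import Counter

import onnx

# Placeholder path; point this at the exported ONNX file
model = onnx.load("model.onnx")
op_counts = Counter(node.op_type for node in model.graph.node)

# The embedding MatMul should now show up as MatMulInteger
print(op_counts.get("MatMul", 0), "MatMul /",
      op_counts.get("MatMulInteger", 0), "MatMulInteger")
assert op_counts.get("MatMulInteger", 0) > 0
```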