
[FQ2I] Support converting dense -> add to qnn.dense -> add -> requantize #13578

Merged. 12 commits were merged into apache:main from the fq2i-dense-add-fix branch on Dec 9, 2022.

Conversation

@masahi (Member) commented on Dec 7, 2022

Closes #13545

The pattern dense -> add, where the add is really a bias addition, often appears as the result of converting the ONNX Gemm op:

out = out + _expr.const(beta, dtype=dtype) * inputs[2]
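For illustration, here is a minimal standalone sketch of the Relay pattern this conversion produces: nn.dense followed by a plain add of the bias. This is not the converter code itself; the shapes and variable names are made up.

```python
import numpy as np
import tvm
from tvm import relay

# Hypothetical shapes: (1, 64) input, (16, 64) weight, length-16 bias.
data = relay.var("data", shape=(1, 64), dtype="float32")
weight = relay.const(np.random.rand(16, 64).astype("float32"))
bias = relay.const(np.random.rand(16).astype("float32"))

out = relay.nn.dense(data, weight)  # Gemm's A * B^T
out = relay.add(out, bias)          # bias addition expressed as a plain add, not nn.bias_add
mod = tvm.IRModule.from_expr(relay.Function([data], out))
print(mod)
```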

Currently, FQ2I tries to convert this add to qnn.add. But when this add is used for bias addition, the out_t.scale and out_t.zero_point variables in fake_quantization_to_integer.py, which initialize the output scale and zero point of the QNN binary operators, can be tensors rather than scalars. QNN binary operators do not support such output qparams, which led to the error reported in #13545.

For this reason, FQ2I has apparently never supported converting dense -> add to qnn.dense -> add -> requantize when the add is a bias add. The pattern dense -> nn.bias_add can be converted to qnn.dense -> nn.bias_add -> requantize, but we never use nn.bias_add after dense.

So I added a code path in the FQ2I QNN binary op converter to identify such patterns and use regular binary ops rather than QNN ones, roughly as sketched below.
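The sketch below is a simplified illustration of the idea, not the actual code in fake_quantization_to_integer.py; the helper name and the affine-type arguments are assumptions.

```python
from tvm import relay


def lower_add(lhs, rhs, lhs_t, rhs_t, out_t):
    """Illustrative only: lhs_t/rhs_t/out_t stand in for the affine types
    (carrying .scale and .zero_point) that FQ2I tracks for each tensor.
    Assumes type inference has already run so checked_type is populated."""
    if lhs.checked_type.dtype == "int32" and rhs.checked_type.dtype == "int32":
        # Bias addition after a quantized dense: both operands already live in
        # the dense output's affine space, so a regular add is valid, and we
        # avoid handing qnn.add tensor-valued (per-channel) output qparams.
        # A later requantize maps the int32 result back to the output qparams.
        return relay.add(lhs, rhs)
    # General case: use the QNN binary op, which expects scalar output qparams.
    return relay.qnn.op.add(
        lhs, rhs,
        lhs_t.scale, lhs_t.zero_point,
        rhs_t.scale, rhs_t.zero_point,
        out_t.scale, out_t.zero_point,
    )
```

As the commit list below indicates, the actual change also checks the output zero point and restricts this fast path to 32-bit adds (i.e. bias addition).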

cc @AndrewZhaoLuo @Icemist @elvin-n

@tvm-bot (Collaborator) commented on Dec 7, 2022

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

  • No users to tag found in teams: fq2i. See #10317 for details.

Generated by tvm-bot

@Icemist (Contributor) left a comment:

LGTM, a little code remark.

python/tvm/relay/frontend/onnx.py (review thread, resolved)
@@ -1391,7 +1391,7 @@ def _impl_v1(cls, inputs, attr, params):
     dtype = input0_state.checked_type.dtype
     # Y = alpha * A * B + beta * C
     alpha = float(attr.get("alpha", 1.0))
-    beta = float(attr.get("beta", 1.0))
+    beta = float(attr.get("beta"))
Contributor commented:

I would keep the original line .get('beta', 1.0), since you cannot call float() on None, which attr.get can return.

Then below, on L1409, you can just replace if beta is None with if 'beta' not in attr.keys(), or something along those lines.

Contributor commented:

Though the change on L1409 might not be needed, since if beta == 1 the multiply can be removed by constant folding.

@masahi (Member, Author) commented:

Constant folding doesn't work when beta is multiplying an output of QNN ops, since we cannot fold over them. The model in #13545 has multiply(1f, dequantize(bias)) after dense, which was also causing some issues.
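To illustrate the point, here is a minimal sketch (shapes and qparams are made up): FoldConstant only evaluates fully constant subexpressions, so a multiply whose other operand flows out of a QNN op is left in place even when the constant factor is 1.0.

```python
import tvm
from tvm import relay

x = relay.var("x", shape=(16,), dtype="int8")
deq = relay.qnn.op.dequantize(x, relay.const(0.1, "float32"), relay.const(0, "int32"))
out = relay.multiply(relay.const(1.0, "float32"), deq)

mod = tvm.IRModule.from_expr(relay.Function([x], out))
mod = relay.transform.InferType()(mod)
mod = relay.transform.FoldConstant()(mod)
print(mod)  # the multiply by 1.0 is still there: its other operand is not a constant
```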

@masahi (Member, Author) commented:

Moved float(beta) to the else block of if beta is None.
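For context, a rough and hypothetical sketch of what that intermediate handling looks like (the real Gemm converter in onnx.py has more surrounding logic, and a later commit restores the original handling, as noted below):

```python
from tvm import relay


def apply_beta(out, c, attr, dtype):
    """Hypothetical, simplified excerpt of the Gemm converter's beta handling."""
    beta = attr.get("beta")  # None when the attribute is absent, i.e. beta defaults to 1.0
    if beta is None:
        # Skip the multiply entirely so no multiply-by-1 reaches FQ2I.
        return relay.add(out, c)
    # float(beta) only in the else branch, so float() is never called on None.
    return relay.add(out, relay.multiply(relay.const(float(beta), dtype=dtype), c))
```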

@masahi (Member, Author) commented on Dec 8, 2022:

Actually, the whole purpose of this change was to avoid multiplying by 1.0, since multiply(1f, dequantize(bias)) would be converted to qnn.mul(quantize(1), bias) by FQ2I. So I restored the original code. cc @Icemist

An alternative would be to add algebraic simplification to the SimplifyExpr pass.

python/tvm/relay/transform/fake_quantization_to_integer.py (review thread, outdated, resolved)
@masahi force-pushed the fq2i-dense-add-fix branch from d411b86 to da99fa5 on December 8, 2022 at 21:55
@AndrewZhaoLuo (Contributor) left a comment:

LGTM! Thanks, though I will wait on @Icemist.

@masahi force-pushed the fq2i-dense-add-fix branch from bd6e2d2 to 9c8d26e on December 8, 2022 at 22:49
@masahi merged commit 02820ad into apache:main on Dec 9, 2022
fzi-peccia pushed a commit to fzi-peccia/tvm that referenced this pull request on Mar 27, 2023: … `requantize` (apache#13578)

* wip

* hack to convert size-1 scale and zp tensors to scalar

* fix to binary op fast path

* check output zp

* add assert

* add comment

* lint

* clean up beta handling

* use regular binary op only for 32 bit add (bias addition)

* do float(beta) when we know that beta is not None

* restore original beta handling code to avoid mul by 1

* add comment on overflow
mikeseven pushed a commit to mikeseven/tvm that referenced this pull request on Sep 27, 2023: … `requantize` (apache#13578)
Successfully merging this pull request may close these issues.

[Bug][FQ2I] Failed to run FakeQuantizationToInteger on QDQ ONNX model