"nodes-to-exclude" in quantization doesn't work #1418
Hi @marziye-A, I took a look, and I think the issue is that the names you pass in `nodes_to_exclude` have to match the node names in the exported ONNX graph exactly. The following works for me:

```python
import tempfile
from pathlib import Path

from onnx import load as onnx_load
from optimum.onnxruntime import ORTModelForAudioClassification, ORTQuantizer
from optimum.onnxruntime.configuration import (
    QuantFormat,
    QuantizationConfig,
    QuantizationMode,
    QuantType,
)

qconfig = QuantizationConfig(
    is_static=False,
    format=QuantFormat.QDQ,
    mode=QuantizationMode.IntegerOps,
    per_channel=True,
    weights_dtype=QuantType.QUInt8,
    nodes_to_exclude=["/wav2vec2/feature_extractor/conv_layers.0/conv/Conv"],  # <-- node name from the ONNX graph
)

with tempfile.TemporaryDirectory() as tmp_dir:
    output_dir = Path(tmp_dir)
    model = ORTModelForAudioClassification.from_pretrained(
        "hf-internal-testing/tiny-random-wav2vec2", export=True
    )
    quantizer = ORTQuantizer.from_pretrained(model)
    quantizer.quantize(
        save_dir=output_dir,
        quantization_config=qconfig,
    )
    # Inspect the quantized graph to confirm which nodes were (not) touched.
    quantized_model = onnx_load(output_dir.joinpath("model_quantized.onnx"))
    node_list = [node.name for node in quantized_model.graph.node]
    print(node_list)
```

I don't know what your exact use-case is, but if you want to exclude the …
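As far as I understand, the exclusion works by exact string comparison against each node's name, which a minimal pure-Python sketch can illustrate (the node names below are made up for illustration; no onnxruntime needed):

```python
# Hypothetical node names, mimicking what an exported wav2vec2 ONNX graph contains.
graph_nodes = [
    "/wav2vec2/feature_extractor/conv_layers.0/conv/Conv",
    "/wav2vec2/encoder/layers.0/attention/MatMul",
]
nodes_to_exclude = ["/wav2vec2/feature_extractor/conv_layers.0/conv/Conv"]

# A node is only skipped when its name matches an entry exactly,
# so a typo or a wrong prefix silently excludes nothing.
quantized = [name for name in graph_nodes if name not in nodes_to_exclude]
print(quantized)
```

This is why printing the graph's node names first, as in the snippet above, is the safest way to build the exclusion list.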
Hi @marziye-A, apologies for the late reply, and thank you @baskrahmer for the correct answer! I will improve the documentation in this regard. An alternative that makes it easier to exclude part of a model from quantization may be using …
Thank you very much for your answer!
As far as I know, you can add all nodes of the model.
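For example, a sketch of collecting every node name so the whole model is excluded; the graph object below is a stand-in for a real `onnx.load(path).graph`, and the node names are illustrative:

```python
from types import SimpleNamespace

# Stand-in for `onnx.load(path).graph` — on a real ModelProto, `graph.node`
# is the list of NodeProto objects, each with a `.name` attribute.
graph = SimpleNamespace(node=[
    SimpleNamespace(name="/encoder/layers.0/attention/MatMul"),
    SimpleNamespace(name="/encoder/layers.0/ffn/Add"),
])

# Passing every node name to nodes_to_exclude would skip quantization everywhere.
nodes_to_exclude = [node.name for node in graph.node]
print(nodes_to_exclude)
```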
Hi,
why doesn't "nodes-to-exclude" in quantization work?
My code is as below:
Does anyone know the reason?
Any help is really appreciated!