Feature request
Hi, we have developed a tool called onnxslim, which can slim down exported ONNX models.
pip install onnxslim
# bash
onnxslim raw_onnx_model slimmed_onnx_model --skip_fusion_patterns FusionGelu # older onnxruntime versions may not support the Gelu op
# python
import onnx
from onnxslim import slim
onnx_model = "your_onnx_model.onnx"
slimmed_model = slim(onnx_model)
onnx.save(slimmed_model, "slimmed_onnx_model.onnx")
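Before comparing speed, it may be worth confirming that the slimmed model still produces the same outputs as the original. The following is a minimal sketch of such a check, not part of onnxslim itself: the helper names `max_abs_diff`, `run_model`, and `outputs_match` are hypothetical, and the `1e-4` tolerance is an assumed default that may need adjusting per model.

```python
import numpy as np


def max_abs_diff(outputs_a, outputs_b):
    """Largest element-wise absolute difference across paired output arrays."""
    return max(
        float(np.max(np.abs(np.asarray(a, dtype=np.float64) - np.asarray(b, dtype=np.float64))))
        for a, b in zip(outputs_a, outputs_b)
    )


def run_model(model_path, feed):
    """Run an ONNX file once on CPU and return its outputs as a list of arrays."""
    # Imported lazily so the comparison helper above works without onnxruntime.
    import onnxruntime as ort

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    return session.run(None, feed)


def outputs_match(original_path, slimmed_path, feed, atol=1e-4):
    """True if both models agree on `feed` within `atol` (an assumed tolerance)."""
    diff = max_abs_diff(run_model(original_path, feed), run_model(slimmed_path, feed))
    return diff <= atol
```

Here `feed` is a dict mapping the model's input names to NumPy arrays. For the ViT example in this issue it would presumably look like `{"pixel_values": np.random.rand(1, 3, 224, 224).astype(np.float32)}`, though the input name is an assumption and should be read from the exported model.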
Motivation
I want to slim the ONNX models so that we get better performance. I tested the provided example with the script below, and after running onnxslim the average inference time dropped by about 3%.
import time
import requests
from PIL import Image
from optimum.onnxruntime import ORTModelForImageClassification
from transformers import AutoFeatureExtractor
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
preprocessor = AutoFeatureExtractor.from_pretrained("optimum/vit-base-patch16-224")
model = ORTModelForImageClassification.from_pretrained("optimum/vit-base-patch16-224")
inputs = preprocessor(images=image, return_tensors="pt")
warmup_runs = 5
actual_runs = 100
# Warm-up phase (not timed)
for _ in range(warmup_runs):
    outputs = model(**inputs)
# Actual timing phase
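# perf_counter is a monotonic, high-resolution clock, better suited to benchmarking than time.time()
start_time = time.perf_counter()
for _ in range(actual_runs):
    outputs = model(**inputs)
end_time = time.perf_counter()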
# Calculate average time per run
total_time = end_time - start_time
average_time_per_run = total_time / actual_runs
print("Average time per run: {:.6f} seconds".format(average_time_per_run))
logits = outputs.logits
With the slimmed model: Average time per run: 0.246237 seconds
With the original model: Average time per run: 0.253707 seconds
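The "about 3%" figure follows directly from these two averages. Since the gap is small, it can also help to look at the run-to-run spread and not only the mean. The sketch below does both; `relative_gain` and `benchmark` are hypothetical helper names introduced here for illustration, and the defaults mirror the warm-up and run counts used in the script above.

```python
import statistics
import time


def relative_gain(baseline_s, optimized_s):
    """Fraction of the baseline latency saved by the optimized model."""
    return (baseline_s - optimized_s) / baseline_s


def benchmark(fn, warmup_runs=5, actual_runs=100):
    """Time each call of `fn()`; return (mean, standard deviation) in seconds."""
    for _ in range(warmup_runs):
        fn()  # warm-up calls are not timed
    samples = []
    for _ in range(actual_runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.stdev(samples)


# Averages reported in this issue: original model first, slimmed model second.
gain = relative_gain(0.253707, 0.246237)
print(f"Relative latency reduction: {gain:.2%}")
```

With the script above, `benchmark(lambda: model(**inputs))` could be called once per model. If the standard deviation is of the same order as the difference between the two means, a 3% gap should be read with some caution.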
Your contribution
I can submit a PR and help slim the existing ONNX models.