Forked from microsoft/onnxruntime. Commit 9579b7a (parent 792817b), committed by Valery Chernov on Dec 14, 2021. 1 changed file with 102 additions and 0 deletions.

# Standalone TVM (STVM) Execution Provider

## Contents

- [Introduction](#introduction)
- [Build](#build)
- [Configuration options](#configuration-options)
- [Performance Tuning](#performance-tuning)
- [Samples](#samples)
- [Known issues](#known-issues)

## Introduction

The STVM Execution Provider uses Apache TVM to compile and run ONNX models inside ONNX Runtime.
## Build

There are two steps to build ONNX Runtime with the STVM EP: first build TVM, then build ONNX Runtime against it.
An important note: both TVM and ONNX Runtime with STVM use a Python API, so the corresponding Python packages should be reinstalled, or PYTHONPATH changed accordingly, for everything to work correctly.

### Prerequisites

First, TVM and its dependencies should be installed.

Install the TVM dependencies:
`apt-get install -y python3 python3-dev python3-pip python3-setuptools gcc libtinfo-dev zlib1g-dev build-essential cmake libedit-dev libxml2-dev llvm`
`pip3 install numpy decorator attrs`

Install TVM from the tvm_update folder:
```
cd onnxruntime/cmake/external/tvm_update/
mkdir build
cd ./build
cmake -DCMAKE_BUILD_TYPE=Release -DUSE_LLVM=ON ..    # use -DUSE_CUDA=ON instead for a CUDA build
make -j8
```
Add the TVM Python API to the environment:
```
export TVM_HOME=<path_to_msft_onnxrt>/onnxruntime/cmake/external/tvm_update
export PYTHONPATH=$TVM_HOME/python:${PYTHONPATH}
```
More detailed TVM installation instructions can be found [here](https://tvm.apache.org/docs/install/from_source.html).
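As a quick sanity check, the TVM Python API should now be importable (the real check is simply `import tvm` in a fresh shell). The small sketch below only illustrates the expected path layout; the example path is hypothetical and stands in for `<path_to_msft_onnxrt>`:

```python
import os

def tvm_python_dir(tvm_home):
    # TVM's Python package lives under $TVM_HOME/python,
    # which is what the PYTHONPATH export above prepends.
    return os.path.join(tvm_home, "python")

# Hypothetical example path:
print(tvm_python_dir("/home/user/onnxruntime/cmake/external/tvm_update"))
# /home/user/onnxruntime/cmake/external/tvm_update/python
```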

### Build ONNX Runtime with the STVM Execution Provider

Build ONNX Runtime:
```
./build.sh --config Release --enable_pybind --build_wheel --skip_tests --parallel --use_stvm --skip_onnx_tests
```
Install the Python API from this build in place of the standard onnxruntime package. This step can be skipped if the shell variable exports below are used instead:
```
pip3 uninstall onnxruntime onnxruntime-stvm -y
whl_path=$(find ./onnxruntime/build/Linux/Release/dist -name "*.whl")
python3 -m pip install $whl_path
```
Alternatively, the following can be used instead of installing the whl package:
```
export ORT_PYTHON_HOME=<path_to_msft_onnxrt>/onnxruntime/build/Linux/Release
export PYTHONPATH=$ORT_PYTHON_HOME:${PYTHONPATH}
```
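Either way, one can verify that the locally built package is picked up by checking which providers are registered. A minimal sketch, assuming the build above succeeded (the provider list passed in below is illustrative; a real check would use `onnxruntime.get_available_providers()`):

```python
def stvm_available(providers):
    # The STVM EP registers itself under this provider name.
    return "StvmExecutionProvider" in providers

# Real check (requires the local build above):
#   import onnxruntime
#   assert stvm_available(onnxruntime.get_available_providers())

# Illustrative provider lists:
print(stvm_available(["StvmExecutionProvider", "CPUExecutionProvider"]))  # True
print(stvm_available(["CPUExecutionProvider"]))                           # False
```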

## Configuration options

Model compilation by TVM inside ONNX Runtime can be controlled through provider options:
```
po = [dict(target=client_target,
           target_host=client_target_host,
           opt_level=client_opt_level,
           freeze_weights=freeze,
           tuning_file_path=client_tuning_logfile,
           input_names=input_names_str,
           input_shapes=input_shapes_str)]
stvm_session = onnxruntime.InferenceSession(model_path, providers=["StvmExecutionProvider"], provider_options=po)
```
- `target` and `target_host` are strings in TVM format (e.g. "llvm -mcpu=avx2")
- `opt_level` is the TVM optimization level; it is 3 by default
- `freeze_weights` means that all model weights are kept at compilation stage; otherwise they are transferred on every inference. True is the recommended value for best performance, and it is the default.
- `tuning_file_path` is the path to an AutoTVM tuning file, which holds target- and model-specific specializations for best performance.
- TVM supports models with fixed graphs only. If your model has unknown dimensions in its input shape (other than batch size), you can set fixed values through the `input_names` and `input_shapes` provider options. Due to the specifics of the provider options parser inside ONNX Runtime, these are strings with the following format:
```
input_names = "input_1 input_2"
input_shapes = "[1 3 224 224] [1 2]"
```
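A small helper can build these strings from an ordinary dict of input shapes. This is a hypothetical convenience function, not part of the ONNX Runtime API:

```python
def format_stvm_inputs(shapes):
    """Build the space-separated strings expected by the input_names
    and input_shapes provider options from a dict of input shapes."""
    input_names = " ".join(shapes)
    input_shapes = " ".join("[" + " ".join(str(d) for d in dims) + "]"
                            for dims in shapes.values())
    return input_names, input_shapes

names, dims = format_stvm_inputs({"input_1": [1, 3, 224, 224], "input_2": [1, 2]})
print(names)  # input_1 input_2
print(dims)   # [1 3 224 224] [1 2]
```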

## Performance Tuning

As mentioned above, a tuning log file can be used for best model performance. However, ONNX Runtime preprocesses the ONNX model before TVM receives it, so the model that was tuned may differ from the one TVM actually compiles. To avoid the ONNX Runtime preprocessing stage, disable graph optimizations via session options:
```
so = onnxruntime.SessionOptions()
so.graph_optimization_level = onnxruntime.GraphOptimizationLevel.ORT_DISABLE_ALL
stvm_session = onnxruntime.InferenceSession(model_path, sess_options=so, providers=["StvmExecutionProvider"], provider_options=po)
```

## Samples

A hyperlink to a Python notebook with ResNet50 inference is provided.

## Known issues

- At the moment, ONNX Runtime with the STVM EP can only be built on UNIX/Linux systems.
- TVM closely follows the set of ops supported by ONNX, but may not support some of the newest ones.
- There can be a compatibility issue between onnx and Google protobuf: the error `AttributeError: module 'google.protobuf.internal.containers' has no attribute 'MutableMapping'` can occur during `import onnx` in any Python script when protobuf version >= 3.19.0 is combined with onnx version <= 1.8.1. To resolve it, reinstall Google protobuf and onnx (separately or together, since the versions conflict):
```
pip3 uninstall onnx -y
pip3 install onnx==1.10.1
pip3 uninstall protobuf -y
pip3 install protobuf==3.19.1
```
Two compatible protobuf/onnx version pairs are: protobuf 3.17.3 with onnx 1.8.0, and protobuf 3.19.1 with onnx 1.10.1.
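The failing combination can be detected before `import onnx` ever runs. A minimal sketch, where the version boundaries follow the description above and the helper itself is hypothetical (in practice the version strings could come from `importlib.metadata.version`):

```python
def onnx_protobuf_compatible(onnx_version, protobuf_version):
    """Return False for the known bad pairing described above:
    onnx <= 1.8.x together with protobuf >= 3.19.x."""
    onnx_mm = tuple(int(x) for x in onnx_version.split(".")[:2])
    pb_mm = tuple(int(x) for x in protobuf_version.split(".")[:2])
    return not (onnx_mm <= (1, 8) and pb_mm >= (3, 19))

print(onnx_protobuf_compatible("1.8.1", "3.19.0"))   # False: the failing pair
print(onnx_protobuf_compatible("1.10.1", "3.19.1"))  # True
print(onnx_protobuf_compatible("1.8.0", "3.17.3"))   # True
```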