From a9fef30fd866f430dee452add9262df66e7ff629 Mon Sep 17 00:00:00 2001 From: AllentDan <41138331+AllentDan@users.noreply.github.com> Date: Mon, 7 Mar 2022 17:22:14 +0800 Subject: [PATCH] Torchscript support (#159) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * support torchscript * add nms * add torchscript configs and update deploy process and dump-info * typescript -> torchscript * add torchscript custom extension support * add ts custom ops again * support mmseg unet * [WIP] add optimizer for torchscript (#119) * add passes * add python api * Torchscript optimizer python api (#121) * add passes * add python api * use python api instead of executable * Merge Master, update optimizer (#151) * [Feature] add yolox ncnn (#29) * add yolox ncnn * add ncnn android performance of yolox * add ut * fix lint * fix None bugs for ncnn * test codecov * test codecov * add device * fix yapf * remove if-else for img shape * use channelshuffle optimize * change benchmark after channelshuffle * fix yapf * fix yapf * fuse continuous reshape * fix static shape deploy * fix code * drop pad * only static shape * fix static * fix docstring * Added mask overlay to output image, changed fprintf info messages to … (#55) * Added mask overlay to output image, changed fprintf info messages to stdout * Improved box filtering (filter area/score), make sure roi coordinates stay within bounds * clang-format * Support UNet in mmseg (#77) * Repeatdataset in train has no CLASSES & PALETTE * update result for unet * update docstring for mmdet * remove ppl for unet in docs * fix ort wrap about input type (#81) * Fix memleak (#86) * delete [] * fix build error when enble MMDEPLOY_ACTIVE_LEVEL * fix lint * [Doc] Nano benchmark and tutorial (#71) * add cls benchmark * add nano zh-cn benchmark and en tutorial * add device row * add doc path to index.rst * fix typo * [Fix] fix missing deploy_core (#80) * fix missing deploy_core * mv flag to demo * target link * [Docs] Fix links in Chinese doc (#84) * Fix docs in Chinese link * Fix links * Delete symbolic link and add links to html * delete files * Fix link * [Feature] Add docker files (#67) * add gpu and cpu dockerfile * fix lint * fix cpu docker and remove redundant * use pip instead * add build arg and readme * fix grammar * update readme * add chinese doc for dockerfile and add docker build to build.md * grammar * refine dockerfiles * add FAQs * update Dpplcv_DIR for SDK building * remove mmcls * add sdk demos * fix typo and lint * update FAQs * [Fix]fix check_env (#101) * fix check_env * update * Replace convert_syncbatchnorm in mmseg (#93) * replace convert_syncbatchnorm with revert_sync_batchnorm from mmcv * change logger * [Doc] Update FAQ for TensorRT (#96) * update FAQ * comment * [Docs]: Update doc for openvino installation (#102) * fix docs * fix docs * fix docs * fix mmcv version * fix docs * rm blank line * simplify non batch nms (#99) * [Enhacement] Allow test.py to save evaluation results (#108) * Add log file * Delete debug code * Rename logger * resolve comments * [Enhancement] Support mmocr v0.4+ (#115) * support mmocr v0.4+ * 0.4.0 -> 0.4.1 * fix onnxruntime wrapper for gpu inference (#123) * fix ncnn wrapper for ort-gpu * resolve comment * fix lint * Fix typo (#132) * lock mmcls version (#131) * [Enhancement] upgrade isort in pre-commit config (#141) * [Enhancement] upgrade isort in pre-commit config by refering to mmflow pr #87 * fix lint * remove .isort.cfg and put its known_third_party to setup.cfg * Fix ci for mmocr (#144) * fix mmocr unittests * remove useless * lock mmdet maximum version to 2.20 * pip install -U numpy * Fix capture_output (#125) Co-authored-by: hanrui1sensetime <83800577+hanrui1sensetime@users.noreply.github.com> Co-authored-by: Johannes L Co-authored-by: RunningLeon Co-authored-by: VVsssssk <88368822+VVsssssk@users.noreply.github.com> Co-authored-by: lvhan028 Co-authored-by: AllentDan <41138331+AllentDan@users.noreply.github.com> Co-authored-by: Yifan Zhou Co-authored-by: 杨培文 (Yang Peiwen) <915505626@qq.com> Co-authored-by: Semyon Bevzyuk * configs for all tasks * use torchvision roi align * remote unnecessary code * fix ut * fix ut * export * det dynamic * det dynamic * add ut * fix ut * add ut and docs * fix ut * skip torchscript ut if no ops available * add torchscript option to build.md * update benchmark and resolve comments * resolve conflicts * rename configs * fix mrcnn cuda test * remove useless * add version requirements to docs and comments to codes * enable empty image exporting for torchscript and accelerate ORT inference for MRCNN * rebase * update example for torchscript.md * update FAQs for torchscript.md * resolve comments * only use torchvision roi_align for torchscript * fix ut * use torchvision roi align when pool model is avg * resolve comments Co-authored-by: grimoire Co-authored-by: grimoire Co-authored-by: hanrui1sensetime <83800577+hanrui1sensetime@users.noreply.github.com> Co-authored-by: Johannes L Co-authored-by: RunningLeon Co-authored-by: VVsssssk <88368822+VVsssssk@users.noreply.github.com> Co-authored-by: lvhan028 Co-authored-by: Yifan Zhou Co-authored-by: 杨培文 (Yang Peiwen) <915505626@qq.com> Co-authored-by: Semyon Bevzyuk --- .../mmdet/models/roi_heads/single_level_roi_extractor.py | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/mmdeploy/codebase/mmdet/models/roi_heads/single_level_roi_extractor.py b/mmdeploy/codebase/mmdet/models/roi_heads/single_level_roi_extractor.py index f91ca48ad0..57a007443f 100644 --- a/mmdeploy/codebase/mmdet/models/roi_heads/single_level_roi_extractor.py +++ b/mmdeploy/codebase/mmdet/models/roi_heads/single_level_roi_extractor.py @@ -147,6 +147,11 @@ def single_roi_extractor__forward(ctx, device=target_lvls.device) target_lvls = torch.cat((_tmp, _tmp, target_lvls)) for i in range(num_levels): + # use the roi align in torhcvision to accelerate the inference + # roi_align in MMCV is same as torchvision when pool mode is 'avg' + if backend == Backend.TORCHSCRIPT or self.roi_layers[ + i].pool_mode == 'avg': + self.roi_layers[i].use_torchvision = True mask = target_lvls == i inds = mask.nonzero(as_tuple=False).squeeze(1) rois_t = rois[inds]