Commit
* support torchscript
* add nms
* add torchscript configs and update deploy process and dump-info
* typescript -> torchscript
* add torchscript custom extension support
* add ts custom ops again
* support mmseg unet
* [WIP] add optimizer for torchscript (#119)
* add passes
* add python api
* Torchscript optimizer python api (#121)
* add passes
* add python api
* use python api instead of executable
* Merge Master, update optimizer (#151)
* [Feature] add yolox ncnn (#29)
* add yolox ncnn
* add ncnn android performance of yolox
* add ut
* fix lint
* fix None bugs for ncnn
* test codecov
* test codecov
* add device
* fix yapf
* remove if-else for img shape
* use channelshuffle optimize
* change benchmark after channelshuffle
* fix yapf
* fix yapf
* fuse continuous reshape
* fix static shape deploy
* fix code
* drop pad
* only static shape
* fix static
* fix docstring
* Added mask overlay to output image, changed fprintf info messages to … (#55)
* Added mask overlay to output image, changed fprintf info messages to stdout
* Improved box filtering (filter area/score), make sure roi coordinates stay within bounds
* clang-format
* Support UNet in mmseg (#77)
* Repeatdataset in train has no CLASSES & PALETTE
* update result for unet
* update docstring for mmdet
* remove ppl for unet in docs
* fix ort wrap about input type (#81)
* Fix memleak (#86)
* delete []
* fix build error when enble MMDEPLOY_ACTIVE_LEVEL
* fix lint
* [Doc] Nano benchmark and tutorial (#71)
* add cls benchmark
* add nano zh-cn benchmark and en tutorial
* add device row
* add doc path to index.rst
* fix typo
* [Fix] fix missing deploy_core (#80)
* fix missing deploy_core
* mv flag to demo
* target link
* [Docs] Fix links in Chinese doc (#84)
* Fix docs in Chinese link
* Fix links
* Delete symbolic link and add links to html
* delete files
* Fix link
* [Feature] Add docker files (#67)
* add gpu and cpu dockerfile
* fix lint
* fix cpu docker and remove redundant
* use pip instead
* add build arg and readme
* fix grammar
* update readme
* add chinese doc for dockerfile and add docker build to build.md
* grammar
* refine dockerfiles
* add FAQs
* update Dpplcv_DIR for SDK building
* remove mmcls
* add sdk demos
* fix typo and lint
* update FAQs
* [Fix]fix check_env (#101)
* fix check_env
* update
* Replace convert_syncbatchnorm in mmseg (#93)
* replace convert_syncbatchnorm with revert_sync_batchnorm from mmcv
* change logger
* [Doc] Update FAQ for TensorRT (#96)
* update FAQ
* comment
* [Docs]: Update doc for openvino installation (#102)
* fix docs
* fix docs
* fix docs
* fix mmcv version
* fix docs
* rm blank line
* simplify non batch nms (#99)
* [Enhacement] Allow test.py to save evaluation results (#108)
* Add log file
* Delete debug code
* Rename logger
* resolve comments
* [Enhancement] Support mmocr v0.4+ (#115)
* support mmocr v0.4+
* 0.4.0 -> 0.4.1
* fix onnxruntime wrapper for gpu inference (#123)
* fix ncnn wrapper for ort-gpu
* resolve comment
* fix lint
* Fix typo (#132)
* lock mmcls version (#131)
* [Enhancement] upgrade isort in pre-commit config (#141)
* [Enhancement] upgrade isort in pre-commit config by refering to mmflow pr #87
* fix lint
* remove .isort.cfg and put its known_third_party to setup.cfg
* Fix ci for mmocr (#144)
* fix mmocr unittests
* remove useless
* lock mmdet maximum version to 2.20
* pip install -U numpy
* Fix capture_output (#125)

Co-authored-by: hanrui1sensetime <[email protected]>
Co-authored-by: Johannes L <[email protected]>
Co-authored-by: RunningLeon <[email protected]>
Co-authored-by: VVsssssk <[email protected]>
Co-authored-by: lvhan028 <[email protected]>
Co-authored-by: AllentDan <[email protected]>
Co-authored-by: Yifan Zhou <[email protected]>
Co-authored-by: 杨培文 (Yang Peiwen) <[email protected]>
Co-authored-by: Semyon Bevzyuk <[email protected]>

* configs for all tasks
* use torchvision roi align
* remote unnecessary code
* fix ut
* fix ut
* export
* det dynamic
* det dynamic
* add ut
* fix ut
* add ut and docs
* fix ut
* skip torchscript ut if no ops available
* add torchscript option to build.md
* update benchmark and resolve comments
* resolve conflicts
* rename configs
* fix mrcnn cuda test
* remove useless
* add version requirements to docs and comments to codes
* enable empty image exporting for torchscript and accelerate ORT inference for MRCNN
* rebase
* update example for torchscript.md
* update FAQs for torchscript.md
* resolve comments
* only use torchvision roi_align for torchscript
* fix ut
* use torchvision roi align when pool model is avg
* resolve comments

Co-authored-by: grimoire <[email protected]>
Co-authored-by: grimoire <[email protected]>
Co-authored-by: hanrui1sensetime <[email protected]>
Co-authored-by: Johannes L <[email protected]>
Co-authored-by: RunningLeon <[email protected]>
Co-authored-by: VVsssssk <[email protected]>
Co-authored-by: lvhan028 <[email protected]>
Co-authored-by: Yifan Zhou <[email protected]>
Co-authored-by: 杨培文 (Yang Peiwen) <[email protected]>
Co-authored-by: Semyon Bevzyuk <[email protected]>
1 parent 636a97f, commit 1d1bab8
Showing 62 changed files with 1,582 additions and 245 deletions.
@@ -0,0 +1 @@
backend_config = dict(type='torchscript')
@@ -0,0 +1,6 @@
ir_config = dict(
    type='torchscript',
    save_file='end2end.pt',
    input_names=['input'],
    output_names=['output'],
    input_shape=None)
@@ -0,0 +1,6 @@
_base_ = [
    '../_base_/torchscript_config.py', '../_base_/backends/torchscript.py'
]

ir_config = dict(input_shape=None)
codebase_config = dict(type='mmcls', task='Classification')
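The task configs in this commit list base files in `_base_` and then set only the keys that differ, such as `ir_config` and `codebase_config`. The sketch below imitates that layering with plain dictionaries. `merge_config` is a hypothetical helper written for illustration, not the real config loader, and the merge rule (nested dicts merge key by key, later values win) is an assumption about how the inheritance behaves. The base values are copied from the two base fragments earlier in this diff, on the assumption that they are the files named in `_base_`.

```python
from copy import deepcopy


def merge_config(base: dict, override: dict) -> dict:
    """Recursively merge `override` into a copy of `base` (later keys win)."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge them key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars, lists, and new keys simply replace the old value.
            merged[key] = deepcopy(value)
    return merged


# Assumed contents of the two base files (values taken from this diff).
base_ir = dict(
    ir_config=dict(
        type='torchscript',
        save_file='end2end.pt',
        input_names=['input'],
        output_names=['output'],
        input_shape=None))
base_backend = dict(backend_config=dict(type='torchscript'))

# The classification config only states what differs from the bases.
child = dict(
    ir_config=dict(input_shape=None),
    codebase_config=dict(type='mmcls', task='Classification'))

config = merge_config(merge_config(base_ir, base_backend), child)
print(config['ir_config']['save_file'])
print(config['backend_config']['type'])
print(config['codebase_config']['task'])
```

Under this assumed rule the child keeps `save_file` and the input/output names from the base while adding its own `codebase_config`, which is why the task configs can stay only a few lines long.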
@@ -0,0 +1,4 @@
_base_ = ['./base_torchscript.py']

ir_config = dict(output_names=['dets', 'labels', 'masks'])
codebase_config = dict(post_processing=dict(export_postprocess_mask=False))
@@ -0,0 +1,16 @@
_base_ = ['../../_base_/torchscript_config.py']

ir_config = dict(output_names=['dets', 'labels'])
codebase_config = dict(
    type='mmdet',
    task='ObjectDetection',
    model_type='end2end',
    post_processing=dict(
        score_threshold=0.05,
        confidence_threshold=0.005,  # for YOLOv3
        iou_threshold=0.5,
        max_output_boxes_per_class=200,
        pre_top_k=5000,
        keep_top_k=100,
        background_label_id=-1,
    ))
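The `post_processing` block above sets the thresholds used when detections are filtered. As a rough illustration of how such parameters commonly interact, here is a minimal single-class greedy NMS in plain Python. The mapping of names to steps (score filter, `pre_top_k` truncation, IoU suppression, `keep_top_k` cap) is an assumption based on typical NMS pipelines and is not taken from this commit's implementation. `iou` and `nms_sketch` are hypothetical helpers.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms_sketch(boxes: Sequence[Box],
               scores: Sequence[float],
               score_threshold: float = 0.05,
               iou_threshold: float = 0.5,
               pre_top_k: int = 5000,
               keep_top_k: int = 100) -> List[int]:
    """Return indices of kept boxes, highest score first."""
    # 1. Drop low-confidence candidates.
    order = [i for i, s in enumerate(scores) if s > score_threshold]
    # 2. Sort by score and keep at most pre_top_k candidates.
    order.sort(key=lambda i: scores[i], reverse=True)
    order = order[:pre_top_k]
    # 3. Greedily keep boxes that do not overlap an already kept box too much.
    kept: List[int] = []
    for i in order:
        if len(kept) >= keep_top_k:
            break  # 4. Cap the number of final detections.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.7, 0.01]
kept = nms_sketch(boxes, scores)
print(kept)
```

In this toy input the second box overlaps the first with an IoU of about 0.68 and is suppressed, and the last box falls below the score threshold, so only the first and third boxes survive.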
@@ -0,0 +1,3 @@
_base_ = [
    '../_base_/base_torchscript.py', '../../_base_/backends/torchscript.py'
]
@@ -0,0 +1,4 @@
_base_ = [
    '../_base_/base_instance-seg_torchscript.py',
    '../../_base_/backends/torchscript.py'
]
configs/mmedit/super-resolution/super-resolution_torchscript.py
7 changes: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
_base_ = [
    '../../_base_/torchscript_config.py',
    '../../_base_/backends/torchscript.py'
]

ir_config = dict(input_shape=None)
codebase_config = dict(type='mmedit', task='SuperResolution')
@@ -0,0 +1,7 @@
_base_ = [
    '../../_base_/torchscript_config.py',
    '../../_base_/backends/torchscript.py'
]

ir_config = dict(input_shape=None)
codebase_config = dict(type='mmocr', task='TextDetection')
configs/mmocr/text-recognition/text-recognition_torchscript.py
7 changes: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
_base_ = [
    '../../_base_/torchscript_config.py',
    '../../_base_/backends/torchscript.py'
]

ir_config = dict(input_shape=None)
codebase_config = dict(type='mmocr', task='TextRecognition')
@@ -0,0 +1,6 @@
_base_ = [
    '../_base_/torchscript_config.py', '../_base_/backends/torchscript.py'
]

ir_config = dict(input_shape=None)
codebase_config = dict(type='mmseg', task='Segmentation')
csrc/backend_ops/common/modulated_deform_conv/common_cuda_helper.cuh
94 changes: 94 additions & 0 deletions
@@ -0,0 +1,94 @@
// Copyright (c) OpenMMLab. All rights reserved.
#ifndef COMMON_CUDA_HELPER
#define COMMON_CUDA_HELPER

#include <cublas_v2.h>
#include <cuda.h>

#include <algorithm>

#define CUDA_1D_KERNEL_LOOP(i, n) \
  for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < (n); i += blockDim.x * gridDim.x)

#define THREADS_PER_BLOCK 512

#define DIVUP(m, n) ((m) / (n) + ((m) % (n) > 0))
inline int GET_BLOCKS(const int N) {
  int optimal_block_num = DIVUP(N, THREADS_PER_BLOCK);
  int max_block_num = 4096;
  return std::min(optimal_block_num, max_block_num);
}

#define cudaCheckError() \
  { \
    cudaError_t e = cudaGetLastError(); \
    if (e != cudaSuccess) { \
      printf("Cuda failure %s:%d: '%s'\n", __FILE__, __LINE__, cudaGetErrorString(e)); \
      exit(0); \
    } \
  }

/**
 * Returns a view of the original tensor with its dimensions permuted.
 *
 * @param[out] dst pointer to the destination tensor
 * @param[in] src pointer to the source tensor
 * @param[in] src_size shape of the src tensor
 * @param[in] permute The desired ordering of dimensions
 * @param[in] src_dim dim of src tensor
 * @param[in] stream cuda stream handle
 */
template <class scalar_t>
void memcpyPermute(scalar_t* dst, const scalar_t* src, int* src_size, int* permute, int src_dim,
                   cudaStream_t stream = 0);

template <typename scalar_t>
cublasStatus_t cublasGemmWrap(cublasHandle_t handle, cublasOperation_t transa,
                              cublasOperation_t transb, int m, int n, int k, const scalar_t* alpha,
                              const scalar_t* A, int lda, const scalar_t* B, int ldb,
                              const scalar_t* beta, scalar_t* C, int ldc);

template <typename scalar_t>
__device__ scalar_t bilinear_interpolate(const scalar_t* input, const int height, const int width,
                                         scalar_t y, scalar_t x) {
  // deal with cases that inverse elements are out of feature map boundary
  if (y < -1.0 || y > height || x < -1.0 || x > width) return 0;

  if (y <= 0) y = 0;
  if (x <= 0) x = 0;

  int y_low = (int)y;
  int x_low = (int)x;
  int y_high;
  int x_high;

  if (y_low >= height - 1) {
    y_high = y_low = height - 1;
    y = (scalar_t)y_low;
  } else {
    y_high = y_low + 1;
  }

  if (x_low >= width - 1) {
    x_high = x_low = width - 1;
    x = (scalar_t)x_low;
  } else {
    x_high = x_low + 1;
  }

  scalar_t ly = y - y_low;
  scalar_t lx = x - x_low;
  scalar_t hy = 1. - ly, hx = 1. - lx;
  // do bilinear interpolation
  scalar_t v1 = input[y_low * width + x_low];
  scalar_t v2 = input[y_low * width + x_high];
  scalar_t v3 = input[y_high * width + x_low];
  scalar_t v4 = input[y_high * width + x_high];
  scalar_t w1 = hy * hx, w2 = hy * lx, w3 = ly * hx, w4 = ly * lx;

  scalar_t val = (w1 * v1 + w2 * v2 + w3 * v3 + w4 * v4);

  return val;
}

#endif  // COMMON_CUDA_HELPER
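The launch helpers in this header (`DIVUP`, `GET_BLOCKS`, `CUDA_1D_KERNEL_LOOP`) work together: the block count is a ceiling division capped at 4096, and the grid-stride loop lets each thread handle several elements when the cap applies. The following Python sketch mirrors that arithmetic on the host so it can be checked without a GPU. The function names are hypothetical ports written for illustration, and the small grid used in the demo is chosen for readability rather than taken from the source.

```python
THREADS_PER_BLOCK = 512
MAX_BLOCKS = 4096


def divup(m: int, n: int) -> int:
    """Ceiling division for non-negative m and positive n, like DIVUP."""
    return m // n + (1 if m % n > 0 else 0)


def get_blocks(n: int) -> int:
    """Number of blocks to launch for n elements, capped like GET_BLOCKS."""
    return min(divup(n, THREADS_PER_BLOCK), MAX_BLOCKS)


def grid_stride_indices(block: int, thread: int, n: int, grid_dim: int,
                        block_dim: int = THREADS_PER_BLOCK) -> range:
    """Indices a single (block, thread) pair visits in CUDA_1D_KERNEL_LOOP."""
    start = block * block_dim + thread
    stride = block_dim * grid_dim
    return range(start, n, stride)


# Tiny grid: 2 blocks of 3 threads covering 10 elements.
n, grid_dim, block_dim = 10, 2, 3
visited = sorted(
    i
    for block in range(grid_dim)
    for thread in range(block_dim)
    for i in grid_stride_indices(block, thread, n, grid_dim, block_dim))

print(get_blocks(513), get_blocks(10_000_000))
print(visited == list(range(n)))
```

With the cap at 4096 blocks of 512 threads, any input larger than 4096 * 512 = 2,097,152 elements depends on the stride loop for full coverage, which is the case the demo imitates at small scale.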
csrc/backend_ops/common/modulated_deform_conv/modulated_deform_conv_cpu.h
82 changes: 82 additions & 0 deletions
@@ -0,0 +1,82 @@
#include <cmath>
#include <cstdint>

template <typename T>
T bilinear_interpolate_2d(const T *src, const int64_t src_h, const int64_t src_w, const T h,
                          const T w) {
  if (h <= -1 || src_h <= h || w <= -1 || src_w <= w) {
    return 0;
  }

  int64_t h_low = floor(h);
  int64_t w_low = floor(w);
  int64_t h_high = h_low + 1;
  int64_t w_high = w_low + 1;

  T lh = h - h_low;
  T lw = w - w_low;
  T hh = 1 - lh;
  T hw = 1 - lw;

  T v1 = 0;
  if (h_low >= 0 && w_low >= 0) v1 = src[h_low * src_w + w_low];
  T v2 = 0;
  if (h_low >= 0 && w_high <= src_w - 1) v2 = src[h_low * src_w + w_high];
  T v3 = 0;
  if (h_high <= src_h - 1 && w_low >= 0) v3 = src[h_high * src_w + w_low];
  T v4 = 0;
  if (h_high <= src_h - 1 && w_high <= src_w - 1) v4 = src[h_high * src_w + w_high];

  T w1 = hh * hw, w2 = hh * lw, w3 = lh * hw, w4 = lh * lw;

  T val = (w1 * v1 + w2 * v2 + w3 * v3 + w4 * v4);
  return val;
}

// output: (channels * kernel_h * kernel_w, dst_h * dst_w)
template <typename T>
void deformable_im2col_2d(const T *input, const T *offset, const T *mask, const int64_t src_h,
                          const int64_t src_w, const int64_t kernel_h, const int64_t kernel_w,
                          const int64_t pad_h, const int64_t pad_w, const int64_t stride_h,
                          const int64_t stride_w, const int64_t dilation_h,
                          const int64_t dilation_w, const int64_t channels,
                          const int64_t offset_groups, const int64_t dst_h, const int64_t dst_w,
                          const bool use_mask, T *columns) {
  const int64_t workload = channels * dst_h * dst_w;
  for (int64_t index = 0; index != workload; ++index) {
    const int64_t ow = index % dst_w;
    const int64_t oh = (index / dst_w) % dst_h;
    const int64_t ic = index / (dst_w * dst_h);
    const int64_t oc = ic * kernel_h * kernel_w;

    int64_t c_per_offset_grp = channels / offset_groups;
    const int64_t grp_idx = ic / c_per_offset_grp;

    auto columns_ptr = columns + (oc * (dst_h * dst_w) + oh * dst_w + ow);
    auto input_ptr = input + ic * (src_h * src_w);
    auto offset_ptr = offset + grp_idx * 2 * kernel_h * kernel_w * dst_h * dst_w;
    auto mask_ptr = mask;
    if (use_mask) {
      mask_ptr += grp_idx * kernel_h * kernel_w * dst_h * dst_w;
    }

    for (int64_t kh = 0; kh < kernel_h; ++kh) {
      for (int64_t kw = 0; kw < kernel_w; ++kw) {
        const int64_t mask_idx = kh * kernel_w + kw;
        const int64_t offset_idx = 2 * mask_idx;

        T mask_value = 1;
        if (use_mask) {
          mask_value = mask_ptr[mask_idx * (dst_h * dst_w) + oh * dst_w + ow];
        }

        const T offset_h = offset_ptr[offset_idx * (dst_h * dst_w) + oh * dst_w + ow];
        const T offset_w = offset_ptr[(offset_idx + 1) * (dst_h * dst_w) + oh * dst_w + ow];
        const T ih = (oh * stride_h - pad_h) + kh * dilation_h + offset_h;
        const T iw = (ow * stride_w - pad_w) + kw * dilation_w + offset_w;
        *columns_ptr = mask_value * bilinear_interpolate_2d<T>(input_ptr, src_h, src_w, ih, iw);
        columns_ptr += dst_h * dst_w;
      }
    }
  }
}
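To make the sampling in `deformable_im2col_2d` easier to follow, the sketch below ports the CPU `bilinear_interpolate_2d` routine and the sampling-coordinate formula to Python. It is an illustrative translation rather than code from the commit. `sample_position` is a hypothetical helper name, and the inner `at` function folds the four per-corner bounds checks into one, which is equivalent only because the early return already guarantees `-1 < h < src_h` and `-1 < w < src_w`.

```python
import math
from typing import Sequence


def bilinear_interpolate_2d(src: Sequence[float], src_h: int, src_w: int,
                            h: float, w: float) -> float:
    """Sample a row-major src_h x src_w map at (h, w) with zero padding."""
    # Positions a full pixel or more outside the map contribute nothing.
    if h <= -1 or src_h <= h or w <= -1 or src_w <= w:
        return 0.0

    h_low, w_low = math.floor(h), math.floor(w)
    h_high, w_high = h_low + 1, w_low + 1
    lh, lw = h - h_low, w - w_low   # distance to the low corner
    hh, hw = 1 - lh, 1 - lw         # distance to the high corner

    def at(r: int, c: int) -> float:
        # Corners outside the map read as zero.
        if 0 <= r <= src_h - 1 and 0 <= c <= src_w - 1:
            return src[r * src_w + c]
        return 0.0

    return (hh * hw * at(h_low, w_low) + hh * lw * at(h_low, w_high) +
            lh * hw * at(h_high, w_low) + lh * lw * at(h_high, w_high))


def sample_position(out_idx: int, stride: int, pad: int, k_idx: int,
                    dilation: int, offset: float) -> float:
    """Input coordinate for one kernel tap, as computed for ih and iw."""
    return (out_idx * stride - pad) + k_idx * dilation + offset


feature = [1.0, 2.0,
           3.0, 4.0]  # 2 x 2 map, row-major

center = bilinear_interpolate_2d(feature, 2, 2, 0.5, 0.5)
above = bilinear_interpolate_2d(feature, 2, 2, -0.5, 0.0)
outside = bilinear_interpolate_2d(feature, 2, 2, 2.0, 0.0)
ih = sample_position(out_idx=2, stride=1, pad=1, k_idx=0, dilation=1, offset=0.25)
print(center, above, outside, ih)
```

The half-pixel case above the map blends the top-left value with an implicit zero, which shows the zero-padding behaviour of the CPU routine. The CUDA `bilinear_interpolate` in the helper header instead clamps coordinates to the border, so the two are not guaranteed to agree near the edges.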