Merge branch 'develop' into resnext-atss
jaegukhyun authored Jul 7, 2023
2 parents f54dd48 + 7027132 commit 761806d
Showing 206 changed files with 4,544 additions and 1,961 deletions.
5 changes: 3 additions & 2 deletions CHANGELOG.md
@@ -12,7 +12,7 @@ All notable changes to this project will be documented in this file.
- Add per-class XAI saliency maps for Mask R-CNN model (https://github.com/openvinotoolkit/training_extensions/pull/2227)
- Add new object detector Deformable DETR (<https://github.com/openvinotoolkit/training_extensions/pull/2249>)
- Add new object detector DINO(<https://github.com/openvinotoolkit/training_extensions/pull/2266>)
- Add new visual prompting task (https://github.com/openvinotoolkit/training_extensions/pull/2203)
- Add new visual prompting task (https://github.com/openvinotoolkit/training_extensions/pull/2203), (https://github.com/openvinotoolkit/training_extensions/pull/2274)
- Add new object detector ResNeXt101-ATSS (<https://github.com/openvinotoolkit/training_extensions/pull/2309>)

### Enhancements
@@ -22,6 +22,7 @@ All notable changes to this project will be documented in this file.
- Set persistent_workers and pin_memory as True in detection task (<https://github.com/openvinotoolkit/training_extensions/pull/2224>)
- New algorithm for Semi-SL semantic segmentation based on metric learning via class prototypes (https://github.com/openvinotoolkit/training_extensions/pull/2156)
- Self-SL for classification now can receive just folder with any images to start contrastive pretraining (https://github.com/openvinotoolkit/training_extensions/pull/2219)
- Update OpenVINO version to 2023.0, and NNCF version to 2.5 (<https://github.com/openvinotoolkit/training_extensions/pull/2090>)
- Improve XAI saliency map generation for tiling detection and tiling instance segmentation (https://github.com/openvinotoolkit/training_extensions/pull/2240)

### Bug fixes
@@ -31,7 +32,7 @@ All notable changes to this project will be documented in this file.

### Known issues

- OpenVINO(==2022.3) IR inference is not working well on 2-stage models (e.g. Mask-RCNN) exported from torch==1.13.1
- OpenVINO(==2023.0) IR inference is not working well on 2-stage models (e.g. Mask-RCNN) exported from torch==1.13.1

## \[v1.3.1\]

@@ -4,14 +4,14 @@ Models Optimization
OpenVINO™ Training Extensions provides two types of optimization algorithms: `Post-training Optimization Tool (POT) <https://docs.openvino.ai/latest/pot_introduction.html#doxid-pot-introduction>`_ and `Neural Network Compression Framework (NNCF) <https://github.com/openvinotoolkit/nncf>`_.

*******************************
Post-training Optimization Tool
Post-training Optimization Tool
*******************************

POT is designed to optimize the inference of models by applying post-training methods that do not require model retraining or fine-tuning. If you want to know more details about how POT works and to be more familiar with model optimization methods, please refer to `documentation <https://docs.openvino.ai/latest/pot_introduction.html#doxid-pot-introduction>`_.

To run Post-training optimization it is required to convert the model to OpenVINO™ intermediate representation (IR) first. To perform fast and accurate quantization we use ``DefaultQuantization Algorithm`` for each task. Please, see the `DefaultQuantization Parameters <https://docs.openvino.ai/latest/pot_compression_algorithms_quantization_default_README.html#doxid-pot-compression-algorithms-quantization-default-r-e-a-d-m-e>`_ for further information about configuring the optimization.

POT parameters can be found and configured in ``template.yaml`` and ``configuration.yaml`` for each task. For Anomaly and Semantic Segmentation tasks, we have separate configuration files for POT, that can be found in the same directory with ``template.yaml``, for example for `PaDiM <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/anomaly/configs/classification/padim/pot_optimization_config.json>`_, `OCR-Lite-HRNe-18-mod2 <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/segmentation/configs/ocr_lite_hrnet_18_mod2/pot_optimization_config.json>`_ model.
POT parameters can be found and configured in ``template.yaml`` and ``configuration.yaml`` for each task. For Anomaly and Semantic Segmentation tasks, we have separate configuration files for POT, that can be found in the same directory with ``template.yaml``, for example for `PaDiM <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/anomaly/configs/classification/padim/ptq_optimization_config.py>`_, `OCR-Lite-HRNe-18-mod2 <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/segmentation/configs/ocr_lite_hrnet_18_mod2/ptq_optimization_config.py>`_ model.

************************************
Neural Network Compression Framework
@@ -23,8 +23,8 @@ The process of optimization is controlled by the NNCF configuration file. A JSON
You can refer to configuration files for default templates for each task accordingly: `Classification <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/classification/configs/efficientnet_b0_cls_incr/compression_config.json>`_, `Object Detection <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/detection/configs/detection/mobilenetv2_atss/compression_config.json>`_, `Semantic segmentation <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/segmentation/configs/ocr_lite_hrnet_18_mod2/compression_config.json>`_, `Instance segmentation <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/detection/configs/instance_segmentation/efficientnetb2b_maskrcnn/compression_config.json>`_, `Anomaly classification <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/anomaly/configs/classification/padim/compression_config.json>`_, `Anomaly Detection <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/anomaly/configs/detection/padim/compression_config.json>`_, `Anomaly segmentation <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/anomaly/configs/segmentation/padim/compression_config.json>`_. Configs for other templates can be found in the same directory.


NNCF tends to provide better quality in terms of preserving accuracy as it uses training compression approaches.
Compression results achievable with the NNCF can be found `here <https://github.com/openvinotoolkit/nncf#nncf-compressed-model-zoo>`_ .
NNCF tends to provide better quality in terms of preserving accuracy as it uses training compression approaches.
Compression results achievable with the NNCF can be found `here <https://github.com/openvinotoolkit/nncf#nncf-compressed-model-zoo>`_ .
Meanwhile, the POT is faster but can degrade accuracy more than the training-enabled approach.

.. note::
@@ -58,15 +58,21 @@ Models

We support the following ready-to-use model templates:

+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+---------------------+-----------------+
| Template ID | Name | Complexity (GFLOPs) | Model size (MB) |
+================================================================================================================================================================================================================================================+============================+=====================+=================+
| `Custom_Counting_Instance_Segmentation_MaskRCNN_EfficientNetB2B <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/detection/configs/instance_segmentation/efficientnetb2b_maskrcnn/template.yaml>`_ | MaskRCNN-EfficientNetB2B | 68.48 | 13.27 |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+---------------------+-----------------+
| `Custom_Counting_Instance_Segmentation_MaskRCNN_ResNet50 <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/detection/configs/instance_segmentation/resnet50_maskrcnn/template.yaml>`_ | MaskRCNN-ResNet50 | 533.80 | 177.90 |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+---------------------+-----------------+
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+---------------------+-----------------+
| Template ID | Name | Complexity (GFLOPs) | Model size (MB) |
+============================================================================================================================================================================================================================================+============================+=====================+=================+
| `Custom_Counting_Instance_Segmentation_MaskRCNN_EfficientNetB2B <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/detection/configs/instance_segmentation/efficientnetb2b_maskrcnn/template.yaml>`_ | MaskRCNN-EfficientNetB2B | 68.48 | 13.27 |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+---------------------+-----------------+
| `Custom_Counting_Instance_Segmentation_MaskRCNN_ResNet50 <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/detection/configs/instance_segmentation/resnet50_maskrcnn/template.yaml>`_ | MaskRCNN-ResNet50 | 533.80 | 177.90 |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+---------------------+-----------------+
| `Custom_Counting_Instance_Segmentation_MaskRCNN_ConvNeXt <https://github.com/openvinotoolkit/training_extensions/blob/develop/src/otx/algorithms/detection/configs/instance_segmentation/convnext_maskrcnn/template.yaml>`_ | MaskRCNN-ConvNeXt | 266.78 | 192.4 |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+---------------------+-----------------+

``MaskRCNN-ResNet50`` uses `ResNet-50 <https://arxiv.org/abs/1512.03385>`_ as the backbone network for the image features extraction. It has more parameters and FLOPs and needs more time to train, meanwhile providing superior performance in terms of accuracy. ``MaskRCNN-EfficientNetB2B`` uses `EfficientNet-B2 <https://arxiv.org/abs/1905.11946>`_ as the backbone network. It is a good trade-off between accuracy and speed. It is a better choice when training time and computational cost are in priority.
MaskRCNN-ResNet50 utilizes the `ResNet-50 <https://arxiv.org/abs/1512.03385>`_ architecture as the backbone network for extracting image features. This choice of backbone network results in a higher number of parameters and FLOPs, which consequently requires more training time. However, the model offers superior performance in terms of accuracy.

On the other hand, MaskRCNN-EfficientNetB2B employs the `EfficientNet-B2 <https://arxiv.org/abs/1905.11946>`_ architecture as the backbone network. This selection strikes a balance between accuracy and speed, making it a preferable option when prioritizing training time and computational cost.

Recently, we have made updates to MaskRCNN-ConvNeXt, incorporating the `ConvNeXt backbone <https://arxiv.org/abs/2201.03545>`_. Through our experiments, we have observed that this variant achieves better accuracy compared to MaskRCNN-ResNet50 while utilizing less GPU memory. However, it is important to note that the training time and inference duration may slightly increase. If minimizing training time is a significant concern, we recommend considering a switch to MaskRCNN-EfficientNetB2B.

.. In the table below the `mAP <https://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient>`_ metric on some academic datasets using our :ref:`supervised pipeline <instance_segmentation_supervised_pipeline>` is presented. The results were obtained on our templates without any changes. We use 1024x1024 image resolution, for other hyperparameters, please, refer to the related template. We trained each model with single Nvidia GeForce RTX3090.
@@ -77,6 +83,8 @@ We support the following ready-to-use model templates:
.. +---------------------------+--------------+------------+-----------------+
.. | MaskRCNN-ResNet50 | N/A | N/A | N/A |
.. +---------------------------+--------------+------------+-----------------+
.. | MaskRCNN-ConvNeXt | N/A | N/A | N/A |
.. +---------------------------+--------------+------------+-----------------+
.. *******************
.. Tiling Pipeline
@@ -136,6 +136,7 @@ The list of supported templates for instance segmentation is available with the
+-----------------------+----------------------------------------------------------------+--------------------------+---------------------------------------------------------------------------------------------------+
| INSTANCE_SEGMENTATION | Custom_Counting_Instance_Segmentation_MaskRCNN_ResNet50 | MaskRCNN-ResNet50 | src/otx/algorithms/detection/configs/instance_segmentation/resnet50_maskrcnn/template.yaml |
| INSTANCE_SEGMENTATION | Custom_Counting_Instance_Segmentation_MaskRCNN_EfficientNetB2B | MaskRCNN-EfficientNetB2B | src/otx/algorithms/detection/configs/instance_segmentation/efficientnetb2b_maskrcnn/template.yaml |
| INSTANCE_SEGMENTATION | Custom_Counting_Instance_Segmentation_MaskRCNN_ConvNeXt | MaskRCNN-ConvNeXt | src/otx/algorithms/detection/configs/instance_segmentation/convnext_maskrcnn/template.yaml |
+-----------------------+----------------------------------------------------------------+--------------------------+---------------------------------------------------------------------------------------------------+
2. We need to create
2 changes: 1 addition & 1 deletion requirements/base.txt
@@ -4,7 +4,7 @@ natsort>=6.0.0
prettytable
protobuf>=3.20.0
pyyaml
datumaro==1.3.2
datumaro@ git+https://github.com/openvinotoolkit/datumaro@3e77b3138d063db68a4efba3c03a6bac7df086b1#egg=datumaro
psutil
scipy>=1.8
bayesian-optimization>=1.2.0
8 changes: 4 additions & 4 deletions requirements/openvino.txt
@@ -1,8 +1,8 @@
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
# OpenVINO Requirements. #
nncf==2.4.0
nncf==2.5.0
onnx==1.13.0
openmodelzoo-modelapi==2022.3.0
openvino==2022.3.0
openvino-dev==2022.3.0
openvino-model-api==0.1.2
openvino==2023.0
openvino-dev==2023.0
openvino-telemetry>=2022.1.0
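
After a version bump like the one in requirements/openvino.txt, it can be handy to check that a local environment actually matches the new pins. The following is only a minimal sketch, not part of this commit: `parse_pins`, `find_mismatches`, and `installed_version` are hypothetical helper names, and the version comparison is a simplification of PEP 440 that merely ignores trailing `.0` segments (so `2023.0` and `2023.0.0` compare equal).

```python
from importlib import metadata
from typing import Dict, Optional, Tuple

# Post-commit pins, copied from requirements/openvino.txt as shown in this diff.
NEW_OPENVINO_REQUIREMENTS = """\
# OpenVINO Requirements. #
nncf==2.5.0
onnx==1.13.0
openvino-model-api==0.1.2
openvino==2023.0
openvino-dev==2023.0
openvino-telemetry>=2022.1.0
"""


def parse_pins(requirements_text: str) -> Dict[str, str]:
    """Return {package: version} for every exact `name==version` pin."""
    pins: Dict[str, str] = {}
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # skip blanks and non-exact specifiers such as >=
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def _normalize(version: str) -> Tuple[str, ...]:
    """Simplified comparison key: drop trailing '.0' segments (assumption, not full PEP 440)."""
    parts = version.split(".")
    while len(parts) > 1 and parts[-1] == "0":
        parts.pop()
    return tuple(parts)


def find_mismatches(
    pins: Dict[str, str], installed: Dict[str, Optional[str]]
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {package: (pinned, found)} for packages that are missing or differ."""
    mismatches: Dict[str, Tuple[str, Optional[str]]] = {}
    for name, pinned in pins.items():
        found = installed.get(name)
        if found is None or _normalize(found) != _normalize(pinned):
            mismatches[name] = (pinned, found)
    return mismatches


def installed_version(name: str) -> Optional[str]:
    """Look up the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None
```

A possible use is `find_mismatches(pins, {name: installed_version(name) for name in pins})` with `pins = parse_pins(NEW_OPENVINO_REQUIREMENTS)`; an empty result would suggest the environment agrees with the exact pins, under the simplified comparison described above.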
1 change: 1 addition & 0 deletions src/otx/algorithms/action/adapters/mmaction/task.py
@@ -467,6 +467,7 @@ def dummy_dump_saliency_hook(model, inp, out):

    def _export_model(self, precision: ModelPrecision, export_format: ExportType, dump_features: bool):
        """Main export function."""
        self._data_cfg = None
        self._init_task(export=True)

        cfg = self.configure(False, "test", None)
9 changes: 4 additions & 5 deletions src/otx/algorithms/action/adapters/openvino/dataloader.py
@@ -18,13 +18,12 @@
from typing import Dict, List

import numpy as np
from compression.api import DataLoader

from otx.api.entities.annotation import AnnotationSceneEntity
from otx.api.entities.datasets import DatasetEntity, DatasetItemEntity


def get_ovdataloader(dataset: DatasetEntity, task_type: str, clip_len: int, width: int, height: int) -> DataLoader:
def get_ovdataloader(dataset: DatasetEntity, task_type: str, clip_len: int, width: int, height: int):
"""Find proper dataloader for dataset and task type.
If dataset has only a single video, this returns DataLoader for online demo
@@ -49,7 +48,7 @@ def _is_multi_video(dataset: DatasetEntity) -> bool:
return False


class ActionOVDemoDataLoader(DataLoader):
class ActionOVDemoDataLoader:
"""DataLoader for online demo purpose.
Since it is for online demo purpose it selects background frames from neighbor of key frame
@@ -91,7 +90,7 @@ def add_prediction(self, data: List[DatasetItemEntity], prediction: AnnotationSc
dataset_item.append_annotations(prediction.annotations)


class ActionOVClsDataLoader(DataLoader):
class ActionOVClsDataLoader:
"""DataLoader for evaluation of action classification models.
It iterates through clustered video, and it samples frames from given video
@@ -151,7 +150,7 @@ def add_prediction(self, dataset: DatasetEntity, data: List[DatasetItemEntity],
dataset_item.append_labels(prediction.annotations[0].get_labels())


class ActionOVDetDataLoader(DataLoader):
class ActionOVDetDataLoader:
"""DataLoader for evaluation of spatio-temporal action detection models.
It iterates through DatasetEntity, which only contains non-empty frame(frame with actor annotation)
@@ -19,22 +19,16 @@
from typing import Any, Dict, List

import numpy as np
from openvino.model_api.adapters import OpenvinoAdapter
from openvino.model_api.models.model import Model
from openvino.model_api.models.utils import (
    RESIZE_TYPES,
    Detection,
    InputTransform,
)

from otx.api.entities.datasets import DatasetItemEntity

try:
    from openvino.model_zoo.model_api.adapters import OpenvinoAdapter
    from openvino.model_zoo.model_api.models.model import Model
    from openvino.model_zoo.model_api.models.utils import (
        RESIZE_TYPES,
        Detection,
        InputTransform,
    )
except ImportError as e:
    import warnings

    warnings.warn(f"{e}, ModelAPI was not found.")


def softmax_numpy(x: np.ndarray):
    """Softmax numpy."""