Changelog
v2.14
- Update to use Dataflow Compiler v3.30.0 (`developer-zone <https://hailo.ai/developer-zone/>`_)
- Update to use HailoRT 4.20.0 (`developer-zone <https://hailo.ai/developer-zone/>`_)
- New cascade API (experimental)

  - Currently supports PETRv2, a bird's-eye-view network for 3D object detection; see ``petrv2_repvggB0.yaml`` for configuration.
  - Requires existing HARs/HEFs for both ``petrv2_repvggB0_backbone_pp_800x320`` and ``petrv2_repvggB0_transformer_pp_800x320``.
  - Full-precision evaluation: ``hailomz cascade eval petrv2``
  - Hardware evaluation: ``hailomz cascade eval petrv2 --override target=hardware``
- New task:

  - Human Action Recognition

    - Added support for the (partial) Kinetics-400 dataset
    - Added r3d_18 to support this task
- New Models:

  - `YOLOv11 <https://arxiv.org/pdf/2410.17725>`_ - nano, small, medium, large, x-large - Latest YOLO detectors
  - `CLIP <https://arxiv.org/pdf/2103.00020>`_ ViT-Large-14-Laion2B - Contrastive Language-Image Pre-training model [H15H and H10H only]
  - `SWIN <https://arxiv.org/pdf/2103.14030>`_ - tiny, small - Shifted-Windows Transformer based classification model
  - `DaViT <https://arxiv.org/pdf/2204.03645>`_ - tiny - Dual Attention Vision Transformer classification model [H15H and H10H only]
  - `LeViT <https://arxiv.org/pdf/2104.01136>`_ - levit128, levit192, levit384 - Transformer based classification model
  - `EfficientFormer <https://arxiv.org/pdf/2212.08059>`_ - l1 - Transformer based classification model
  - `Real-ESRGAN <https://arxiv.org/pdf/2107.10833>`_ - x2 - Super Resolution model
  - `R3D_18 <https://pytorch.org/vision/stable/models.html#video-classification>`_ - r3d_18 - Video Classification network for Human Action Recognition [H8 only]
- Bug fixes
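The two cascade evaluation commands listed for v2.14 differ only in the ``--override target=hardware`` suffix. As a minimal sketch, assuming ``hailomz`` is installed and on ``PATH``, a small wrapper can build both forms from one place. The helper names ``cascade_eval_command`` and ``run_cascade_eval`` are hypothetical and not part of the Model Zoo; only the command lines themselves come from the notes above.

```python
import shlex
import shutil
import subprocess


def cascade_eval_command(cascade_name, target=None):
    """Build the argv for a `hailomz cascade eval` call.

    With target=None this is the full-precision form from the changelog;
    target="hardware" appends `--override target=hardware`.
    """
    cmd = ["hailomz", "cascade", "eval", cascade_name]
    if target is not None:
        cmd += ["--override", f"target={target}"]
    return cmd


def run_cascade_eval(cascade_name, target=None, dry_run=True):
    """Print the command; run it only when dry_run is False."""
    cmd = cascade_eval_command(cascade_name, target)
    print(shlex.join(cmd))
    if dry_run:
        return None
    if shutil.which("hailomz") is None:
        raise RuntimeError("hailomz was not found on PATH")
    # check=True raises CalledProcessError on a non-zero exit code.
    return subprocess.run(cmd, check=True).returncode


if __name__ == "__main__":
    run_cascade_eval("petrv2")                     # full-precision evaluation
    run_cascade_eval("petrv2", target="hardware")  # evaluation on hardware
```

The dry-run default only prints the command, which is convenient for checking the invocation before the required HARs/HEFs are in place.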
v2.13
- Update to use Dataflow Compiler v3.29.0 (`developer-zone <https://hailo.ai/developer-zone/>`_)
- Update to use HailoRT 4.19.0 (`developer-zone <https://hailo.ai/developer-zone/>`_)
- Now using ``jit_compile``, which dramatically reduces emulation inference time for the Hailo Model Zoo models.
- New tasks:

  - BEV: Multi-View 3D Object Detection

    - Added support for NuScenes dataset
    - Added PETRv2 with the following configuration:

      - Backbone: RepVGG-B0 (800x320 input resolution)
      - Transformer: 3 decoder layers, detection queries=304, replaced LN with UN
- New Models:

  - `CAS-ViT <https://arxiv.org/pdf/2408.03703>`_ - S, M, T - Convolutional-Attention based classification model
  - `YOLOv10 <https://arxiv.org/pdf/2405.14458>`_ - base, x-large - Latest YOLO detectors
  - `CLIP <https://arxiv.org/pdf/2103.00020>`_ Text Encoders - ResNet50x4, ViT-Large
- New retraining Docker containers for:

  - PETR - Multi-View 3D Object Detection
- Introduced a new flag for the hailomz CLI:

  - ``--ap-per-class`` for measuring average precision per class. Relevant for object detection and instance segmentation tasks.
- Bug fixes
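For readers unfamiliar with the metric behind ``--ap-per-class`` in v2.13, the sketch below shows one common way to compute average precision separately for each class. It is illustrative only and is not the Model Zoo's implementation. It assumes each detection has already been marked as a true or false positive (for example by IoU matching upstream), and the function names ``average_precision`` and ``per_class_ap`` are hypothetical.

```python
def average_precision(scores, is_tp, num_gt):
    """Area under the precision-recall curve for a single class."""
    if num_gt == 0:
        return float("nan")  # AP is undefined without ground-truth objects
    # Rank detections from most to least confident.
    ranked = sorted(zip(scores, is_tp), key=lambda pair: pair[0], reverse=True)
    precisions, recalls = [], []
    tp = fp = 0
    for _, hit in ranked:
        if hit:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_gt)
    # Replace each precision with the best precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum precision weighted by each increase in recall.
    ap, prev_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def per_class_ap(detections, gt_counts):
    """detections: iterable of (class_name, score, is_tp).
    gt_counts: mapping of class_name to number of ground-truth objects."""
    by_class = {name: ([], []) for name in gt_counts}
    for name, score, hit in detections:
        scores, flags = by_class.setdefault(name, ([], []))
        scores.append(score)
        flags.append(hit)
    return {
        name: average_precision(scores, flags, gt_counts.get(name, 0))
        for name, (scores, flags) in by_class.items()
    }


if __name__ == "__main__":
    dets = [("car", 0.9, True), ("car", 0.8, False), ("car", 0.7, True),
            ("person", 0.6, True)]
    print(per_class_ap(dets, {"car": 2, "person": 1}))
```

Reporting the values per class rather than only their mean makes it easier to see which classes pull the overall score down.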