We pre-trained Mamba YOLO-T/M/L from scratch and evaluated them on the MSCOCO 2017 val set.
Model | Params | FLOPs | AP | AP50 | AP75 | APS | APM | APL
---|---|---|---|---|---|---|---|---
Mamba YOLO-T | 5.8M | 13.2G | 44.5 | 61.2 | 48.2 | 24.7 | 48.8 | 62.0
Mamba YOLO-M | 19.1M | 45.4G | 49.1 | 66.5 | 53.5 | 30.6 | 54.0 | 66.4
Mamba YOLO-L | 57.6M | 156.2G | 52.1 | 69.8 | 56.5 | 34.1 | 57.3 | 68.1
Mamba YOLO is developed with `torch==2.3.0`, `pytorch-cuda==12.1`, and CUDA 12.6.
```shell
git clone https://github.com/HZAI-ZJNU/Mamba-YOLO.git
cd Mamba-YOLO
conda create -n mambayolo -y python=3.11
conda activate mambayolo
pip3 install torch==2.3.0 torchvision torchaudio
pip install seaborn thop timm einops
cd selective_scan && pip install . && cd ..
pip install -v -e .
```
Make sure your dataset is structured as follows:
```
├── coco
│   ├── annotations
│   │   ├── instances_train2017.json
│   │   └── instances_val2017.json
│   ├── images
│   │   ├── train2017
│   │   └── val2017
│   └── labels
│       ├── train2017
│       └── val2017
```
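The `labels` directories hold Ultralytics-style YOLO annotation files, which store boxes as normalized `x_center y_center width height` rather than the absolute `[x_min, y_min, width, height]` used in the COCO JSON. A minimal sketch of that conversion (the `coco_to_yolo` helper is illustrative, not part of this repo):

```python
def coco_to_yolo(bbox, img_w, img_h):
    """Convert a COCO bbox [x_min, y_min, w, h] in pixels to
    normalized YOLO [x_center, y_center, w, h] in [0, 1]."""
    x_min, y_min, w, h = bbox
    return [
        (x_min + w / 2) / img_w,  # x_center, normalized by image width
        (y_min + h / 2) / img_h,  # y_center, normalized by image height
        w / img_w,                # box width, normalized
        h / img_h,                # box height, normalized
    ]

# Example: a 100x50 box at (200, 100) inside a 640x480 image
print(coco_to_yolo([200, 100, 100, 50], 640, 480))
```

Tools such as Ultralytics' own converters perform this step for you; the sketch just shows the arithmetic behind each line of a `labels/*.txt` file.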
```shell
python mbyolo_train.py --task train --data ultralytics/cfg/datasets/coco.yaml \
  --config ultralytics/cfg/models/mamba-yolo/Mamba-YOLO-T.yaml \
  --amp --project ./output_dir/mscoco --name mambayolo_n
```
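The `--data` flag points at a dataset YAML. As a rough orientation, an Ultralytics-style dataset config typically looks like the sketch below; the exact keys and paths in this repo's `coco.yaml` may differ, and the class list is truncated here for brevity:

```yaml
# Illustrative Ultralytics-style dataset config (not the repo's actual coco.yaml)
path: ../datasets/coco      # dataset root directory
train: images/train2017     # train images, relative to path
val: images/val2017         # val images, relative to path
names:                      # class index -> name (COCO has 80 classes; truncated)
  0: person
  1: bicycle
  2: car
```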
This repository is modified from the open-source real-time object detection codebase Ultralytics; the selective-scan implementation comes from VMamba.
If you find Mamba-YOLO useful in your research or applications, please consider giving us a star 🌟 and citing it:
@misc{wang2024mambayolossmsbasedyolo,
title={Mamba YOLO: SSMs-Based YOLO For Object Detection},
author={Zeyu Wang and Chen Li and Huiying Xu and Xinzhong Zhu},
year={2024},
eprint={2406.05835},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2406.05835},
}