Follow the dataset preparation process of UPT.
The downloaded files should be placed as follows. Otherwise, replace the default paths with your custom locations.
|- EZ-HOI
| |- hicodet
| | |- hico_20160224_det
| |   |- annotations
| |   |- images
| |- vcoco
| | |- mscoco2014
| |   |- train2014
| |   |- val2014
: :
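Before moving on, the layout above can be verified with a few lines of Python. This is a minimal sketch, not part of the repository: the helper name `missing_dataset_dirs` is hypothetical, and the check assumes the default locations shown in the tree, relative to the repository root. Only the directories listed explicitly are checked; the entries elided in the tree are not.

```python
from pathlib import Path

# Directories copied from the tree above, relative to the repository root.
EXPECTED_DIRS = [
    "hicodet/hico_20160224_det/annotations",
    "hicodet/hico_20160224_det/images",
    "vcoco/mscoco2014/train2014",
    "vcoco/mscoco2014/val2014",
]


def missing_dataset_dirs(root):
    """Return the expected directories that do not exist under `root`."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
```

Calling `missing_dataset_dirs(".")` from the repository root returns an empty list when every listed directory is in place.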
- Follow the environment setup in UPT.
- Follow the environment setup in ADA-CM.
- Run the Python script below to obtain the pre-extracted CLIP image features:
python CLIP_hicodet_extract.py
Make sure the paths to the annotation files and datasets are set correctly. The extracted features are expected under the following layout:
|- EZ-HOI
| |- hicodet_pkl_files
| | |- clip336_img_hicodet_test
| | |- clip336_img_hicodet_train
| | |- clipbase_img_hicodet_test
| | |- clipbase_img_hicodet_train
| |- vcoco_pkl_files
| | |- clip336_img_vcoco_train
| | |- clip336_img_vcoco_val
: :
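A quick sanity check after extraction is to count what ended up in each feature location. The sketch below assumes that every entry in the tree above is a directory holding the extracted files. That is an assumption, as is the helper name `count_feature_entries`, and nothing is assumed about the names of the files inside.

```python
from pathlib import Path

# Feature locations copied from the tree above, relative to the repository root.
FEATURE_DIRS = [
    "hicodet_pkl_files/clip336_img_hicodet_test",
    "hicodet_pkl_files/clip336_img_hicodet_train",
    "hicodet_pkl_files/clipbase_img_hicodet_test",
    "hicodet_pkl_files/clipbase_img_hicodet_train",
    "vcoco_pkl_files/clip336_img_vcoco_train",
    "vcoco_pkl_files/clip336_img_vcoco_val",
]


def count_feature_entries(root):
    """Map each expected directory to its entry count, or None if it is absent."""
    root = Path(root)
    counts = {}
    for rel in FEATURE_DIRS:
        directory = root / rel
        # Assumption: each location is a directory of extracted feature files.
        counts[rel] = sum(1 for _ in directory.iterdir()) if directory.is_dir() else None
    return counts
```

A `None` value or a count of zero for any location suggests that the extraction script did not finish or wrote to a different path.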
Train on HICO-DET:
bash scripts/hico_train_vitB_zs.sh
Test on HICO-DET:
bash scripts/hico_test_vitB_zs.sh
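If both steps should run unattended, the two scripts can be chained from Python so that testing only starts once training has exited successfully. This is a minimal sketch: the function name `run_train_then_test` and the injectable `runner` parameter are illustrative and not part of the repository, and the script paths assume the working directory is the repository root.

```python
import subprocess

# Script paths as given above, relative to the repository root.
TRAIN_SCRIPT = "scripts/hico_train_vitB_zs.sh"
TEST_SCRIPT = "scripts/hico_test_vitB_zs.sh"


def run_train_then_test(runner=subprocess.run):
    """Run the training script, then the test script, stopping on the first failure."""
    for script in (TRAIN_SCRIPT, TEST_SCRIPT):
        # With check=True, subprocess.run raises CalledProcessError
        # when the script exits with a non-zero status.
        runner(["bash", script], check=True)
```

Calling `run_train_then_test()` with the default runner executes both scripts in order. The `runner` argument exists only so the sequence can be exercised without launching a real training job.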
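Zero-shot results on HICO-DET (mAP in %). Settings: UV = unseen verb, RF = rare first, NF = non-rare first, UO = unseen object.

Dataset | Setting | Backbone | mAP | Unseen | Seen |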
---|---|---|---|---|---|
HICO-DET | UV | ResNet-50+ViT-B | 32.32 | 25.10 | 33.49 |
HICO-DET | UV | ResNet-50+ViT-L | 36.84 | 28.82 | 38.15 |
HICO-DET | RF | ResNet-50+ViT-B | 33.13 | 29.02 | 34.15 |
HICO-DET | RF | ResNet-50+ViT-L | 36.73 | 34.24 | 37.35 |
HICO-DET | NF | ResNet-50+ViT-B | 31.17 | 33.66 | 30.55 |
HICO-DET | NF | ResNet-50+ViT-L | 34.84 | 36.33 | 34.47 |
HICO-DET | UO | ResNet-50+ViT-B | 32.27 | 33.28 | 32.06 |
HICO-DET | UO | ResNet-50+ViT-L | 36.38 | 38.17 | 36.02 |

Dataset | Setting | Backbone | mAP | Rare | Non-rare |
---|---|---|---|---|---|
HICO-DET | default | ResNet-50+ViT-L | 38.61 | 37.70 | 38.89 |
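The three numbers in the default-setting row are related: the full mAP is the mean of per-class AP, so it equals the class-count-weighted average of the rare and non-rare mAP. The sketch below checks this using the standard HICO-DET split of 600 HOI classes into 138 rare and 462 non-rare categories. These counts come from the dataset definition rather than from this README, and the function name `full_map` is illustrative.

```python
# Category counts of the standard HICO-DET split (600 HOI classes in total).
NUM_RARE = 138
NUM_NON_RARE = 462


def full_map(rare_map, non_rare_map, n_rare=NUM_RARE, n_non_rare=NUM_NON_RARE):
    """Class-count-weighted mean of the rare and non-rare mAP values."""
    return (n_rare * rare_map + n_non_rare * non_rare_map) / (n_rare + n_non_rare)


# Rare and non-rare values taken from the table row above.
estimate = full_map(37.70, 38.89)
print(f"{estimate:.2f}")
```

The estimate is about 38.62, which agrees with the reported 38.61 to within the rounding of the two inputs.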
If you find our paper and/or code helpful, please consider citing:
@inproceedings{
lei2024efficient,
title={EZ-HOI: VLM Adaptation via Guided Prompt Learning for Zero-Shot HOI Detection},
author={Lei, Qinqian and Wang, Bo and Tan, Robby T.},
booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
year={2024}
}
We gratefully thank the authors of UPT and ADA-CM for open-sourcing their code.