This repository contains the code for the paper "Teaching VLMs to Localize Specific Objects from In-context Examples" (IPLoc) by Sivan Doveh et al.
Prepare the Qwen2-VL environment as described in the Qwen2VL Env instructions.
Download the images and place them in their respective folders.
The folder structure should look like this:

```
data/
├── ICL_tracking/
└── video/
    ├── frames/
    ├── per_seg/
    └── LASOT/
```
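The layout above can be created up front so the downloaded images land in the right place. A minimal sketch; the folder names are taken from the tree above and nothing else is assumed:

```python
import os

# Create the expected data layout before downloading images.
# Folder names come from the directory tree above.
for d in (
    "data/ICL_tracking",
    "data/video/frames",
    "data/video/per_seg",
    "data/video/LASOT",
):
    os.makedirs(d, exist_ok=True)  # no-op if the folder already exists
```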
Download our model from QWEN2-VL-ICL-LOC.
To evaluate the IPLoc model, use the following command:

```
python Loc_Qwen2VL7B.py --data_path ./data/test_data_path.json --name IPLocEval --lora_weights_path lora_pth_to_model
```
Test data JSON files are provided in the `data` directory, covering ICL-PDM, LASOT, and PerSeg.
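Before launching evaluation, it can help to sanity-check the JSON file you pass to `--data_path`. A minimal sketch, assuming only that the file parses as JSON; the per-sample schema is not documented here, so `count_samples` and its list/dict handling are illustrative:

```python
import json

def count_samples(path: str) -> int:
    """Return a rough sample count for a test-data JSON file.

    Assumes (illustratively) that the file is either a list of
    per-sample records or a dict keyed by sample id.
    """
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)
    if isinstance(data, (list, dict)):
        return len(data)
    raise ValueError(f"Unexpected JSON top-level type: {type(data).__name__}")
```

Run it on the same path you intend to pass to `--data_path` to confirm the file is readable and non-empty.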