Harnessing the Power of Large Vision Language Models for Synthetic Image Detection

This repository is the official implementation of the ICASSP 2024 paper "Harnessing the Power of Large Vision Language Models for Synthetic Image Detection".

☀️ If you find this work useful for your research, please star our repo and cite our paper! ☀️

Low Rank Adaptation

Requirements

pip install -r requirements.txt

SOTA Detection Methods

For the state-of-the-art detection methods we compare against, we use the code released with each method's corresponding paper.

Training (Optional)

This step is optional: you can skip it and directly test a pre-trained model as described in the Evaluation section.

To train your own model:

python blip2_detect.py --dataset ./data/train.csv --epochs 20 --lr 5e-5

Evaluation

To run the test on a specific dataset, use the following command:

python blip2_test.py --model_path ./weights/ldmFineTune --dataset ./data/test.csv

Or run the test on all the testing subsets:

sh evaluation.sh
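If you prefer to drive the per-subset runs from Python, the single-dataset command above can be repeated over a list of CSV files. The following is only a sketch, not a reimplementation of evaluation.sh (whose contents are not shown here). The subset file names are hypothetical placeholders, and only the `--model_path` and `--dataset` flags from the command above are used.

```python
import shlex

MODEL_PATH = "./weights/ldmFineTune"  # path taken from the command above

# Hypothetical subset files; the actual file names under ./data may differ.
SUBSET_CSVS = ["./data/ldm_test.csv", "./data/adm_test.csv"]


def build_test_command(model_path, dataset_csv):
    """Build the argument list for one blip2_test.py run."""
    return [
        "python", "blip2_test.py",
        "--model_path", model_path,
        "--dataset", dataset_csv,
    ]


commands = [build_test_command(MODEL_PATH, csv_path) for csv_path in SUBSET_CSVS]

for cmd in commands:
    # Print the command; nothing is executed in this sketch.
    print(shlex.join(cmd))
```

To actually launch each run, pass the argument list to `subprocess.run(cmd, check=True)` from the repository root.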

Performance

After training for 20 epochs, you should obtain accuracy/F1-score values (in %) close to those below:

{'LDM' : 99.12/99.13, 'ADM' : 85.24/82.97, 'DDPM' : 98.47/98.47, 'IDDPM' : 97.02/96.97, 'PNDM' : 99.22/99.23, 'SD v1.4' : 77.68/71.79, 'GLIDE' : 97.09/97.05}
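To make the accuracy/F1 pairs above concrete, here is a minimal sketch of how the two numbers can be computed from binary predictions. It assumes the "synthetic" class is the positive class (encoded as 1) and that F1 is the plain binary F1; the repository's own scripts may use a different label encoding or averaging scheme. The function name `accuracy_and_f1` is illustrative and not taken from the repository.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) in percent for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = 100.0 * correct / len(pairs)
    denom = 2 * tp + fp + fn
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when there are no positives at all.
    f1 = 100.0 * 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Toy data: 1 = synthetic, 0 = real (this label convention is an assumption).
labels = [1, 1, 1, 0, 0, 0, 1, 0]
preds = [1, 1, 0, 0, 0, 1, 1, 0]

acc, f1 = accuracy_and_f1(labels, preds)
print(f"{acc:.2f}/{f1:.2f}")  # same accuracy/F1 layout as the results above
```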

Dataset

The dataset used in this project comes from the paper "Towards the Detection of Diffusion Model Deepfakes", available at Link to Original Dataset Repository.

📖 Citation

If you make use of our work, please cite our papers:

@article{keita2024harnessing,
  title={Harnessing the Power of Large Vision Language Models for Synthetic Image Detection},
  author={Keita, Mamadou and Hamidouche, Wassim and Bougueffa, Hassen and Hadid, Abdenour and Taleb-Ahmed, Abdelmalik},
  journal={arXiv preprint arXiv:2404.02726},
  year={2024}
}

@article{keita2024bi,
  title={Bi-LORA: A Vision-Language Approach for Synthetic Image Detection},
  author={Keita, Mamadou and Hamidouche, Wassim and Eutamene, Hessen Bougueffa and Hadid, Abdenour and Taleb-Ahmed, Abdelmalik},
  journal={arXiv preprint arXiv:2404.01959},
  year={2024}
}

--- Thanks for your interest! ---
