Here we present our framework for deepfake general audio detection, which aims to identify whether an audio clip is genuine or deepfake and to localize the deepfake regions. Specifically, we:
- Propose the task of deepfake general audio detection and establish a benchmark for its evaluation.
- Design an audio manipulation pipeline that regenerates key regions, producing a large quantity of convincingly realistic deepfake general audio.
- Provide a dataset, FakeSound, for training and evaluating deepfake general audio detection models.
- Propose a deepfake detection model that outperforms state-of-the-art models from previous speech deepfake competitions as well as human listeners.
Install dependencies:

```shell
git clone https://github.com/FakeSoundData/FakeSound
conda install --yes --file requirements.txt
```
Install the pre-trained EAT model into the models/ directory:

```shell
cd models
mkdir EAT
cd EAT
git clone https://github.com/pytorch/fairseq
cd fairseq
pip install --editable ./
cd ..  # return to models/EAT before cloning the EAT repository
git clone https://github.com/cwx-worst-one/EAT
```
Due to copyright restrictions, we are unable to provide the original AudioCaps audio data; you can download the raw audio from AudioCaps. The manipulated audio can be downloaded from (1) HuggingfaceDataset or (2) FakeSound (extraction code: "fake").

We provide the results of the grounding model for key region detection. You can also reproduce the FakeSound dataset by regenerating key regions based on these grounding results, using the audio generation models AudioLDM/AudioLDM2 and the super-resolution model AudioSR.
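The regeneration step above amounts to splicing model-generated audio into the grounded key region of the original waveform. Below is a minimal sketch of that splice; the `generate_region` placeholder and the 16 kHz sample rate are illustrative assumptions, and in the actual pipeline the regenerated samples would come from AudioLDM/AudioLDM2 followed by AudioSR.

```python
# Sketch: replace a grounded key region [onset, offset) of a waveform
# with regenerated samples of equal length. The waveform is a plain
# list of float samples for illustration.

SAMPLE_RATE = 16000  # assumed sample rate


def generate_region(num_samples):
    """Placeholder for AudioLDM/AudioLDM2 + AudioSR; returns silence here."""
    return [0.0] * num_samples


def regenerate(waveform, onset_sec, offset_sec, sample_rate=SAMPLE_RATE):
    """Splice regenerated audio into [onset_sec, offset_sec) of the waveform."""
    start = int(onset_sec * sample_rate)
    end = int(offset_sec * sample_rate)
    fake_region = generate_region(end - start)
    return waveform[:start] + fake_region + waveform[end:]
```

Because the regenerated region has exactly the same length as the original, the output clip keeps its duration and the (onset, offset) annotation remains valid for training the detector.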
The metadata for the training and test sets is contained in the files "deepfake_data/{}.json", where
- the "audio_id" format is {AudioCaps_id}_{onset}_{offset} for manipulated clips or {AudioCaps_id} for genuine ones,
- the "label" is "0" for deepfake audio, with the reconstructed region indicated as "onset_offset".
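A metadata entry can be parsed as sketched below. This is an illustrative sketch, not the repository's own loader: it assumes the underscore delimiter described above, and splits from the right so that underscores inside the AudioCaps id itself are preserved.

```python
def parse_audio_id(audio_id):
    """Split "{AudioCaps_id}_{onset}_{offset}" into (id, onset, offset).

    Genuine clips use the bare AudioCaps id, so when no trailing
    onset/offset pair is found we return (audio_id, None, None).
    The underscore delimiter is an assumption based on the metadata
    format described in this README.
    """
    parts = audio_id.rsplit("_", 2)
    if len(parts) == 3:
        base, onset, offset = parts
        try:
            return base, float(onset), float(offset)
        except ValueError:
            pass  # trailing parts were not numeric: treat as a bare id
    return audio_id, None, None
```

For example, an id like "Yabc_123_1.5_3.0" would parse to the base id "Yabc_123" with the deepfake region spanning 1.5 s to 3.0 s.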
The training and testing scripts are train.py and inference.py, respectively. Modify the WORKSPACE_PATH inside them to match your own directory path.

```shell
python train.py --train_file FakeSound/meta_data/train.json
python inference.py
```
Our code builds on the DKU speech deepfake detection system, EAT, AudioLDM, and AudioLDM2. We appreciate the authors' open-sourcing of their code.