Please visit the challenge website for more information about the Challenge.
The anonymization and evaluation scripts should have generated the files and directories in the format described in this README, named with the $anon_data_suffix suffix.
For data submission, the following command submits everything, given a $anon_data_suffix argument:
VPC_DROPBOX_KEY=XXX VPC_DROPBOX_SECRET=YYY VPC_DROPBOX_REFRESHTOKEN=ZZZ VPC_TEAM=TEAM_NAME ./03_upload_submission.sh $anon_data_suffix
VPC_DROPBOX_KEY, VPC_DROPBOX_SECRET, VPC_DROPBOX_REFRESHTOKEN, and VPC_TEAM=TEAM_NAME are sent individually to each team upon receiving their system description.
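Since the upload command relies on four environment variables, it can help to confirm they are all set before launching it. The following is a minimal sketch, not part of the challenge scripts; the helper name missing_credentials is hypothetical, and only the variable names come from the command above.

```python
import os

# Variable names taken from the upload command above.
REQUIRED_VARS = (
    "VPC_DROPBOX_KEY",
    "VPC_DROPBOX_SECRET",
    "VPC_DROPBOX_REFRESHTOKEN",
    "VPC_TEAM",
)


def missing_credentials(env=None):
    """Return the required variables that are unset or empty.

    Hypothetical helper: pass a dict to test it, or nothing to
    inspect the current process environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling missing_credentials() with no argument inspects the current environment; an empty list means all four values are present.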
git clone https://github.com/Voice-Privacy-Challenge/Voice-Privacy-Challenge-2024.git
./00_install.sh
source env.sh
./01_download_data_model.sh
A password is required; please register to get the password.
You can modify the librispeech_corpus variable of ./01_download_data_model.sh to avoid downloading LibriSpeech 360.
You have to modify the iemocap_corpus variable of ./01_download_data_model.sh to point to where the IEMOCAP corpus is located on your server.
Important
You must obtain the IEMOCAP corpus yourself by submitting a request at https://sail.usc.edu/iemocap/iemocap_release.htm. The waiting time may be up to 7-9 days.
There are two options:
- Run anonymization and evaluation in a single step: ./02_run.sh configs/anon_mcadams.yaml
  For each anonymization baseline, there is a corresponding config file:
  - configs/anon_mcadams.yaml: a fast CPU-only signal processing-based system (default).
  - configs/anon_sttts.yaml: a system based on unmodified phone sequences, modified prosody, modified speaker embedding representations, and speech synthesis.
  - configs/anon_asrbn.yaml: a fast system based on vector-quantized acoustic bottleneck, pitch, and one-hot speaker representations and a HiFi-GAN speech synthesis model.
  - B1: the anonymization scripts from https://github.com/Voice-Privacy-Challenge/Voice-Privacy-Challenge-2022 can be used to obtain anonymized data for B1. To perform utterance-level (in contrast to speaker-level) anonymization of the enrollment and trial data for B1, set the corresponding parameters in https://github.com/Voice-Privacy-Challenge/Voice-Privacy-Challenge-2022/blob/master/baseline/config.sh: anon_level_trials=utt and anon_level_enroll=utt (https://github.com/Voice-Privacy-Challenge/Voice-Privacy-Challenge-2022/blob/d72b50c44677aa9a1ba37b7f0c383c4fde13e05f/baseline/config.sh#L59-L60).
- Run anonymization and evaluation separately in two steps:
python run_anonymization.py --config configs/anon_mcadams.yaml # Computational time varies from 30 minutes to 10 hours depending on the number of cores; other methods may take longer, depending on the available hardware.
The anonymized audio files will be saved in $data_dir=data, in 9 folders corresponding to the datasets.
The names of the created dataset folders for anonymized audio files are appended with the suffix, e.g. $anon_data_suffix=_mcadams:
data/libri_dev_enrolls${anon_data_suffix}/wav/*wav
data/libri_dev_trials_m${anon_data_suffix}/wav/*wav
data/libri_dev_trials_f${anon_data_suffix}/wav/*wav
data/libri_test_enrolls${anon_data_suffix}/wav/*wav
data/libri_test_trials_m${anon_data_suffix}/wav/*wav
data/libri_test_trials_f${anon_data_suffix}/wav/*wav
data/IEMOCAP_dev${anon_data_suffix}/wav/*wav
data/IEMOCAP_test${anon_data_suffix}/wav/*wav
data/train-clean-360${anon_data_suffix}/wav/*wav
For the next evaluation step, you should replicate the corresponding directory structure when developing your anonymization system.
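When developing your own system, a quick check that all nine folders exist and are non-empty can catch layout mistakes before evaluation. The sketch below is illustrative only: the helper names expected_wav_dirs and missing_or_empty are hypothetical, while the dataset names and the data/<name><suffix>/wav layout are taken from the listing above.

```python
from pathlib import Path

# Dataset folder names as listed above.
DATASETS = (
    "libri_dev_enrolls",
    "libri_dev_trials_m",
    "libri_dev_trials_f",
    "libri_test_enrolls",
    "libri_test_trials_m",
    "libri_test_trials_f",
    "IEMOCAP_dev",
    "IEMOCAP_test",
    "train-clean-360",
)


def expected_wav_dirs(anon_data_suffix, data_dir="data"):
    """Return the nine wav directories expected for a given suffix."""
    return [
        Path(data_dir) / f"{name}{anon_data_suffix}" / "wav"
        for name in DATASETS
    ]


def missing_or_empty(anon_data_suffix, data_dir="data"):
    """Return the expected directories that are absent or hold no wav file."""
    return [
        d
        for d in expected_wav_dirs(anon_data_suffix, data_dir)
        if not (d.is_dir() and any(d.glob("*wav")))
    ]
```

For example, missing_or_empty("_mcadams") returns an empty list once anonymization with the default config has completed successfully.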
Evaluation metrics include:
- Privacy: Equal error rate (EER) for ignorant, lazy-informed, and semi-informed attackers (only results from the semi-informed attacker will be used in the challenge ranking)
- Utility:
- Word Error Rate (WER) by an automatic speech recognition (ASR) model (trained on LibriSpeech)
- Unweighted Average Recall (UAR) by a speech emotion recognition (SER) model (trained on IEMOCAP).
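As an illustration of the UAR metric, the sketch below computes the mean of the per-class recalls, so that every emotion class counts equally regardless of how many utterances it has. It is a plain-Python illustration and not the evaluation code used by the challenge; the function name and the example labels are assumptions.

```python
from collections import Counter


def unweighted_average_recall(references, predictions):
    """Mean of per-class recall, with every class weighted equally."""
    if not references or len(references) != len(predictions):
        raise ValueError("need two non-empty label lists of equal length")
    # Number of reference utterances per class.
    totals = Counter(references)
    # Number of correctly predicted utterances per class.
    hits = Counter(r for r, p in zip(references, predictions) if r == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)
```

With references ["neu", "neu", "neu", "ang"] and predictions ["neu", "neu", "ang", "ang"], the per-class recalls are 2/3 and 1, so the UAR is about 0.83, whereas plain accuracy would be 0.75.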
To run evaluation for arbitrary anonymized data:
- prepare 9 anonymized folders each containing the anonymized wav files:
data/libri_dev_enrolls${anon_data_suffix}/wav/*wav
data/libri_dev_trials_m${anon_data_suffix}/wav/*wav
data/libri_dev_trials_f${anon_data_suffix}/wav/*wav
data/libri_test_enrolls${anon_data_suffix}/wav/*wav
data/libri_test_trials_m${anon_data_suffix}/wav/*wav
data/libri_test_trials_f${anon_data_suffix}/wav/*wav
data/IEMOCAP_dev${anon_data_suffix}/wav/*wav
data/IEMOCAP_test${anon_data_suffix}/wav/*wav
data/train-clean-360${anon_data_suffix}/wav/*wav
- perform evaluations
python run_evaluation.py --config configs/eval_pre.yaml --overwrite "{\"anon_data_suffix\": \"$anon_data_suffix\"}" --force_compute True
python run_evaluation.py --config configs/eval_post.yaml --overwrite "{\"anon_data_suffix\": \"$anon_data_suffix\"}" --force_compute True
- get the final results for ranking
results_summary_path_orig=exp/results_summary/eval_orig${anon_data_suffix}/results_orig.txt # the same value as $results_summary_path in configs/eval_pre.yaml
results_summary_path_anon=exp/results_summary/eval_anon${anon_data_suffix}/results_anon.txt # the same value as $results_summary_path in configs/eval_post.yaml
results_exp=exp/results_summary
{ cat "${results_summary_path_orig}"; echo; cat "${results_summary_path_anon}"; } > "${results_exp}/result_for_rank${anon_data_suffix}"
zip ${results_exp}/result_for_submission${anon_data_suffix}.zip -r exp/asr/*${anon_data_suffix} exp/asr/*${anon_data_suffix}.csv exp/ser/*${anon_data_suffix}.csv exp/results_summary/*${anon_data_suffix}* exp/asv_orig/*${anon_data_suffix} exp/asv_orig/*${anon_data_suffix}.csv exp/asv_anon${anon_data_suffix}
All of the above steps are automated in 02_run.sh.
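If you drive the evaluation from Python rather than from the shell, the escaped JSON passed to --overwrite can be built with json.dumps instead of hand-written backslashes. This is a minimal sketch; the helper name evaluation_command is hypothetical, and the script name and flags are copied from the commands above.

```python
import json


def evaluation_command(config, anon_data_suffix):
    """Build the argument list for one run_evaluation.py call."""
    # json.dumps yields the same string the shell produces from
    # "{\"anon_data_suffix\": \"$anon_data_suffix\"}".
    overwrite = json.dumps({"anon_data_suffix": anon_data_suffix})
    return [
        "python", "run_evaluation.py",
        "--config", config,
        "--overwrite", overwrite,
        "--force_compute", "True",
    ]
```

The returned list can be passed to subprocess.run(..., check=True), once for configs/eval_pre.yaml and once for configs/eval_post.yaml. Passing a list avoids a second layer of shell quoting.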
The result file with all the metrics and all datasets for submission will be generated in:
- Summary results:
./exp/results_summary/result_for_rank$anon_data_suffix
- Additional information for submission:
./exp/results_summary/result_for_submission${anon_data_suffix}.zip
Please see the RESULTS folder for the results of the provided anonymization baselines.
For more details about the baseline and data, please see The VoicePrivacy 2024 Challenge Evaluation Plan - Updated on 1st April 2024
Final list of models and data for training anonymization systems.
Participants are requested to register for the evaluation. Registration should be performed once only for each participating entity using the following form: Registration.
- $ASV_{eval}^{anon}$ training is slow
Training of the
- OOM problem when decoding by $ASR_{eval}$
  Reduce the $eval_bachsize in configs/eval_pre.yaml.
- The $ASR_{eval}$ is a pretrained wav2vec+ctc model trained on LibriSpeech-960h.
- An error in utils.prepare_results_in_kaldi_format means something went wrong when running the anonymization pipeline.
  Remove all data/*$anon_data_suffix directories and re-run the anonymization and evaluation steps (if $anon_data_suffix=suff, also remove the directories that share a matching suffix, e.g. $anon_data_suffix=something_suff). Check that your anonymization pipeline produces a wav file for each dataset entry: every original wav should have its anonymized counterpart.
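The last check can be scripted. The sketch below lists original wav files that have no anonymized counterpart. It assumes that the original files sit in data/<dataset>/wav and that anonymized files keep the original file names; both are assumptions to verify against your setup, and the helper names are hypothetical.

```python
from pathlib import Path


def missing_names(orig_names, anon_names):
    """Return the original file names absent from the anonymized set."""
    return sorted(set(orig_names) - set(anon_names))


def missing_counterparts(orig_wav_dir, anon_wav_dir):
    """Compare two wav folders by file name.

    Assumes anonymized files keep the names of the originals.
    """
    orig = [p.name for p in Path(orig_wav_dir).glob("*wav")]
    anon = [p.name for p in Path(anon_wav_dir).glob("*wav")]
    return missing_names(orig, anon)
```

For instance, missing_counterparts("data/libri_dev_enrolls/wav", "data/libri_dev_enrolls_mcadams/wav") should return an empty list when the pipeline has processed every utterance of that dataset.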
- Pierre Champion - Inria, France
- Nicholas Evans - EURECOM, France
- Sarina Meyer - University of Stuttgart, Germany
- Xiaoxiao Miao - Singapore Institute of Technology, Singapore
- Michele Panariello - EURECOM, France
- Massimiliano Todisco - EURECOM, France
- Natalia Tomashenko - Inria, France
- Emmanuel Vincent - Inria, France
- Xin Wang - NII, Japan
- Junichi Yamagishi - NII, Japan
Contact: [email protected]
Copyright (C) 2024
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.
@article{tomashenko2024voiceprivacy,
title={The {VoicePrivacy} 2024 Challenge Evaluation Plan},
author={Natalia Tomashenko and Xiaoxiao Miao and Pierre Champion and Sarina Meyer and Xin Wang and Emmanuel Vincent and Michele Panariello and Nicholas Evans and Junichi Yamagishi and Massimiliano Todisco},
year={2024},
eprint={2404.02677},
archivePrefix={arXiv},
primaryClass={eess.AS}
}
Some parts of the code and structure are based on VoicePAT (Paper: https://ieeexplore.ieee.org/document/10365329)