Official implementation of "Membership Inference Attacks Against Self-supervised Speech Models" (arXiv:2111.05113). Published at Interspeech 2022.
In this work, we demonstrate that existing self-supervised speech models such as HuBERT, wav2vec 2.0, CPC and TERA are vulnerable to membership inference attacks (MIA) and thus could reveal sensitive information related to the training data.
- Python >= 3.6
- Install sox on your OS
- Install s3prl on your OS
git clone https://github.com/s3prl/s3prl
cd s3prl
pip install -e ./
- Install the specific fairseq
pip install fairseq@git+https://github.com/pytorch/fairseq.git@f2146bdc7abf293186de9449bfa2272775e39e1d#egg=fairseq
First, extract the self-supervised features of the utterances in each corpus you need.
Currently, only LibriSpeech is supported.
BASE_PATH=/path/of/the/corpus
OUTPUT_PATH=/path/to/save/feature
MODEL=wav2vec2
SPLIT=train-clean-100 # you should extract train-clean-100, dev-clean, dev-other, test-clean, test-other
python preprocess_feature_LibriSpeech.py \
--base_path $BASE_PATH \
--output_path $OUTPUT_PATH \
--model $MODEL \
--split $SPLIT
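Since the extraction command has to be repeated once per split, a small driver can save some typing. The following is a minimal sketch, not part of this repository: the helper names `build_extract_command` and `extract_all` are hypothetical, and the only things taken from the command above are the script name, its four flags, and the five split names. It prints the commands by default and runs them only when `dry_run=False`.

```python
import shlex
import subprocess
import sys

# The five LibriSpeech splits named in the SPLIT comment above.
SPLITS = ["train-clean-100", "dev-clean", "dev-other", "test-clean", "test-other"]


def build_extract_command(base_path, output_path, model, split):
    """Return the argument list for one feature-extraction run (hypothetical helper)."""
    return [
        sys.executable, "preprocess_feature_LibriSpeech.py",
        "--base_path", base_path,
        "--output_path", output_path,
        "--model", model,
        "--split", split,
    ]


def extract_all(base_path, output_path, model, dry_run=True):
    """Build one command per split; execute them only when dry_run is False."""
    commands = [build_extract_command(base_path, output_path, model, s) for s in SPLITS]
    for cmd in commands:
        # Print a copy-pasteable shell line for each command.
        print(" ".join(shlex.quote(arg) for arg in cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing split
    return commands


if __name__ == "__main__":
    extract_all("/path/of/the/corpus", "/path/to/save/feature", "wav2vec2")
```

Run it from the repository root so that the script path resolves, and replace the placeholder paths with your own.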
After extracting the features, you can attack the models using either the basic attack or the improved attack.
Note that you must run the basic attack first to generate the .csv file with similarity scores before performing the improved attack.
SEEN_BASE_PATH=/path/you/save/feature/of/seen/corpus
UNSEEN_BASE_PATH=/path/you/save/feature/of/unseen/corpus
OUTPUT_PATH=/path/to/output/results
MODEL=wav2vec2
python predefined-speaker-level-MIA.py \
--seen_base_path $SEEN_BASE_PATH \
--unseen_base_path $UNSEEN_BASE_PATH \
--output_path $OUTPUT_PATH \
--model $MODEL
python train-speaker-level-similarity-model.py \
--seen_base_path $SEEN_BASE_PATH \
--output_path $OUTPUT_PATH \
--model $MODEL \
--speaker_list "${OUTPUT_PATH}/${MODEL}-customized-speaker-level-attack-similarity.csv"
python customized-speaker-level-MIA.py \
--seen_base_path $SEEN_BASE_PATH \
--unseen_base_path $UNSEEN_BASE_PATH \
--output_path $OUTPUT_PATH \
--model $MODEL \
--similarity_model_path "${OUTPUT_PATH}/customized-speaker-similarity-model-${MODEL}.pt"
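To make the role of the similarity scores more concrete, here is a minimal sketch of how per-speaker scores could be turned into a membership decision and summarized as an AUC. It is an illustration only and not the evaluation code of this repository: it assumes that a higher score means "more likely seen during training", the function names are hypothetical, the numbers are made up, and it deliberately does not read the generated .csv because its column layout is not described here.

```python
def attack_auc(seen_scores, unseen_scores):
    """Fraction of (seen, unseen) pairs where the seen score is higher; ties count 0.5."""
    if not seen_scores or not unseen_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for s in seen_scores:
        for u in unseen_scores:
            if s > u:
                wins += 1.0
            elif s == u:
                wins += 0.5
    return wins / (len(seen_scores) * len(unseen_scores))


def threshold_attack(scores, threshold):
    """Predict membership (True) for every score at or above the threshold."""
    return [score >= threshold for score in scores]


if __name__ == "__main__":
    # Toy scores, invented for illustration only.
    seen = [0.9, 0.8, 0.7]
    unseen = [0.6, 0.75, 0.5]
    print("AUC:", attack_auc(seen, unseen))
    print("Predictions at 0.7:", threshold_attack(seen + unseen, 0.7))
```

An AUC near 0.5 would mean the scores do not separate seen from unseen speakers, while a value near 1.0 would mean they separate them almost perfectly.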
The process for utterance-level MIA is similar to that of speaker-level:
SEEN_BASE_PATH=/path/you/save/feature/of/seen/corpus
UNSEEN_BASE_PATH=/path/you/save/feature/of/unseen/corpus
OUTPUT_PATH=/path/to/output/results
MODEL=wav2vec2
python predefined-utterance-level-MIA.py \
--seen_base_path $SEEN_BASE_PATH \
--unseen_base_path $UNSEEN_BASE_PATH \
--output_path $OUTPUT_PATH \
--model $MODEL
python train-utterance-level-similarity-model.py \
--seen_base_path $SEEN_BASE_PATH \
--output_path $OUTPUT_PATH \
--model $MODEL \
--speaker_list "${OUTPUT_PATH}/${MODEL}-customized-utterance-level-attack-similarity.csv"
python customized-utterance-level-MIA.py \
--seen_base_path $SEEN_BASE_PATH \
--unseen_base_path $UNSEEN_BASE_PATH \
--output_path $OUTPUT_PATH \
--model $MODEL \
--similarity_model_path "${OUTPUT_PATH}/customized-utterance-similarity-model-${MODEL}.pt"
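Because the three scripts have to run in a fixed order at either level, the sequence can be captured in one place. The sketch below is an assumption-laden convenience, not part of this repository: `build_attack_pipeline` is a hypothetical helper, and the script names, flags, and output file names are copied from the commands above (including the `--speaker_list` flag, which the utterance-level command above also uses). It only builds the commands; executing them is left to the caller.

```python
import shlex


def build_attack_pipeline(level, seen_base_path, unseen_base_path, output_path, model):
    """Return the three commands for one attack level, in the order listed above."""
    if level not in ("speaker", "utterance"):
        raise ValueError("level must be 'speaker' or 'utterance'")

    # File names copied from the commands above.
    similarity_csv = f"{output_path}/{model}-customized-{level}-level-attack-similarity.csv"
    similarity_model = f"{output_path}/customized-{level}-similarity-model-{model}.pt"

    predefined = [
        "python", f"predefined-{level}-level-MIA.py",
        "--seen_base_path", seen_base_path,
        "--unseen_base_path", unseen_base_path,
        "--output_path", output_path,
        "--model", model,
    ]
    train = [
        "python", f"train-{level}-level-similarity-model.py",
        "--seen_base_path", seen_base_path,
        "--output_path", output_path,
        "--model", model,
        "--speaker_list", similarity_csv,
    ]
    customized = [
        "python", f"customized-{level}-level-MIA.py",
        "--seen_base_path", seen_base_path,
        "--unseen_base_path", unseen_base_path,
        "--output_path", output_path,
        "--model", model,
        "--similarity_model_path", similarity_model,
    ]
    return [predefined, train, customized]


if __name__ == "__main__":
    for cmd in build_attack_pipeline("utterance", "/seen", "/unseen", "/out", "wav2vec2"):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` if you want a failing step to stop the later ones.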
If you find our work useful, please cite:
@article{tseng2021membership,
title={Membership Inference Attacks Against Self-supervised Speech Models},
author={Tseng, Wei-Cheng and Kao, Wei-Tsung and Lee, Hung-yi},
journal={arXiv preprint arXiv:2111.05113},
year={2021}
}