WFlib is a PyTorch-based open-source library for website fingerprinting attacks, intended for research purposes only.
Website fingerprinting is a type of network attack in which an adversary attempts to deduce which website a user is visiting based on encrypted traffic patterns, even without directly seeing the content of the traffic.
WFlib provides a unified code base for evaluating 11 deep learning (DL) based website fingerprinting (WF) attacks on multiple datasets. The library is derived from our ACM CCS 2024 paper. If you find this repository useful, please cite our paper.
@inproceedings{deng2024wflib,
title={Robust and Reliable Early-Stage Website Fingerprinting Attacks via Spatial-Temporal Distribution Analysis},
author={Deng, Xinhao and Li, Qi and Xu, Ke},
booktitle={Proceedings of the 2024 ACM SIGSAC Conference on Computer and Communications Security},
year={2024}
}
Contributions via pull requests are welcome and appreciated.
WFlib includes 11 DL-based website fingerprinting attacks.
All attacks are implemented in the same framework (PyTorch) with a consistent coding style, so researchers can evaluate and compare existing attacks easily.
git clone git@github.com:Xinhao-Deng/Website-Fingerprinting-Library.git
pip install --user .
Note
- Python 3.8 is required.
mkdir datasets
- Download the datasets (link) and place them in the ./datasets folder.
Dataset | # of monitored websites | # of instances | Description |
---|---|---|---|
CW.npz | 95 | 105730 | Closed-world dataset. Details |
OW.npz | 95 | 146446 | Open-world dataset. Details |
WTF-PAD.npz | 95 | 105730 | Dataset with WTF-PAD defense. Details |
Front.npz | 95 | 95000 | Dataset with Front defense. Details |
Walkie-Talkie.npz | 100 | 90000 | Dataset with Walkie-Talkie defense. Details |
TrafficSliver.npz | 95 | 95000 | Dataset with TrafficSliver defense. Details |
NCDrift_sup.npz | 93 | 21430 | Network condition drift dataset, including superior traces. Details |
NCDrift_inf.npz | 93 | 6882 | Network condition drift dataset, including inferior traces. Details |
Closed_2tab.npz | 100 | 58000 | 2-tab dataset in the closed-world scenario. Details |
Closed_3tab.npz | 100 | 58000 | 3-tab dataset in the closed-world scenario. Details |
Closed_4tab.npz | 100 | 58000 | 4-tab dataset in the closed-world scenario. Details |
Closed_5tab.npz | 100 | 58000 | 5-tab dataset in the closed-world scenario. Details |
Open_2tab.npz | 100 | 64000 | 2-tab dataset in the open-world scenario. Details |
Open_3tab.npz | 100 | 64000 | 3-tab dataset in the open-world scenario. Details |
Open_4tab.npz | 100 | 64000 | 4-tab dataset in the open-world scenario. Details |
Open_5tab.npz | 100 | 64000 | 5-tab dataset in the open-world scenario. Details |
- The extracted datasets are in npz format and contain two arrays: X and y. X holds the cell sequences, where each value is the cell direction (e.g., 1 or -1) multiplied by its timestamp. y holds the corresponding labels. Note that for some datasets X consists only of direction sequences.
- Divide the dataset into training, validation, and test sets.
# For single-tab datasets
python exp/dataset_process/dataset_split.py --dataset CW
# For multi-tab datasets
python exp/dataset_process/dataset_split.py --dataset Closed_2tab --use_stratify False
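To make the X encoding described above concrete, the following sketch builds a tiny npz archive in memory and recovers directions and timestamps from the signed values. The toy values, the trailing zero padding, and the variable names are assumptions for illustration only. They are not taken from the real files, whose shapes and dtypes may differ.

```python
import io

import numpy as np

# Hypothetical toy data standing in for a real file such as ./datasets/CW.npz.
# Each row is one trace; each value is direction (+1 or -1) times timestamp.
# Trailing zeros are assumed to be padding.
X_toy = np.array(
    [
        [0.10, -0.25, -0.31, 0.47, 0.0, 0.0],
        [0.05, 0.12, -0.40, 0.0, 0.0, 0.0],
    ],
    dtype=np.float32,
)
y_toy = np.array([3, 7])

# Round-trip through the npz format using an in-memory buffer.
buffer = io.BytesIO()
np.savez(buffer, X=X_toy, y=y_toy)
buffer.seek(0)

data = np.load(buffer)
X, y = data["X"], data["y"]

# The sign carries the direction and the magnitude carries the timestamp.
directions = np.sign(X).astype(np.int8)
timestamps = np.abs(X)

print(X.shape, y.shape)
print(directions[0])
print(timestamps[0])
```

One caveat of this encoding is that a cell with timestamp exactly 0 loses its direction, since the product is 0 regardless of sign.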
We provide all experiment scripts for WF attacks in the ./scripts/ folder. For example, you can reproduce the DF attack on the CW dataset by executing the following command.
bash scripts/DF.sh
The ./scripts/DF.sh file contains the commands for model training and evaluation.
dataset=CW
python -u exp/train.py \
--dataset ${dataset} \
--model DF \
--device cuda:1 \
--feature DIR \
--seq_len 5000 \
--train_epochs 30 \
--batch_size 128 \
--learning_rate 2e-3 \
--optimizer Adamax \
--eval_metrics Accuracy Precision Recall F1-score \
--save_metric F1-score \
--save_name max_f1
python -u exp/test.py \
--dataset ${dataset} \
--model DF \
--device cuda:1 \
--feature DIR \
--seq_len 5000 \
--batch_size 256 \
--eval_metrics Accuracy Precision Recall F1-score \
--load_name max_f1
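For readers unfamiliar with the four metrics named in --eval_metrics, here is a minimal sketch of how they can be computed for single-label predictions. It assumes macro averaging over classes, which may not match what exp/test.py does, and the function name macro_metrics is hypothetical rather than part of WFlib.

```python
import numpy as np


def macro_metrics(y_true, y_pred, num_classes):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)

    # confusion[i, j] counts samples of true class i predicted as class j.
    confusion = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(confusion, (y_true, y_pred), 1)

    tp = np.diag(confusion).astype(np.float64)
    predicted = confusion.sum(axis=0)  # column sums: predictions per class
    actual = confusion.sum(axis=1)  # row sums: true samples per class

    # Classes with no predictions or no samples contribute 0 instead of NaN.
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)

    return {
        "Accuracy": tp.sum() / confusion.sum(),
        "Precision": precision.mean(),
        "Recall": recall.mean(),
        "F1-score": f1.mean(),
    }


y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
scores = macro_metrics(y_true, y_pred, num_classes=3)
print(scores)
```

Multi-tab datasets are multi-label, so they would need per-label metrics instead of this single-label confusion matrix.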
The meaning of each parameter is documented in exp/train.py and exp/test.py. By changing these parameters you can run different attacks, and you can also use WFlib to combine components of different attacks or perform ablation analysis.
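As an illustration of what the --feature DIR and --seq_len 5000 settings in the script plausibly correspond to, the sketch below converts one signed-timestamp trace into a fixed-length direction sequence. The function name to_direction_feature and the zero-padding choice are assumptions, and WFlib's own feature extraction may differ in detail.

```python
import numpy as np


def to_direction_feature(trace, seq_len=5000):
    """Convert one signed-timestamp trace into a fixed-length direction sequence."""
    trace = np.asarray(trace, dtype=np.float32)
    directions = np.sign(trace)  # +1 or -1 per cell, 0 for padding

    fixed = np.zeros(seq_len, dtype=np.float32)
    n = min(len(directions), seq_len)
    fixed[:n] = directions[:n]  # truncate long traces, zero-pad short ones
    return fixed


short = to_direction_feature([0.1, -0.2, -0.3], seq_len=5)
long = to_direction_feature([0.1, -0.2, 0.3, -0.4, 0.5, -0.6], seq_len=4)
print(short)
print(long)
```

A fixed length keeps every trace the same shape, so traces can be stacked into batches for training.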
If you have any questions or suggestions, feel free to contact:
- Xinhao Deng ([email protected])
- Yixiang Zhang ([email protected])
We would like to thank all the authors of the referenced papers.