This is the official PyTorch implementation for the paper:
Zhen Tian, Ting Bai, Zibin Zhang, Zhiyuan Xu, Kangyi Lin, Ji-Rong Wen and Wayne Xin Zhao. Directed Acyclic Graph Factorization Machines for CTR Prediction via Knowledge Distillation. WSDM 2023.
We propose KD-DAGFM, a Directed Acyclic Graph Factorization Machine that learns high-order feature interactions from existing complex interaction models for CTR prediction via knowledge distillation. The proposed lightweight student model, DAGFM, can learn arbitrary explicit feature interactions from teacher networks with approximately lossless performance, a property that is proved by a dynamic programming algorithm.
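As a rough illustration of the distillation idea, the sketch below combines a CTR loss on the true labels with a term that pulls the student's logits toward the teacher's. It is written in plain Python so that it runs without any dependencies. The function names, the mean-squared-error form of the distillation term, and the `alpha` weight are assumptions made for this example and are not taken from this repository's code.

```python
import math


def sigmoid(z):
    """Map a logit to a click probability."""
    return 1.0 / (1.0 + math.exp(-z))


def bce(label, prob, eps=1e-7):
    """Binary cross-entropy for a single example, with clipping for stability."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1.0 - label) * math.log(1.0 - prob))


def distillation_loss(student_logits, teacher_logits, labels, alpha=1.0):
    """Hypothetical combined loss: CTR loss plus alpha times a logit-matching term.

    Assumption: the distillation term is the mean squared error between
    student and teacher logits. Other formulations are possible.
    """
    n = len(labels)
    # Supervised term: how well the student predicts the observed clicks.
    ctr = sum(bce(y, sigmoid(s)) for y, s in zip(labels, student_logits)) / n
    # Distillation term: how far the student's logits are from the teacher's.
    kd = sum((s - t) ** 2 for s, t in zip(student_logits, teacher_logits)) / n
    return ctr + alpha * kd


if __name__ == "__main__":
    # Toy batch of three examples with made-up logits.
    print(distillation_loss([0.2, -1.0, 0.5], [0.3, -1.2, 0.4], [1.0, 0.0, 1.0]))
```

When the student's logits equal the teacher's, the second term vanishes and only the CTR loss remains, which is the sense in which a student can match its teacher.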
tensorflow==2.4.1
python==3.7.3
cudatoolkit==11.3.1
pytorch==1.11.0
Please download the Criteo, Avazu and MovieLens-1M datasets and put them in the /DataSource folder.
Pre-process the data.
python DataSource/[dataset]_parse.py
Then divide the dataset.
python DataSource/split.py
python train.py --config_files=[dataset]_kd_dagfm.yaml --phase=teacher_training
python train.py --config_files=[dataset]_kd_dagfm.yaml --phase=distillation --warm_up=/Saved/[teacher_file]
python train.py --config_files=[dataset]_kd_dagfm.yaml --phase=finetuning --warm_up=/Saved/[Student_file]
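The three commands above run in order: teacher training, then distillation warm-started from the saved teacher, then finetuning warm-started from the saved student. The sketch below only assembles those command lines so the sequence can be scripted. The helper `build_command` is hypothetical and not part of this repository, and the bracketed values are left as the same placeholders used above.

```python
# Hypothetical helper: builds the train.py command lines for the three phases.
# It does not execute anything; pass each list to subprocess.run if desired.

PHASES = ("teacher_training", "distillation", "finetuning")


def build_command(dataset, phase, warm_up=None):
    """Return the argument list for one training phase.

    dataset: name used in the config file, as in [dataset]_kd_dagfm.yaml.
    phase:   one of PHASES.
    warm_up: optional path to a saved checkpoint from the previous phase.
    """
    if phase not in PHASES:
        raise ValueError(f"unknown phase: {phase!r}")
    cmd = [
        "python",
        "train.py",
        f"--config_files={dataset}_kd_dagfm.yaml",
        f"--phase={phase}",
    ]
    if warm_up is not None:
        cmd.append(f"--warm_up={warm_up}")
    return cmd


if __name__ == "__main__":
    # Placeholders are kept as-is; substitute your own dataset and file names.
    steps = [
        ("teacher_training", None),
        ("distillation", "/Saved/[teacher_file]"),
        ("finetuning", "/Saved/[Student_file]"),
    ]
    for phase, checkpoint in steps:
        print(" ".join(build_command("[dataset]", phase, checkpoint)))
```

Keeping the phase names in one tuple means a mistyped phase fails immediately instead of after a long data-loading step.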
Zhen Tian. If you have any questions, please contact [email protected].
If you find DAGFM useful for your research or development, please cite the following paper:
@inproceedings{tian2023directed,
title={Directed Acyclic Graph Factorization Machines for CTR Prediction via Knowledge Distillation},
author={Tian, Zhen and Bai, Ting and Zhang, Zibin and Xu, Zhiyuan and Lin, Kangyi and Wen, Ji-Rong and Zhao, Wayne Xin},
booktitle={Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining},
pages={715--723},
year={2023}
}