ViTKD

Paper: ViTKD: Practical Guidelines for ViT feature knowledge distillation

[Figure: ViTKD architecture]

Train

# multi-GPU (4 GPUs)
bash tools/dist_train.sh configs/distillers/imagenet/deit-s3_distill_deit-t_img.py 4
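
If only one GPU is available, repos built on mmcls usually also ship a non-distributed tools/train.py entry point; assuming this repo follows that convention, the same distiller config could be trained with:

# single GPU (assumes the standard mmcls tools/train.py entry point)
python tools/train.py configs/distillers/imagenet/deit-s3_distill_deit-t_img.py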

Transfer

# Transfer the distilled checkpoint into a plain mmcls model
python pth_transfer.py --dis_path $dis_ckpt --output_path $new_mmcls_ckpt
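
As a concrete illustration, with placeholder paths (the work_dirs checkpoint and the output filename below are hypothetical, not files shipped with the repo):

# example invocation; both paths are placeholders
python pth_transfer.py --dis_path work_dirs/deit-s3_distill_deit-t_img/epoch_300.pth --output_path deit-tiny_vitkd.pth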

Test

# multi-GPU (8 GPUs)
bash tools/dist_test.sh configs/deit/deit-tiny_pt-4xb256_in1k.py $new_mmcls_ckpt 8 --metrics accuracy
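
For a quick single-GPU sanity check, mmcls-based repos typically expose tools/test.py with the same config/checkpoint arguments; assuming that convention holds here:

# single GPU (assumes the standard mmcls tools/test.py entry point)
python tools/test.py configs/deit/deit-tiny_pt-4xb256_in1k.py $new_mmcls_ckpt --metrics accuracy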

Results

Top-1 accuracy (%) on ImageNet-1k; numbers in parentheses are gains over the baseline student. The "weight" and "config" entries are download links in the original page.

| Model | Teacher | Teacher weight | Baseline | ViTKD | ViTKD weight | ViTKD+NKD | ViTKD+NKD weight | dis_config |
|---|---|---|---|---|---|---|---|---|
| DeiT-Tiny | DeiT III-Small | baidu / one drive | 74.42 | 76.06 (+1.64) | baidu / one drive | 77.78 (+3.36) | baidu / one drive | config |
| DeiT-Small | DeiT III-Base | baidu / one drive | 80.55 | 81.95 (+1.40) | baidu / one drive | 83.59 (+3.04) | baidu / one drive | config |
| DeiT-Base | DeiT III-Large | baidu / one drive | 81.76 | 83.46 (+1.70) | baidu / one drive | 85.41 (+3.65) | baidu / one drive | config |

Citation

@article{yang2022vitkd,
  title={ViTKD: Practical Guidelines for ViT feature knowledge distillation},
  author={Yang, Zhendong and Li, Zhe and Zeng, Ailing and Li, Zexian and Yuan, Chun and Li, Yu},
  journal={arXiv preprint arXiv:2209.02432},
  year={2022}
}