Model Card

This page lists the MaskAlign model weights. CLIP-L/14* denotes feeding CLIP-L/14 an input image at 196 × 196 resolution, which keeps the teacher's feature map the same size as the student model's. "PT epochs" and "FT Acc." denote the number of pre-training epochs and the fine-tuning accuracy on ImageNet-1K, respectively.
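The feature-map size claim can be checked with a little arithmetic. The sketch below is illustrative only: `patch_grid` is a hypothetical helper, not part of the MaskAlign code, and the 224 × 224 student input resolution is an assumption (the usual ImageNet setting) that this page does not state.

```python
def patch_grid(image_size: int, patch_size: int) -> int:
    """Return the number of patches along one side of a square input image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    return image_size // patch_size


# Teacher: CLIP-L/14 fed a 196 x 196 image (the CLIP-L/14* setting above).
teacher_grid = patch_grid(196, 14)

# Student: a /16 ViT, ASSUMING a 224 x 224 input (not stated on this page).
student_grid = patch_grid(224, 16)

# For comparison: CLIP-L/14 at the assumed 224 x 224 input.
teacher_grid_224 = patch_grid(224, 14)

print(teacher_grid, student_grid, teacher_grid_224)  # prints: 14 14 16
```

Under that assumption, teacher and student both produce a 14 × 14 grid of patch features, whereas CLIP-L/14 at 224 × 224 would produce a 16 × 16 grid that no longer lines up one-to-one with the student's.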

| Model | Teacher Model | PT epochs | Link | FT Acc. |
| --- | --- | --- | --- | --- |
| ViT-B/16 | CLIP-B/16 | 200 | gdrive | 85.4 |
| ViT-L/16 | CLIP-B/16 | 200 | gdrive | 86.5 |
| ViT-L/16 | CLIP-L/14* | 200 | gdrive | 87.4 |