Thanks for your code. I have studied it, and it looks like the training process has two stages: first, the teacher and student networks are trained separately on the same dataset; then distillation is performed using the pre-trained teacher and student networks.
Could you tell me if this is right?
Thank you.
First, the teacher network is trained on the dataset. Then its weights are frozen, and finally a randomly initialized student is trained using both the ground truth (GT) and the teacher's outputs.
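The second stage described above can be sketched as a combined loss over the frozen teacher's soft targets and the GT labels. This is a minimal numpy sketch assuming the standard Hinton-style knowledge distillation objective (temperature `T` and mixing weight `alpha` are illustrative hyperparameters, not values taken from this repository):

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-softened softmax over the last axis."""
    z = np.asarray(logits, dtype=float) / T
    z -= z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """KD loss: alpha * KL(teacher || student) * T^2 + (1-alpha) * CE(GT).

    The teacher is frozen, so only its logits (not gradients) are used.
    """
    p_t = softmax(teacher_logits, T)  # soft targets from frozen teacher
    p_s = softmax(student_logits, T)
    # KL divergence between softened distributions, scaled by T^2
    kd = (p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12))).sum(axis=-1).mean() * T**2
    # Standard cross-entropy against the ground-truth labels
    p = softmax(student_logits)
    ce = -np.log(p[np.arange(len(labels)), labels] + 1e-12).mean()
    return alpha * kd + (1 - alpha) * ce

# Toy batch: 2 examples, 3 classes
student = np.array([[2.0, 0.5, 0.1], [0.2, 1.5, 0.3]])
teacher = np.array([[3.0, 0.2, 0.1], [0.1, 2.5, 0.2]])
labels = np.array([0, 1])
loss = distillation_loss(student, teacher, labels)
```

In an actual training loop this loss would be minimized with respect to the student's parameters only, matching the "freeze the teacher, train the student from GT and teacher" procedure described in the reply.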