Reproduce the results of the paper #4
Comments
I set the target rate to 0.7 and followed the standard ResNet training procedure.
Yes, the code should be able to produce the results from the paper. I assume you trained a model based on ResNet-50? Can you please provide more details? For example, what is the average execution rate of your trained model?
The average execution rate is 0.8585 and my batch size is 2048. I removed all the fc1bn, because I find that fc1bn degrades the result. With fc1bn, the top-1 error is 25.324.
I get a top-1 error of 25.32 and an average execution rate of 0.8452, with a batch size of 512 and without fc1bn.
I cannot reproduce the result either.
@Goingqs @PerdonLiu the readme says: "Specifically, for the results in the paper the following target rate schedules are used for ResNet 50: [1, 1, 0.8, 1, t, t, t, 1, t, t, t, t, t, 1, 0.7, 1] for t in [0.4, 0.5, 0.6, 0.7]". Did you do that, or did you use a target rate of 0.7 for all gates? I do not understand how this code supports different target rates per layer: the arg parser expects a single float, and I also can't see any adjustment for layer-specific target rates in the other parts of the code where I would expect it.
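To make the question concrete, here is a minimal sketch of what a per-layer target rate schedule could look like. It is not taken from the repository: `expand_schedule` and `target_rate_penalty` are hypothetical helpers, and the mean squared gap used as the penalty is an assumption about the form of the loss. Only the 16-entry schedule itself comes from the readme quote above.

```python
from typing import List, Optional, Sequence

# Schedule quoted from the readme; None marks the positions written as "t".
RESNET50_TEMPLATE: List[Optional[float]] = [
    1, 1, 0.8, 1, None, None, None, 1,
    None, None, None, None, None, 1, 0.7, 1,
]


def expand_schedule(
    t: float, template: Sequence[Optional[float]] = RESNET50_TEMPLATE
) -> List[float]:
    """Replace every "t" placeholder with the chosen target rate (hypothetical helper)."""
    return [t if rate is None else float(rate) for rate in template]


def target_rate_penalty(
    exec_rates: Sequence[float], targets: Sequence[float]
) -> float:
    """Mean squared gap between each gate's execution rate and its own target.

    The squared-gap form is an assumption, not the repository's loss.
    """
    if len(exec_rates) != len(targets):
        raise ValueError("need exactly one target rate per gate")
    gaps = [(rate - target) ** 2 for rate, target in zip(exec_rates, targets)]
    return sum(gaps) / len(gaps)


# One target per gate for t = 0.7, instead of a single float for all gates.
targets = expand_schedule(0.7)
```

With a list like `targets`, each gate is compared against its own entry, so the fixed positions (1, 0.8, 0.7) stay the same while only the "t" positions change between runs.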
Can this code reproduce the results of the paper? I got 24.61% top-1 error.